### Probability Theory

## Set Theory

A set describes an ensemble of elements, also referred to as events. An elementary event $x$ refers to a single event in a sampling space (or universe) denoted by the calligraphic letter $\mathcal{S}$. By definition, a sampling space contains all the possible events, so any event satisfies $E \subseteq \mathcal{S}$. The special case where an event is equal to the sampling space, $E=\mathcal{S}$, is called a certain event. The opposite, $E=\emptyset$, where an event is an empty set, is called a null event. $\overline{E}$ refers to the complement of a set, that is, all elements belonging to $\mathcal{S}$ and not to $E$. Figure 3.3 illustrates these concepts using a Venn diagram.

Let us consider the example of the state of a structure following an earthquake, which is described by the sampling space
$$\begin{aligned} \mathcal{S} &=\{\text{no damage, light damage, important damage, collapse}\} \\ &=\{\mathrm{N}, \mathrm{L}, \mathrm{I}, \mathrm{C}\} . \end{aligned}$$
In that context, an event $E_{1}=\{\mathrm{N}, \mathrm{L}\}$ could contain the no damage and light damage events, and another event $E_{2}=\{\mathrm{C}\}$ could contain only the collapsed state. The complements of these events are, respectively, $\overline{E_{1}}=\{\mathrm{I}, \mathrm{C}\}$ and $\overline{E_{2}}=\{\mathrm{N}, \mathrm{L}, \mathrm{I}\}$.
The two main operations for events, union and intersection, are illustrated in figure 3.4. A union is analogous to the “or” operator, where $E_{1} \cup E_{2}$ holds if the event belongs to $E_{1}$, to $E_{2}$, or to both. The intersection is analogous to the “and” operator, where $E_{1} \cap E_{2} \equiv E_{1} E_{2}$ holds if the event belongs to both $E_{1}$ and $E_{2}$. As a convention, intersection has priority over union. Moreover, both operations are commutative, associative, and distributive.

Given a set of $n$ events $\{E_{1}, E_{2}, \cdots, E_{n}\} \in \mathcal{S}$, the events $E_{1}, E_{2}, \cdots, E_{n}$ are mutually exclusive if $E_{i} E_{j}=\emptyset, \forall i \neq j$, that is, if the intersection of any pair of events is an empty set. The events are collectively exhaustive if $\cup_{i=1}^{n} E_{i}=\mathcal{S}$, that is, if the union of all events is the sampling space. They are mutually exclusive and collectively exhaustive if they satisfy both properties simultaneously. Figure 3.5 presents examples of mutually exclusive (3.5a), collectively exhaustive (3.5b), and mutually exclusive and collectively exhaustive (3.5c-d) events. Note that the difference between (b) and (c) is the absence of overlap in the latter.
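The set operations above map directly onto Python's built-in set type. The following is a minimal sketch using the earthquake example; the helper names `complement`, `mutually_exclusive`, and `collectively_exhaustive` are illustrative choices, not notation from the text.

```python
# Sampling space from the earthquake example (N, L, I, C).
S = frozenset({"N", "L", "I", "C"})

E1 = frozenset({"N", "L"})   # no damage or light damage
E2 = frozenset({"C"})        # collapse


def complement(event, space=S):
    """All elements of the sampling space that are not in the event."""
    return space - event


def mutually_exclusive(events):
    """True if every pair of distinct events has an empty intersection."""
    events = list(events)
    return all(
        not (events[i] & events[j])
        for i in range(len(events))
        for j in range(i + 1, len(events))
    )


def collectively_exhaustive(events, space=S):
    """True if the union of all events equals the sampling space."""
    return frozenset().union(*events) == space


print(sorted(complement(E1)))             # ['C', 'I']
print(sorted(E1 | E2))                    # union: ['C', 'L', 'N']
print(sorted(E1 & E2))                    # intersection: []
print(mutually_exclusive([E1, E2]))       # True
print(collectively_exhaustive([E1, E2]))  # False, 'I' is not covered
```

Here $E_1$ and $E_2$ are mutually exclusive but not collectively exhaustive, because the important damage state belongs to neither event.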

## Probability of Events

$\operatorname{Pr}\left(E_{i}\right)$ denotes the probability of the event $E_{i}$. There are two main interpretations for a probability: the Frequentist and the Bayesian. Frequentists interpret a probability as the number of occurrences of $E_{i}$ relative to the number of samples $s$, as $s$ goes to $\infty$,
$$\operatorname{Pr}\left(E_{i}\right)=\lim_{s \rightarrow \infty} \frac{\#\left\{E_{i}\right\}}{s} .$$
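The frequentist limit can be illustrated with a small simulation. This is a sketch under assumptions that are not in the text: a fair six-sided die, the event "the roll is even" (exact probability 1/2), and a hypothetical helper named `relative_frequency`. The printed estimates should drift toward 0.5 as the number of samples $s$ grows.

```python
import random


def relative_frequency(event, n_samples, rng):
    """Fraction of n_samples fair-die rolls that fall in `event`."""
    hits = sum(1 for _ in range(n_samples) if rng.randint(1, 6) in event)
    return hits / n_samples


rng = random.Random(0)  # fixed seed so that runs are repeatable
even = {2, 4, 6}        # event with exact probability 1/2

for s in (10, 1_000, 100_000):
    # #{E_i} / s for an increasing number of samples s
    print(s, relative_frequency(even, s, rng))
```

A finite simulation only approximates the limit; the estimate for small $s$ can be far from 1/2.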
For Bayesians, a probability measures how likely $E_{i}$ is in comparison with other events in $\mathcal{S}$. This interpretation assumes that the nature of uncertainty is epistemic, that is, it describes our knowledge of a phenomenon. Consequently, the probability depends on the available knowledge and can change when new information is obtained. Throughout this book we adopt this Bayesian interpretation.

By definition, the probability of an event is a number between zero and one, $0 \leq \operatorname{Pr}\left(E_{i}\right) \leq 1$. At the ends of this spectrum, the probability of observing any event in $\mathcal{S}$ is one, $\operatorname{Pr}(\mathcal{S})=1$, and the probability of an empty set is zero, $\operatorname{Pr}(\emptyset)=0$. If two events $E_{1}$ and $E_{2}$ are mutually exclusive, then the probability of their union is the sum of each event's probability, $\operatorname{Pr}\left(E_{1} \cup E_{2}\right)=\operatorname{Pr}\left(E_{1}\right)+\operatorname{Pr}\left(E_{2}\right)$. Because the union of an event and its complement is the sampling space, $E \cup \overline{E}=\mathcal{S}$ (see figure 3.5d), and because $\operatorname{Pr}(\mathcal{S})=1$, the probability of the complement is $\operatorname{Pr}(\overline{E})=1-\operatorname{Pr}(E)$.

When events are not mutually exclusive, the general addition rule for the probability of the union of two events is
$$\operatorname{Pr}\left(E_{1} \cup E_{2}\right)=\operatorname{Pr}\left(E_{1}\right)+\operatorname{Pr}\left(E_{2}\right)-\operatorname{Pr}\left(E_{1} E_{2}\right) .$$
This general addition rule is illustrated in figure 3.6, where if we simply add the probability of each event without accounting for the subtraction of $\operatorname{Pr}\left(E_{1} E_{2}\right)$, the probability of the intersection of both events will be counted twice.
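The general addition rule and the complement rule can be checked numerically. The sketch below assumes a fair six-sided die with equiprobable faces, which is not an example from the text, and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # faces of a fair die (assumed equiprobable)


def pr(event):
    """Probability of an event when all elementary events are equally likely."""
    return Fraction(len(event & S), len(S))


E1 = frozenset({2, 4, 6})   # the roll is even
E2 = frozenset({4, 5, 6})   # the roll is greater than 3

lhs = pr(E1 | E2)
rhs = pr(E1) + pr(E2) - pr(E1 & E2)
print(lhs, rhs)                  # 2/3 2/3
print(pr(E1) + pr(E2))           # 1, the overlap {4, 6} is counted twice
print(pr(S - E1) == 1 - pr(E1))  # True, complement rule
```

Simply adding the two marginals gives 1 instead of 2/3, because the intersection $\{4, 6\}$ is counted twice.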

## The probability of a single e

The probability of a single event is referred to as a marginal probability. A joint probability designates the probability of the intersection of events. The terms in equation 3.1 can be rearranged to explicitly show that the joint probability of two events $\{E_{1}, E_{2}\}$ is the product of a conditional probability and its associated marginal,
$$\begin{aligned} \operatorname{Pr}\left(E_{1} E_{2}\right) &=\operatorname{Pr}\left(E_{1} \mid E_{2}\right) \cdot \operatorname{Pr}\left(E_{2}\right) \\ &=\operatorname{Pr}\left(E_{2} \mid E_{1}\right) \cdot \operatorname{Pr}\left(E_{1}\right) . \end{aligned}$$
In cases where $E_{1}$ and $E_{2}$ are statistically independent, $E_{1} \perp E_{2}$, the conditional probabilities are equal to the marginals,
$$E_{1} \perp E_{2} \begin{cases}\operatorname{Pr}\left(E_{1} \mid E_{2}\right)=\operatorname{Pr}\left(E_{1}\right) \\ \operatorname{Pr}\left(E_{2} \mid E_{1}\right)=\operatorname{Pr}\left(E_{2}\right) .\end{cases}$$
Note: statistical independence between a pair of events implies that learning about one does not change the probability of the other.
In the special case of statistically independent events, the joint probability reduces to the product of the marginals,
$$\operatorname{Pr}\left(E_{1} E_{2}\right)=\operatorname{Pr}\left(E_{1}\right) \cdot \operatorname{Pr}\left(E_{2}\right)$$
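Conditional probabilities and the independence test can be sketched in a few lines. The example again assumes a fair six-sided die, and the helper names `pr_given` and `independent` are illustrative. The conditional is computed by rearranging the product rule, $\operatorname{Pr}(E_1 \mid E_2)=\operatorname{Pr}(E_1 E_2) / \operatorname{Pr}(E_2)$, which requires $\operatorname{Pr}(E_2)>0$.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # faces of a fair die (assumed equiprobable)


def pr(event):
    """Marginal probability under equiprobable elementary events."""
    return Fraction(len(event), len(S))


def pr_given(a, b):
    """Pr(a | b) = Pr(a b) / Pr(b); only defined when Pr(b) > 0."""
    return pr(a & b) / pr(b)


def independent(a, b):
    """True if the joint probability equals the product of the marginals."""
    return pr(a & b) == pr(a) * pr(b)


E1 = frozenset({2, 4, 6})     # the roll is even
E2 = frozenset({1, 2, 3, 4})  # the roll is at most 4
E3 = frozenset({4, 5, 6})     # the roll is greater than 3

print(pr_given(E1, E2), pr(E1))  # 1/2 1/2, conditional equals marginal
print(independent(E1, E2))       # True
print(pr_given(E1, E3), pr(E1))  # 2/3 1/2, knowing E3 changes Pr(E1)
print(independent(E1, E3))       # False
```

Knowing that the roll is at most 4 leaves the probability of an even roll unchanged, whereas knowing that it is greater than 3 raises it from 1/2 to 2/3.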
The joint probability for $n$ events can be broken down into $n-1$ conditionals and one marginal probability using the chain rule,
$$\begin{aligned} \operatorname{Pr}\left(E_{1} E_{2} \cdots E_{n}\right) &=\operatorname{Pr}\left(E_{1} \mid E_{2} \cdots E_{n}\right) \operatorname{Pr}\left(E_{2} \cdots E_{n}\right) \\ &=\operatorname{Pr}\left(E_{1} \mid E_{2} \cdots E_{n}\right) \operatorname{Pr}\left(E_{2} \mid E_{3} \cdots E_{n}\right) \operatorname{Pr}\left(E_{3} \cdots E_{n}\right) \\ &=\operatorname{Pr}\left(E_{1} \mid E_{2} \cdots E_{n}\right) \operatorname{Pr}\left(E_{2} \mid E_{3} \cdots E_{n}\right) \cdots \operatorname{Pr}\left(E_{n-1} \mid E_{n}\right) \operatorname{Pr}\left(E_{n}\right) . \end{aligned}$$
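The chain rule can be verified on a small example. The sketch below assumes two fair dice (36 equiprobable outcomes) and three arbitrary events chosen for illustration; `chain_rule` is a hypothetical helper that multiplies the $n-1$ conditionals and the final marginal, and it requires every conditioning event to have nonzero probability.

```python
from fractions import Fraction
from functools import reduce
from itertools import product

# Sampling space: ordered outcomes (a, b) of two fair dice (an assumption).
S = frozenset(product(range(1, 7), repeat=2))


def pr(event):
    """Probability under equiprobable elementary events."""
    return Fraction(len(event), len(S))


def intersect(events):
    """Intersection of all events; the whole space S for an empty list."""
    return reduce(lambda a, b: a & b, events, S)


def chain_rule(events):
    """Pr(E1|E2..En) * Pr(E2|E3..En) * ... * Pr(En-1|En) * Pr(En)."""
    result = Fraction(1)
    for k in range(len(events)):
        tail = intersect(events[k + 1:])  # E_{k+1} ... E_n, or S for the last term
        result *= pr(events[k] & tail) / pr(tail)
    return result


E1 = frozenset(o for o in S if o[0] + o[1] >= 8)  # sum is at least 8
E2 = frozenset(o for o in S if o[0] % 2 == 0)     # first die is even
E3 = frozenset(o for o in S if o[1] >= 3)         # second die is at least 3

print(chain_rule([E1, E2, E3]))     # 2/9
print(pr(intersect([E1, E2, E3])))  # 2/9, computed directly
```

The last factor in the loop conditions on the whole sampling space, so it reduces to the marginal $\operatorname{Pr}(E_n)$.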
Let us define a set of mutually exclusive and collectively exhaustive events $\{E_{1}, E_{2}, E_{3}, \cdots, E_{n}\} \in \mathcal{S}$, that is, $E_{i} E_{j}=\emptyset, \forall i \neq j$, and $\cup_{i=1}^{n} E_{i}=\mathcal{S}$, together with an event $A$ belonging to the same sampling space, that is, $A \in \mathcal{S}$. This context is illustrated using a Venn diagram in figure 3.7. The probability of the event $A$ can be obtained by summing the joint probability of $A$ and each event $E_{i}$,
$$\operatorname{Pr}(A)=\sum_{i=1}^{n} \underbrace{\operatorname{Pr}\left(A \mid E_{i}\right) \cdot \operatorname{Pr}\left(E_{i}\right)}_{\operatorname{Pr}\left(A E_{i}\right)} .$$
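The summation above can be sketched with the four damage states of the earthquake example playing the role of the mutually exclusive and collectively exhaustive events $E_i$. All numbers below are hypothetical and chosen only for illustration, as is the event $A$ ("the structure must be closed") and the helper name `total_probability`.

```python
from fractions import Fraction as F

# Hypothetical marginals Pr(E_i) for the damage states N, L, I, C.
pr_E = {"N": F(60, 100), "L": F(25, 100), "I": F(10, 100), "C": F(5, 100)}

# Hypothetical conditionals Pr(A | E_i), with A = "the structure must be closed".
pr_A_given_E = {"N": F(0), "L": F(1, 10), "I": F(8, 10), "C": F(1)}


def total_probability(conditionals, marginals):
    """Pr(A) = sum_i Pr(A | E_i) * Pr(E_i) over an exhaustive partition."""
    if sum(marginals.values()) != 1:
        raise ValueError("marginals must sum to one")
    return sum(conditionals[e] * marginals[e] for e in marginals)


pr_A = total_probability(pr_A_given_E, pr_E)
print(pr_A)         # 31/200
print(float(pr_A))  # 0.155
```

Each term of the sum is a joint probability $\operatorname{Pr}(A E_i)$, so the check that the marginals sum to one guards against a partition that is not collectively exhaustive.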


