Random Variables

Discrete Random Variables

In the case where $\mathcal{S}$ is a discrete domain, the probability that $X=x$ is described by a probability mass function (PMF). In terms of notation, $\operatorname{Pr}(X=x) \equiv p_{X}(x) \equiv p(x)$ are all equivalent. Moreover, we typically describe a random variable by defining its sampling space and its probability mass function, so that $x: X \sim p_{X}(x)$. The symbol $\sim$ reads as "distributed like." Analogously to the probability of events, the probability that $X=x$ must satisfy
$$0 \leq p_{X}(x) \leq 1$$
and the probabilities of all $x \in \mathcal{S}$ must sum to one,
$$\sum_{x} p_{X}(x)=1 .$$
For the post-earthquake structural safety example introduced in $\S 3.1$, where
$$\mathcal{S}=\left\{\begin{array}{rc} \text { no damage } & (\mathrm{N}) \\ \text { light damage } & (\mathrm{L}) \\ \text { important damage } & (\mathrm{I}) \\ \text { collapse } & (\mathrm{C}) \end{array}\right\},$$
the sampling space along with the probability of each event can be represented by a probability mass function as depicted in figure $3.10$.

The event of either light or important damage corresponds to $\mathrm{L} \cup \mathrm{I} \equiv\{1 \leq x \leq 2\}$. Because the events $x=1$ and $x=2$ are mutually exclusive, the probability of their union is the sum of their probabilities,
$$\begin{aligned} \operatorname{Pr}(\mathrm{L} \cup \mathrm{I}) &=\operatorname{Pr}(\{1 \leq X \leq 2\}) \\ &=p_{X}(x=1)+p_{X}(x=2) . \end{aligned}$$
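The two PMF conditions and the sum rule for mutually exclusive outcomes can be checked in a few lines. The sketch below is only illustrative: the probability values are assumptions made up for the example (they are not the values plotted in figure 3.10), and the helper names `is_valid_pmf` and `prob_union` are hypothetical.

```python
# Hypothetical PMF for the four damage states; the values are illustrative
# assumptions, not the ones shown in figure 3.10.
pmf = {"N": 0.60, "L": 0.25, "I": 0.10, "C": 0.05}


def is_valid_pmf(p, tol=1e-12):
    """Check 0 <= p(x) <= 1 for every x and that the probabilities sum to 1."""
    in_range = all(0.0 <= v <= 1.0 for v in p.values())
    sums_to_one = abs(sum(p.values()) - 1.0) <= tol
    return in_range and sums_to_one


def prob_union(p, states):
    """Probability of a union of mutually exclusive elementary outcomes."""
    return sum(p[s] for s in set(states))


# Pr(L u I) = p(L) + p(I) because the two outcomes cannot occur together.
pr_light_or_important = prob_union(pmf, ["L", "I"])
```

Using `set(states)` guards against counting the same outcome twice, which would otherwise overstate the probability of the union.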
The probability that $X$ takes a value less than or equal to $x$ is described by a cumulative mass function (CMF),
$$\operatorname{Pr}(X \leq x)=F_{X}(x)=\sum_{x^{\prime} \leq x} p_{X}\left(x^{\prime}\right)$$
Figure $3.11$ presents the probability mass function (PMF) and the cumulative mass function (CMF) on the same graph. As its name indicates, the CMF corresponds to the cumulative sum of the PMF. Conversely, the PMF can be obtained from the CMF following
$$p_{X}\left(x_{i}\right)=F_{X}\left(x_{i}\right)-F_{X}\left(x_{i-1}\right) .$$
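The relation between the PMF and the CMF maps directly onto a cumulative sum and a first difference. The following sketch assumes the four states are encoded as $x = 0, 1, 2, 3$ and uses made-up probabilities; both are assumptions for illustration only.

```python
import numpy as np

# Hypothetical PMF over the encoded states x = 0, 1, 2, 3 (illustrative values).
x = np.array([0, 1, 2, 3])
pmf = np.array([0.60, 0.25, 0.10, 0.05])

# CMF: F_X(x_i) is the sum of p_X(x') over all x' <= x_i.
cmf = np.cumsum(pmf)

# PMF recovered from the CMF: p_X(x_i) = F_X(x_i) - F_X(x_{i-1}),
# taking F_X = 0 below the smallest state.
pmf_recovered = np.diff(cmf, prepend=0.0)
```

The `prepend=0.0` argument supplies the value of the CMF below the first state, so the first difference returns $p_X(x_0)$ itself.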

Multivariate Random Variables

It is common to study the joint occurrence of multiple phenomena. In the context of probability theory, this is done using multivariate random variables. $\mathbf{x}=\left[\begin{array}{llll}x_{1} & x_{2} & \cdots & x_{n}\end{array}\right]^{\top}$ is a column vector containing realizations of $n$ random variables $\mathbf{X}=\left[\begin{array}{llll}X_{1} & X_{2} & \cdots & X_{n}\end{array}\right]^{\top}$, so that $\mathbf{x}: \mathbf{X} \sim p_{\mathbf{X}}(\mathbf{x}) \equiv p(\mathbf{x})$ in the discrete case, or $\mathbf{x}: \mathbf{X} \sim f_{\mathbf{X}}(\mathbf{x}) \equiv f(\mathbf{x})$ in the continuous case. For the discrete case, the probability of the joint realization $\mathbf{x}$ is described by
$$p_{\mathbf{X}}(\mathbf{x})=\operatorname{Pr}\left(X_{1}=x_{1} \cap X_{2}=x_{2} \cap \cdots \cap X_{n}=x_{n}\right),$$
where $0 \leq p_{\mathbf{X}}(\mathbf{x}) \leq 1$. For the continuous case, it is
$$f_{\mathbf{X}}(\mathbf{x}) \Delta \mathbf{x}=\operatorname{Pr}\left(x_{1} < X_{1} \leq x_{1}+\Delta x_{1} \cap \cdots \cap x_{n} < X_{n} \leq x_{n}+\Delta x_{n}\right),$$
where $f_{\mathbf{X}}(\mathbf{x})$ can be greater than $1$ because it describes a probability density. As mentioned earlier, two random variables $X_{1}$ and $X_{2}$ are statistically independent $(\perp)$ if
$$p_{X_{1} \mid X_{2}}\left(x_{1} \mid x_{2}\right)=p_{X_{1}}\left(x_{1}\right) .$$
If $X_{1} \perp X_{2} \perp \cdots \perp X_{n}$, the joint PMF is defined by the product of its marginals,
$$p_{X_{1}: X_{n}}\left(x_{1}, \cdots, x_{n}\right)=p_{X_{1}}\left(x_{1}\right) p_{X_{2}}\left(x_{2}\right) \cdots p_{X_{n}}\left(x_{n}\right) .$$
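For two discrete variables, the product-of-marginals rule amounts to an outer product. The sketch below uses made-up marginal PMFs (an assumption for illustration) and then confirms that conditioning on $X_2$ leaves the distribution of $X_1$ unchanged, which is the definition of independence given above.

```python
import numpy as np

# Hypothetical marginal PMFs for two independent discrete variables.
p_x1 = np.array([0.7, 0.3])        # p_{X1}(x1)
p_x2 = np.array([0.5, 0.3, 0.2])   # p_{X2}(x2)

# Under independence the joint PMF is the product of the marginals:
# joint[i, j] = p_{X1}(i) * p_{X2}(j).
joint = np.outer(p_x1, p_x2)

# Conditional p(x1 | x2) = p(x1, x2) / p(x2); dividing by the column sums
# normalizes each column, and every column should then equal p_x1.
marg_x2 = joint.sum(axis=0)
cond_x1_given_x2 = joint / marg_x2
```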
For the general case where $X_{1}, X_{2}, \cdots, X_{n}$ are not statistically independent, their joint PMF can be defined using the chain rule,
$$\begin{aligned}
p_{X_{1}: X_{n}}\left(x_{1}, \cdots, x_{n}\right)=\;& p_{X_{1} \mid X_{2}: X_{n}}\left(x_{1} \mid x_{2}, \cdots, x_{n}\right) \cdots \\
& \cdot p_{X_{n-1} \mid X_{n}}\left(x_{n-1} \mid x_{n}\right) \cdot p_{X_{n}}\left(x_{n}\right) .
\end{aligned}$$
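The chain rule can be verified numerically on a small joint table. The sketch below assumes a randomly generated joint PMF over three binary variables (purely illustrative), computes the factors $p(x_1 \mid x_2, x_3)$, $p(x_2 \mid x_3)$, and $p(x_3)$ by marginalizing and dividing, and multiplies them back together.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint PMF over three binary variables, indexed joint[x1, x2, x3].
joint = rng.random((2, 2, 2))
joint /= joint.sum()

# Factors of the chain rule p(x1, x2, x3) = p(x1 | x2, x3) p(x2 | x3) p(x3).
p_x3 = joint.sum(axis=(0, 1))        # shape (2,):   p(x3)
p_x2x3 = joint.sum(axis=0)           # shape (2, 2): p(x2, x3)
p_x2_given_x3 = p_x2x3 / p_x3        # divides each x3 column by p(x3)
p_x1_given_x2x3 = joint / p_x2x3     # broadcasts over the x1 axis

# Multiplying the factors recovers the original joint PMF.
reconstructed = p_x1_given_x2x3 * p_x2_given_x3 * p_x3
```

Each conditional factor sums to one over its own variable (the first axis here), which is a quick sanity check that the division was done along the intended axis.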
The same rules apply for continuous random variables, except that $p_{\mathbf{X}}(\mathbf{x})$ is replaced by $f_{\mathbf{X}}(\mathbf{x})$. Figure $3.14$ presents examples of marginals and a bivariate joint probability density function. The multivariate cumulative distribution function describes the probability that a set of $n$ random variables is simultaneously less than or equal to $\mathbf{x}$,
$$F_{\mathbf{X}}(\mathbf{x})=\operatorname{Pr}\left(X_{1} \leq x_{1} \cap \cdots \cap X_{n} \leq x_{n}\right) .$$

Functions of Random Variables

Let us consider a continuous random variable $X \sim f_{X}(x)$ and a monotonic deterministic function $y=g(x)$. The function's output $Y$ is a random variable because it takes as input the random variable $X$. The PDF $f_{Y}(y)$ is defined knowing that for each infinitesimal part of the domain $d x$, there is a corresponding $d y$, and the probability over both domains must be equal,
$$\begin{aligned}
\operatorname{Pr}(y < Y \leq y+d y) &=\operatorname{Pr}(x < X \leq x+d x) \\
\underbrace{f_{Y}(y)}_{\geq 0} d y &=\underbrace{f_{X}(x)}_{\geq 0} d x .
\end{aligned}$$
The change-of-variable rule for $f_{Y}(y)$ is defined by
$$\begin{aligned}
f_{Y}(y) &=f_{X}(x)\left|\frac{d x}{d y}\right| \\
&=f_{X}(x)\left|\frac{d y}{d x}\right|^{-1} \\
&=f_{X}\left(g^{-1}(y)\right)\left|\frac{d g\left(g^{-1}(y)\right)}{d x}\right|^{-1},
\end{aligned}$$
where multiplying by $\frac{d x}{d y}$ accounts for the change in the size of the neighborhood of $x$ with respect to $y$, and where the absolute value ensures that $f_{Y}(y) \geq 0$. For a function $y=g(x)$ and its inverse $x=g^{-1}(y)$, the gradient is obtained from
$$\frac{d y}{d x} \equiv \frac{d g(x)}{d x} \equiv \frac{d g(\overbrace{g^{-1}(y)}^{=x})}{d x} .$$
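The change-of-variable rule can be checked numerically. The sketch below assumes, purely for illustration, that $X$ is standard normal and that $g(x)=e^{x}$; neither choice comes from the text. It evaluates $f_Y$ from the rule, confirms that the result integrates to approximately one, and locates the mode of $f_Y$.

```python
import numpy as np


def f_x(x):
    """Standard normal PDF, used here as an assumed example for f_X."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


def g_inv(y):
    """Inverse of the illustrative monotonic map y = g(x) = exp(x)."""
    return np.log(y)


def dg_dx(x):
    """Gradient dy/dx of g(x) = exp(x)."""
    return np.exp(x)


def f_y(y):
    """Change-of-variable rule: f_Y(y) = f_X(g^{-1}(y)) * |dg/dx|^{-1}."""
    x = g_inv(y)
    return f_x(x) / np.abs(dg_dx(x))


# Numerical check that f_Y integrates to ~1 on a wide grid of y > 0.
y = np.linspace(1e-6, 200.0, 400_001)
fy = f_y(y)
area = float(np.sum(0.5 * (fy[1:] + fy[:-1]) * np.diff(y)))  # trapezoidal rule

# Location of the maximum of f_Y on the grid.
y_mode = float(y[np.argmax(fy)])
```

In this example the mode of $f_X$ is at $x=0$, so $g(0)=1$, whereas `y_mode` lands near $e^{-1}$: the maximum of the transformed density is not the transformed maximum.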
Figure $3.19$ presents an example of nonlinear transformation $y=g(x)$. Notice how, because of the nonlinear transformation, the maximum of $f_{X}\left(x^{*}\right)$ and the maximum of $f_{Y}\left(y^{*}\right)$ do not occur at the same locations, that is, $y^{*} \neq g\left(x^{*}\right)$.

Given a set of $n$ random variables $\mathbf{x} \in \mathbb{R}^{n}: \mathbf{X} \sim f_{\mathbf{X}}(\mathbf{x})$, we can generalize the transformation rule for an $n$ to $n$ multivariate function $\mathbf{y}=g(\mathbf{x})$, as illustrated in figure $3.20$a for a case where $n=2$. As with the univariate case, we need to account for the change in the neighborhood size when going from the original to the transformed space, as illustrated in figure $3.20$b. The transformation is then defined by
$$\begin{aligned}
f_{\mathbf{Y}}(\mathbf{y}) d \mathbf{y} &=f_{\mathbf{X}}(\mathbf{x}) d \mathbf{x} \\
f_{\mathbf{Y}}(\mathbf{y}) &=f_{\mathbf{X}}(\mathbf{x})\left|\frac{d \mathbf{x}}{d \mathbf{y}}\right|,
\end{aligned}$$
where $\left|\frac{d \mathbf{x}}{d \mathbf{y}}\right|$ is the inverse of the absolute value of the determinant of the Jacobian matrix,
$$\begin{aligned}
\left|\frac{d \mathbf{x}}{d \mathbf{y}}\right| &=\left|\operatorname{det} \mathbf{J}_{\mathbf{y}, \mathbf{x}}\right|^{-1} \\
\left|\frac{d \mathbf{y}}{d \mathbf{x}}\right| &=\left|\operatorname{det} \mathbf{J}_{\mathbf{y}, \mathbf{x}}\right| .
\end{aligned}$$
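For a linear map the Jacobian is constant, which makes the multivariate rule easy to test. The sketch below assumes, for illustration only, two independent standard normal variables and an arbitrary invertible matrix `A` defining $\mathbf{y}=\mathbf{A}\mathbf{x}$, so that $\mathbf{J}_{\mathbf{y},\mathbf{x}}=\mathbf{A}$. The result is compared with the closed-form Gaussian density with covariance $\mathbf{A}\mathbf{A}^{\top}$, which serves only as a reference value.

```python
import numpy as np


def f_x(x):
    """PDF of two independent standard normals (assumed example for f_X)."""
    return np.exp(-0.5 * x @ x) / (2.0 * np.pi)


# Illustrative invertible linear map y = g(x) = A x; its Jacobian dy/dx is A.
A = np.array([[2.0, 0.5],
              [0.0, 1.5]])
jac_det = np.linalg.det(A)  # det J_{y,x}


def f_y(y):
    """f_Y(y) = f_X(x) * |det J_{y,x}|^{-1}, with x = g^{-1}(y)."""
    x = np.linalg.solve(A, y)
    return f_x(x) / abs(jac_det)


def gaussian_pdf(y, cov):
    """Closed-form zero-mean bivariate Gaussian PDF, used only as a reference."""
    quad = y @ np.linalg.solve(cov, y)
    return np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))


y_test = np.array([1.0, -0.5])
value_rule = f_y(y_test)
value_ref = gaussian_pdf(y_test, A @ A.T)  # Y = A X has covariance A A^T
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of `A` explicitly; for a nonlinear $g$ the Jacobian would have to be evaluated at each $\mathbf{x}$ instead of once.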