### Statistical Learning and Decision Making | Inference in Bayesian Networks


## Inference in Bayesian Networks

In inference problems, we want to infer a distribution over query variables given some observed evidence variables. The other nodes are referred to as hidden variables. We often refer to the distribution over the query variables given the evidence as a posterior distribution.

To illustrate the computations involved in inference, recall the Bayesian network from example $2.5$, the structure of which is reproduced in figure $3.1$. Suppose we have $B$ as a query variable and evidence $D=1$ and $C=1$. The inference task is to compute $P\left(b^{1} \mid d^{1}, c^{1}\right)$, which corresponds to computing the probability that we have a battery failure given an observed trajectory deviation and communication loss.

From the definition of conditional probability introduced in equation (2.22), we know
$$P\left(b^{1} \mid d^{1}, c^{1}\right)=\frac{P\left(b^{1}, d^{1}, c^{1}\right)}{P\left(d^{1}, c^{1}\right)}$$

To compute the numerator, we must use a process known as marginalization, where we sum out variables that are not involved, in this case $S$ and $E$ :
$$P\left(b^{1}, d^{1}, c^{1}\right)=\sum_{s} \sum_{e} P\left(b^{1}, s, e, d^{1}, c^{1}\right)$$
We know from the chain rule for Bayesian networks introduced in equation (2.31) that
$$P\left(b^{1}, s, e, d^{1}, c^{1}\right)=P\left(b^{1}\right) P(s) P\left(e \mid b^{1}, s\right) P\left(d^{1} \mid e\right) P\left(c^{1} \mid e\right)$$
All of the components on the right-hand side are specified in the conditional probability distributions associated with the nodes in the Bayesian network. We can compute the denominator in equation (3.1) using the same approach, but with an additional summation over the values of $B$.
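As a sketch of the computation above, the numerator and denominator can be evaluated directly from the chain-rule product. The probability tables below are made up for illustration; the actual values are defined in example 2.5.

```python
# Hypothetical CPTs for the network in figure 3.1: B and S are parents
# of E, and E is the parent of D and C. All numbers are illustrative.
P_B1 = 0.01                                            # P(B = 1)
P_S1 = 0.02                                            # P(S = 1)
P_E1 = {(0, 0): 0.01, (0, 1): 0.5,
        (1, 0): 0.9,  (1, 1): 0.95}                    # P(E = 1 | B, S)
P_D1 = {0: 0.05, 1: 0.9}                               # P(D = 1 | E)
P_C1 = {0: 0.02, 1: 0.8}                               # P(C = 1 | E)

def joint(b, s, e, d, c):
    """Chain rule: P(b,s,e,d,c) = P(b) P(s) P(e|b,s) P(d|e) P(c|e)."""
    pb = P_B1 if b else 1 - P_B1
    ps = P_S1 if s else 1 - P_S1
    pe = P_E1[(b, s)] if e else 1 - P_E1[(b, s)]
    pd = P_D1[e] if d else 1 - P_D1[e]
    pc = P_C1[e] if c else 1 - P_C1[e]
    return pb * ps * pe * pd * pc

# Numerator: marginalize out the hidden variables S and E.
num = sum(joint(1, s, e, 1, 1) for s in (0, 1) for e in (0, 1))
# Denominator: the same sum, with an additional summation over B.
den = sum(joint(b, s, e, 1, 1)
          for b in (0, 1) for s in (0, 1) for e in (0, 1))
posterior = num / den                                  # P(b1 | d1, c1)
```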

This process of using the definition of conditional probability, marginalization, and applying the chain rule can be used to perform exact inference in any Bayesian network. We can implement exact inference using factors. Recall that factors represent discrete multivariate distributions. We use the following three operations on factors to achieve this:

• We use the factor product (algorithm 3.1) to combine two factors to produce a larger factor whose scope is the combined scope of the input factors. If we have $\phi(X, Y)$ and $\psi(Y, Z)$, then $\phi \cdot \psi$ will be over $X, Y$, and $Z$ with $(\phi \cdot \psi)(x, y, z)=$ $\phi(x, y) \psi(y, z)$. The factor product is demonstrated in example $3.1$.
• We use factor marginalization (algorithm 3.2) to sum out a particular variable from the entire factor table, removing it from the resulting scope. Example $3.2$ illustrates this process.
• We use factor conditioning (algorithm 3.3) with respect to some evidence to remove any rows in the table inconsistent with that evidence. Example $3.3$ demonstrates factor conditioning.
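The three operations above can be sketched in a few lines of Python. The representation here (a factor as a pair of a variable list and an assignment-to-value table, with all variables binary) and the function names are illustrative assumptions, not the book's implementation.

```python
from itertools import product as cartesian

def factor_product(phi, psi):
    """Combine phi(X,Y) and psi(Y,Z) into (phi*psi)(X,Y,Z)."""
    vars_phi, tab_phi = phi
    vars_psi, tab_psi = psi
    vars_out = vars_phi + [v for v in vars_psi if v not in vars_phi]
    table = {}
    for assign in cartesian((0, 1), repeat=len(vars_out)):
        a = dict(zip(vars_out, assign))
        table[assign] = (tab_phi[tuple(a[v] for v in vars_phi)]
                         * tab_psi[tuple(a[v] for v in vars_psi)])
    return vars_out, table

def factor_marginalize(phi, var):
    """Sum out var, removing it from the resulting scope."""
    vars_in, tab = phi
    i = vars_in.index(var)
    vars_out = vars_in[:i] + vars_in[i + 1:]
    table = {}
    for assign, val in tab.items():
        key = assign[:i] + assign[i + 1:]
        table[key] = table.get(key, 0.0) + val
    return vars_out, table

def factor_condition(phi, var, value):
    """Keep only rows consistent with the evidence var = value."""
    vars_in, tab = phi
    i = vars_in.index(var)
    vars_out = vars_in[:i] + vars_in[i + 1:]
    table = {assign[:i] + assign[i + 1:]: val
             for assign, val in tab.items() if assign[i] == value}
    return vars_out, table

# Example: phi(X,Y) and psi(Y,Z) combine into a factor over X, Y, Z.
phi = (["X", "Y"], {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4})
psi = (["Y", "Z"], {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.4})
prod_vars, prod_tab = factor_product(phi, psi)
marg_vars, marg_tab = factor_marginalize((prod_vars, prod_tab), "Y")
```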

## Inference in Naive Bayes Models

The previous section presented a general method for performing exact inference in any Bayesian network. This section discusses how this same method can be used to solve classification problems for a special kind of Bayesian network structure known as a naive Bayes model. This structure is shown in figure 3.2. An equivalent but more compact representation is shown in figure $3.3$ using a plate, shown as a rounded box. The $i=1:n$ at the bottom of the box specifies that the $i$ in the subscript of the variable name is repeated from 1 to $n$.

In the naive Bayes model, the class $C$ is the query variable, and the observed features $O_{1: n}$ are the evidence variables. The naive Bayes model is called naive because it assumes conditional independence between the evidence variables given the class. Using the notation introduced in section 2.6, we can say $\left(O_{i} \perp O_{j} \mid C\right)$ for all $i \neq j$. Of course, if these conditional independence assumptions do not hold, then we can add the necessary directed edges between the observed features.

We have to specify the prior $P(C)$ and the class-conditional distributions $P\left(O_{i} \mid C\right)$. As done in the previous section, we can apply the chain rule to compute the joint distribution:
$$P\left(c, o_{1: n}\right)=P(c) \prod_{i=1}^{n} P\left(o_{i} \mid c\right)$$
Our classification task involves computing the conditional probability $P\left(c \mid o_{1: n}\right)$. From the definition of conditional probability, we have
$$P\left(c \mid o_{1: n}\right)=\frac{P\left(c, o_{1: n}\right)}{P\left(o_{1: n}\right)}$$

We can compute the denominator by marginalizing the joint distribution:
$$P\left(o_{1: n}\right)=\sum_{c} P\left(c, o_{1: n}\right)$$
The denominator in equation (3.5) is not a function of $C$ and can therefore be treated as a constant. Hence, we can write
$$P\left(c \mid o_{1: n}\right)=\kappa P\left(c, o_{1: n}\right)$$
where $\kappa$ is a normalization constant such that $\sum_{c} P\left(c \mid o_{1: n}\right)=1$. We often drop $\kappa$ and write
$$P\left(c \mid o_{1: n}\right) \propto P\left(c, o_{1: n}\right)$$
where the proportional to symbol $\propto$ is used to represent that the left-hand side is proportional to the right-hand side. Example $3.4$ illustrates how inference can be applied to classifying radar tracks.

We can use this method to infer a distribution over classes, but for many applications, we have to commit to a particular class. It is common to classify according to the class with the highest posterior probability, $\arg \max_{c} P\left(c \mid o_{1: n}\right)$. However, choosing a class is really a decision problem that often should take into account the consequences of misclassification. For example, if we are interested in using our classifier to filter out targets that are not aircraft for the purpose of air traffic control, then we can afford to occasionally let a few birds and other clutter tracks through our filter. However, we would want to avoid filtering out any real aircraft because that could lead to a collision. In this case, we would probably only want to classify a track as a bird if the posterior probability were close to $1$. Decision problems will be discussed in chapter $6$.
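A minimal naive Bayes classifier over binary features can be sketched as follows. The classes and all probability values are made up for a radar-track-style illustration; they are not the numbers from the book's example 3.4.

```python
# Hypothetical model: P(C) and P(O_i = 1 | C) for three binary features.
prior = {"aircraft": 0.8, "bird": 0.2}          # P(C)
likelihood = {                                   # P(O_i = 1 | C)
    "aircraft": [0.9, 0.7, 0.1],
    "bird":     [0.2, 0.4, 0.8],
}

def posterior(obs):
    """Return P(c | o_{1:n}) for all classes, normalized by kappa."""
    unnorm = {}
    for c in prior:
        p = prior[c]                             # start with the prior
        for o, theta in zip(obs, likelihood[c]):
            p *= theta if o == 1 else 1 - theta  # chain-rule product
        unnorm[c] = p                            # proportional to P(c | o)
    kappa = 1.0 / sum(unnorm.values())           # normalization constant
    return {c: kappa * p for c, p in unnorm.items()}

post = posterior([1, 1, 0])
best = max(post, key=post.get)                   # highest-posterior class
```

Note that `best` implements the highest-posterior rule; as discussed above, a real filter might instead demand a posterior close to 1 before committing to the "bird" class.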

## Sum-Product Variable Elimination

A variety of methods can be used to perform efficient inference in more complicated Bayesian networks. One method is known as sum-product variable elimination, which interleaves eliminating hidden variables (summations) with applications of the chain rule (products). It is more efficient to marginalize variables out as early as possible to avoid generating large factors.

We will illustrate the variable elimination algorithm by computing the distribution $P\left(B \mid d^{1}, c^{1}\right)$ for the Bayesian network in figure $3.1$. The conditional probability distributions associated with the nodes in the network can be represented by the following factors:
$$\phi_{1}(B), \phi_{2}(S), \phi_{3}(E, B, S), \phi_{4}(D, E), \phi_{5}(C, E)$$
Because $D$ and $C$ are observed variables, the last two factors can be replaced with $\phi_{6}(E)$ and $\phi_{7}(E)$ by setting the evidence $D=1$ and $C=1$.

We then proceed by eliminating the hidden variables in sequence. Different strategies can be used for choosing an ordering, but for this example, we arbitrarily choose the ordering $E$ and then $S$. To eliminate $E$, we take the product of all the factors involving $E$ and then marginalize out $E$ to get a new factor:
$$\phi_{8}(B, S)=\sum_{e} \phi_{3}(e, B, S) \phi_{6}(e) \phi_{7}(e)$$
We can now discard $\phi_{3}, \phi_{6}$, and $\phi_{7}$ because all the information we need from them is contained in $\phi_{8}$.

Next, we eliminate $S$. Again, we gather all remaining factors that involve $S$ and marginalize out $S$ from the product of these factors:
$$\phi_{9}(B)=\sum_{s} \phi_{2}(s) \phi_{8}(B, s)$$
We discard $\phi_{2}$ and $\phi_{8}$, and are left with $\phi_{1}(B)$ and $\phi_{9}(B)$. Finally, we take the product of these two factors and normalize the result to obtain a factor representing $P\left(B \mid d^{1}, c^{1}\right) .$
The above procedure is equivalent to computing the following:
$$P\left(B \mid d^{1}, c^{1}\right) \propto \phi_{1}(B) \sum_{s}\left(\phi_{2}(s) \sum_{e}\left(\phi_{3}(e \mid B, s) \phi_{4}\left(d^{1} \mid e\right) \phi_{5}\left(c^{1} \mid e\right)\right)\right)$$
This produces the same result as, but is more efficient than, the naive procedure of taking the product of all of the factors and then marginalizing:
$$P\left(B \mid d^{1}, c^{1}\right) \propto \sum_{s} \sum_{e} \phi_{1}(B) \phi_{2}(s) \phi_{3}(e \mid B, s) \phi_{4}\left(d^{1} \mid e\right) \phi_{5}\left(c^{1} \mid e\right)$$
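The elimination ordering can be checked against the naive product-then-sum computation in a short sketch. The factor values below are hypothetical (the real tables come from example 2.5); all variables are binary, and factors are plain dicts keyed by assignments.

```python
# Hypothetical factors for figure 3.1 (values are illustrative only).
phi1 = {0: 0.99, 1: 0.01}                            # phi1(B) = P(B)
phi2 = {0: 0.98, 1: 0.02}                            # phi2(S) = P(S)
p_e1 = {(0, 0): 0.01, (0, 1): 0.5,
        (1, 0): 0.9,  (1, 1): 0.95}                  # P(E = 1 | B, S)
phi3 = {(e, b, s): (p_e1[(b, s)] if e else 1 - p_e1[(b, s)])
        for e in (0, 1) for b in (0, 1) for s in (0, 1)}
phi6 = {0: 0.05, 1: 0.9}    # phi4(D,E) conditioned on D = 1
phi7 = {0: 0.02, 1: 0.8}    # phi5(C,E) conditioned on C = 1

# Eliminate E: phi8(B,S) = sum_e phi3(e,B,S) phi6(e) phi7(e)
phi8 = {(b, s): sum(phi3[(e, b, s)] * phi6[e] * phi7[e] for e in (0, 1))
        for b in (0, 1) for s in (0, 1)}
# Eliminate S: phi9(B) = sum_s phi2(s) phi8(B,s)
phi9 = {b: sum(phi2[s] * phi8[(b, s)] for s in (0, 1)) for b in (0, 1)}
# Multiply by phi1(B) and normalize to get P(B | d1, c1).
unnorm = {b: phi1[b] * phi9[b] for b in (0, 1)}
z = sum(unnorm.values())
elim = {b: v / z for b, v in unnorm.items()}

# Naive approach: full product first, then marginalize. Same answer,
# but it materializes a table over all five variables.
naive = {b: sum(phi1[b] * phi2[s] * phi3[(e, b, s)] * phi6[e] * phi7[e]
                for s in (0, 1) for e in (0, 1)) for b in (0, 1)}
zn = sum(naive.values())
```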
