## Score metrics

A scoring criterion for a DAG is a function that assigns a value to each DAG based on the data. Cooper and Herskovits (1992) proposed a score based on a Bayesian approach with Dirichlet priors (known as BD, for Bayesian Dirichlet). Starting from a prior distribution $P(B)$ over the possible structures, the objective is to express the posterior probability of each structure, $P(B \mid D)$, or more simply the joint probability $P(B, D)$, which is proportional to it for a fixed dataset $D$:
$$S_{B D}(B, D)=P(B, D)=\int_{\Theta} P(D \mid \Theta, B) P(\Theta \mid B) P(B) d \Theta=P(B) \int_{\Theta} P(D \mid \Theta, B) P(\Theta \mid B) d \Theta$$
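The BD score can be expressed analytically as:
$$S_{B D}(B, D)=P(B) \prod_{i=1}^n \prod_{j=1}^{q_i} \frac{\left(r_i-1\right) !}{\left(N_{i j}+r_i-1\right) !} \prod_{k=1}^{r_i} N_{i j k} !$$
where $n$ is the number of variables, $r_i$ is the number of states of $X_i$, $q_i$ is the number of configurations of the parents of $X_i$, $N_{i j k}$ is the number of cases in $D$ in which $X_i$ takes its $k$-th state while its parents take their $j$-th configuration, and $N_{i j}=\sum_{k=1}^{r_i} N_{i j k}$.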
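As a minimal sketch of how the closed form above might be evaluated, the following Python function computes the logarithm of the BD score from counts. It works in log space with `lgamma` so that the factorials do not overflow. The function name `log_bd_score`, the data layout (a list of dicts mapping variable names to state indices) and the `parents` and `arity` arguments are illustrative assumptions, not part of the original text.

```python
from collections import Counter
from math import lgamma


def log_bd_score(data, parents, arity, log_prior=0.0):
    """Log of the BD score for fully observed discrete data.

    data      : list of dicts, variable name -> state index (assumed layout)
    parents   : dict, variable name -> tuple of parent names (the structure B)
    arity     : dict, variable name -> number of states r_i
    log_prior : log P(B); 0.0 corresponds to an unnormalised uniform prior
    """
    score = log_prior
    for var, pa in parents.items():
        r = arity[var]
        # N_ijk: cases where the parents are in configuration j and X_i = k
        n_ijk = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        # N_ij: cases where the parents are in configuration j
        n_ij = Counter()
        for (cfg, _), count in n_ijk.items():
            n_ij[cfg] += count
        # log[(r_i - 1)! / (N_ij + r_i - 1)!] for every observed configuration;
        # unobserved configurations contribute a factor of 1 and are skipped.
        for total in n_ij.values():
            score += lgamma(r) - lgamma(total + r)
        # log(N_ijk!) for every observed (configuration, state) pair
        for count in n_ijk.values():
            score += lgamma(count + 1)
    return score
```

On a toy dataset where two binary variables always agree, this sketch gives the structure with an arc between them a higher score than the empty structure, which is the behaviour one would expect from the metric.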
The BIC (Bayesian Information Criterion) score metric was proposed by Schwarz (1978) and is defined as:
$$S_{B I C}=\log L\left(D \mid \theta^{M V}, B\right)-\frac{1}{2} \operatorname{Dim}(B) \log N$$
where $\theta^{M V}$ is the maximum likelihood estimate of the parameters, $B$ is the BN structure, $N$ is the number of cases in $D$, and $\operatorname{Dim}(B)$ is the dimension of the network, defined by $\operatorname{Dim}(B)=\sum_{i=1}^n \operatorname{Dim}\left(X_i, B\right)$ with $\operatorname{Dim}\left(X_i, B\right)=\left(r_i-1\right) q_i$.
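The BIC definition can be illustrated with a short Python sketch. It estimates each parameter by its relative frequency $N_{ijk}/N_{ij}$ (the maximum likelihood estimate for multinomial variables), sums the log-likelihood, and subtracts the dimension penalty. The function name `bic_score` and the data layout are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def bic_score(data, parents, arity):
    """BIC score of a structure for fully observed discrete data.

    data    : list of dicts, variable name -> state index (assumed layout)
    parents : dict, variable name -> tuple of parent names (the structure B)
    arity   : dict, variable name -> number of states r_i
    """
    n_cases = len(data)
    log_lik = 0.0
    dim = 0
    for var, pa in parents.items():
        n_ijk = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        n_ij = Counter()
        for (cfg, _), count in n_ijk.items():
            n_ij[cfg] += count
        # Log-likelihood at the ML estimate theta_ijk = N_ijk / N_ij
        log_lik += sum(c * log(c / n_ij[cfg]) for (cfg, _), c in n_ijk.items())
        # Dim(X_i, B) = (r_i - 1) * q_i, with q_i the product of parent arities
        q = 1
        for p in pa:
            q *= arity[p]
        dim += (arity[var] - 1) * q
    return log_lik - 0.5 * dim * log(n_cases)
```

The penalty term grows with the number of parent configurations, so an extra arc is only rewarded when the gain in likelihood outweighs the added parameters.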
Another common score in structure learning is the mutual information (MI). The mutual information between two random variables $X$ and $Y$, denoted $I(X, Y)$, is defined, following Chow and Liu (1968), as:
$$I(X, Y)=H(X)-H(X \mid Y)$$
where $H(X)$ is the entropy of the random variable $X$, defined as $H(X)=-\sum_{i=1}^{r_x} P\left(X=x_i\right) \log \left(P\left(X=x_i\right)\right)$, and $H(X \mid Y)$ is the conditional entropy, $H(X \mid Y)=-\sum_{i=1}^{r_x} \sum_{j=1}^{r_y} P\left(X=x_i, Y=y_j\right) \log \left(P\left(X=x_i \mid Y=y_j\right)\right)$. Here $r_x$ and $r_y$ are the numbers of discrete states of the variables $X$ and $Y$, respectively.
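A minimal Python sketch of the definition above, estimating all probabilities by relative frequencies in two paired samples, might look as follows. The function name `mutual_information` and the choice of the natural logarithm are assumptions; another base only rescales the result.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical I(X, Y) = H(X) - H(X | Y) from paired samples (in nats)."""
    n = len(xs)
    n_x = Counter(xs)
    n_y = Counter(ys)
    n_xy = Counter(zip(xs, ys))
    # H(X) = -sum_x P(x) log P(x)
    h_x = -sum((c / n) * log(c / n) for c in n_x.values())
    # H(X | Y) = -sum_{x,y} P(x, y) log P(x | y), with P(x | y) = N_xy / N_y
    h_x_given_y = -sum((c / n) * log(c / n_y[y]) for (_, y), c in n_xy.items())
    return h_x - h_x_given_y
```

When the two samples are identical the conditional entropy vanishes and the result equals $H(X)$; when they are empirically independent the result is zero.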

## Training the structure of a Bayesian network

Learning a Bayesian network can be broken up into two phases. In the first, the network structure is determined, either by an expert or, more often, automatically from observations of the studied domain. In the second, the set of parameters $\theta$ is defined, again either by an expert or by means of an algorithm.

The problem of learning the structure can be compared to data exploration, i.e. the extraction of knowledge (in our case, the network topology) from a database (Krause, 1999). Experts cannot always determine the structure of a Bayesian network, so in some cases finding the model is itself the problem to be solved. For example, in (Yu et al., 2002), learning the structure of a Bayesian network is used to identify the most obvious relationships between different genetic regulators in order to guide subsequent experiments.

The structure is then not only a part of the solution to the problem but a solution in itself.

Learning the structure of a Bayesian network may need to take into account the nature of the data provided for learning (or simply the nature of the modeled domain): continuous variables, which take their values in a continuous space (Cobb \& Shenoy, 2006; Lauritzen \& Wermuth, 1989; Lerner et al., 2001), or incomplete databases (Heckerman, 1995; Lauritzen, 1995). We assume in this work that the modeled variables take their values in a discrete set, that they are fully observed, and that there is no latent variable, i.e. no unobserved variable in the domain that is the parent of two or more observed variables.

The methods used for learning the structure of a Bayesian network can be divided into two main groups:

1. Discovery of independence relationships: these methods apply conditional independence tests to the data and use the results to derive a structure;
2. Exploration and evaluation: these methods use a score to evaluate how well a graph reproduces the conditional independencies of the modeled domain. A search algorithm builds a candidate solution and makes it evolve iteratively according to the value of the score.

Without being exhaustive, among the methods based on statistical tests we first mention the PC algorithm, a modification of the SGS algorithm (Spirtes et al., 2001). In this approach, given a graph $G=(X, E, \theta)$, two vertices $X_i$ and $X_j$ of $X$, and a subset of vertices $S_{X_i, X_j} \subseteq X \setminus\left\{X_i, X_j\right\}$, the vertices $X_i$ and $X_j$ are connected by an arc in $G$ if there is no $S_{X_i, X_j}$ such that $\left(X_i \perp X_j \mid S_{X_i, X_j}\right)$, where $\perp$ denotes conditional independence.
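The adjacency rule just stated can be sketched in Python as an exhaustive search over separating sets. This is a simplified illustration in the spirit of SGS, not the PC algorithm itself, which restricts the candidate sets to current neighbours and increases their size step by step. The names `skeleton` and `chain_oracle` are hypothetical, and the independence test is passed in as a callable; in practice it would be a statistical test on data rather than the hand-written oracle used here.

```python
from itertools import combinations


def skeleton(variables, is_independent):
    """Undirected skeleton: keep the edge X_i - X_j unless some subset S of
    the remaining variables makes X_i and X_j conditionally independent.

    is_independent(x, y, s) -> bool is an assumed test or oracle.
    """
    edges = set()
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        separated = any(
            is_independent(x, y, frozenset(s))
            for size in range(len(others) + 1)
            for s in combinations(others, size)
        )
        if not separated:
            edges.add(frozenset((x, y)))
    return edges


def chain_oracle(x, y, s):
    """Hypothetical oracle for the chain A -> B -> C: the only independence
    is A _||_ C given any set that contains B."""
    return {x, y} == {"A", "C"} and "B" in s


edges = skeleton(["A", "B", "C"], chain_oracle)
print(sorted(tuple(sorted(e)) for e in edges))  # [('A', 'B'), ('B', 'C')]
```

The exhaustive search examines every subset of the remaining variables for each pair, so its cost grows exponentially with the number of variables, which is the motivation for the restricted search used by PC.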


