## Principle of semBnet

This section explains the working principle of semBnet with respect to the following two major aspects, in the context of the spatial time series prediction scenario described in Sect. 5.3:

• Parameter learning
• Inference generation

semBnet extends standard Bayesian network analysis by incorporating domain knowledge, represented as a semantic hierarchy [5]. In spatio-temporal prediction, the semantic hierarchy is built over the concepts of the spatial domain and serves as the knowledge base for incorporating domain semantics into standard Bayesian analysis.

Typically, a semBnet consists of a qualitative component, comprising a causal dependency graph (CDG), and a quantitative component, comprising the conditional probability distribution of each node in the CDG.

Formally, the qualitative component of semBnet can be defined as a directed acyclic graph $G\left(V_{O}, V_{S}, E\right)$, where $V_{O}$ is the set of nodes representing random variables with no available semantics, $V_{S}$ is the set of nodes representing random variables with available semantics, and $E$ is the set of edges between nodes in $\left(V_{O} \cup V_{S}\right)$. An edge from $V_{i} \in\left(V_{O} \cup V_{S}\right)$ to $V_{j} \in\left(V_{O} \cup V_{S}\right)$ indicates that variable $V_{i}$ influences variable $V_{j}$.
The quantitative component of semBnet, i.e., the conditional probability distribution of any node $V_{x}$, is represented as $P^{\dagger}\left(V_{x} \mid \text{Parents}\left(V_{x}\right)\right)$ if $V_{x} \in V_{S}$ or $\left(\text{Parents}\left(V_{x}\right) \cap V_{S}\right) \neq \emptyset$ (or both), where $\text{Parents}\left(V_{x}\right)$ denotes the set of parents, i.e., the nodes influencing the target node $V_{x}$. Otherwise, the conditional probability is represented as in a standard BN, i.e., $P\left(V_{x} \mid \text{Parents}\left(V_{x}\right)\right)$.
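The rule for choosing between the semantic and the standard conditional distribution can be sketched in a few lines of Python. This is only an illustration of the definition above, not the authors' implementation: the class name `SemBnetGraph`, its methods, and the assignment of `LULC` to $V_S$ are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class SemBnetGraph:
    """Illustrative container for the qualitative component G(V_O, V_S, E)."""
    v_o: Set[str]                       # variables without available semantics
    v_s: Set[str]                       # variables with a semantic hierarchy
    edges: Set[Tuple[str, str]] = field(default_factory=set)  # (parent, child)

    def parents(self, node: str) -> Set[str]:
        """Return the set of nodes with an edge into `node`."""
        return {p for (p, c) in self.edges if c == node}

    def uses_semantic_cpd(self, node: str) -> bool:
        """True if the node or at least one of its parents belongs to V_S."""
        return node in self.v_s or bool(self.parents(node) & self.v_s)


# Hypothetical example: rainfall (R) influenced by LULC, Elev and Lat,
# assuming only LULC has a semantic hierarchy.
g = SemBnetGraph(
    v_o={"Elev", "Lat", "R"},
    v_s={"LULC"},
    edges={("LULC", "R"), ("Elev", "R"), ("Lat", "R")},
)
r_is_semantic = g.uses_semantic_cpd("R")        # R has a parent in V_S
elev_is_semantic = g.uses_semantic_cpd("Elev")  # no semantics, no parents
```

Under these assumptions `R` falls under the semantic case because one of its parents is in $V_S$, while `Elev` keeps the standard BN distribution.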

## Parameter Learning

This section illustrates the principle of semBnet learning in terms of marginal and conditional probability estimation.

For any node $V_{x} \in V_{O}$, the marginal probability $P\left(V_{x}\right)$ is estimated as in a standard Bayesian network. However, if the node $V_{x}$ has available semantics (i.e., $V_{x} \in V_{S}$), the marginal probability is estimated as follows:
$$P^{\dagger}\left(v_{x}\right)=\gamma \cdot\left[P\left(v_{x}\right)+\sum_{v_{x c}} S S\left(v_{x}, v_{x c}\right) \cdot P\left(v_{x c}\right)\right]$$
where $v_{x}$ and $v_{x c}$ are any two domain values of $V_{x} \in V_{S}$ such that $v_{x} \neq v_{x c}$; $P\left(v_{x}\right)$ denotes the standard probability of $v_{x}$; $\gamma$ is the normalization constant; and $S S\left(v_{x}, v_{x c}\right)$ denotes the semantic similarity between $v_{x}$ and $v_{x c}$.
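As a rough illustration of this estimate, the sketch below computes $P^{\dagger}$ for every domain value of one variable. It assumes that $\gamma$ is chosen so that the adjusted probabilities sum to one. The function name, the probabilities, and the similarity values are all hypothetical and serve only to exercise the formula.

```python
def semantic_marginal(p, ss):
    """Semantically adjusted marginal for each domain value.

    p  : dict mapping domain value -> standard probability P(v)
    ss : function (v, vc) -> semantic similarity SS(v, vc)
    gamma is ASSUMED to normalise the result so that it sums to 1.
    """
    raw = {
        v: p[v] + sum(ss(v, vc) * p[vc] for vc in p if vc != v)
        for v in p
    }
    gamma = 1.0 / sum(raw.values())
    return {v: gamma * r for v, r in raw.items()}


# Hypothetical standard probabilities and similarity table.
p_std = {"Forest": 0.5, "Grassland": 0.3, "BuiltUp": 0.2}
sim_table = {frozenset({"Forest", "Grassland"}): 0.36}


def ss_lookup(a, b):
    # Pairs missing from the table are treated as having similarity 0.
    return sim_table.get(frozenset({a, b}), 0.0)


p_dagger = semantic_marginal(p_std, ss_lookup)
```

In this toy setting `BuiltUp` receives no contribution from other values, so its share shrinks after normalization, while `Forest` and `Grassland` reinforce each other.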

In order to estimate the semantic similarity between any two concepts, semBnet needs a semantic knowledge base in the form of a semantic hierarchy (see Fig. 5.2). Assuming that a variable $X$ has a semantic hierarchy available over its concepts, the semantic similarity between any two of its concepts $x_{c_{1}}$ and $x_{c_{2}}$ is calculated using the measure defined in [11]:
$$S S\left(x_{c_{1}}, x_{c_{2}}\right)=e^{-\delta l} \cdot \frac{e^{\lambda d}-e^{-\lambda d}}{e^{\lambda d}+e^{-\lambda d}}$$
where $d$ denotes the depth of the subsumer (the most immediate common ancestor) of the concepts $x_{c_{1}}$ and $x_{c_{2}}$ in the semantic hierarchy; $l$ is the length of the shortest path between the two concepts; and $\lambda>0$ and $\delta \geq 0$ are control parameters that scale the contributions of $d$ and $l$, respectively. As mentioned in [11], $\lambda$ and $\delta$ are usually set to $0.6$ and $0.2$, respectively.
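The fraction in this measure is the hyperbolic tangent of $\lambda d$, so the similarity can be written as $e^{-\delta l} \tanh(\lambda d)$. The following sketch computes $l$ and $d$ from a tree-shaped hierarchy stored as a child-to-parent mapping. The hierarchy, the helper names, and the convention that the root has depth 0 are assumptions for illustration; the source does not fix the depth convention.

```python
import math


def path_to_root(concept, parent):
    """List of nodes from `concept` up to the root (inclusive)."""
    path = [concept]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def length_and_depth(c1, c2, parent):
    """Return (l, d): shortest path length between c1 and c2 in the tree,
    and the depth of their subsumer (root depth ASSUMED to be 0)."""
    p1, p2 = path_to_root(c1, parent), path_to_root(c2, parent)
    steps_from_c2 = {node: j for j, node in enumerate(p2)}
    for i, node in enumerate(p1):
        if node in steps_from_c2:           # first hit is the subsumer
            l = i + steps_from_c2[node]
            d = len(path_to_root(node, parent)) - 1
            return l, d
    raise ValueError("concepts share no common ancestor")


def semantic_similarity(l, d, lam=0.6, delta=0.2):
    """SS = exp(-delta * l) * tanh(lam * d)."""
    return math.exp(-delta * l) * math.tanh(lam * d)


# Hypothetical land-cover hierarchy (child -> parent).
parent = {
    "Vegetation": "Land", "BuiltUp": "Land",
    "Forest": "Vegetation", "Grassland": "Vegetation",
}
l_fg, d_fg = length_and_depth("Forest", "Grassland", parent)
ss_fg = semantic_similarity(l_fg, d_fg)
ss_fb = semantic_similarity(*length_and_depth("Forest", "BuiltUp", parent))
```

With the root at depth 0, two concepts whose only common ancestor is the root get a similarity of zero, because $\tanh(0)=0$. A convention that places the root at depth 1 would give such pairs a small positive similarity instead.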

During conditional probability estimation, if the target variable $V_{x}$ does not have a semantic knowledge base available (i.e., $V_{x} \in V_{O}$) and none of its parents has one either (i.e., $\left(\text{Parents}\left(V_{x}\right) \cap V_{S}\right)=\emptyset$), then the conditional probability distribution $P\left(V_{x} \mid \text{Parents}\left(V_{x}\right)\right)$ is derived in the same way as in a standard BN. Otherwise, the available semantic information is used to estimate the conditional probabilities. The following three cases can arise during conditional probability estimation when domain semantics are available for at least one of the variables involved (the target and/or its parents):

Case I: $V_{x} \in V_{S}$ and $\left(\text{Parents}\left(V_{x}\right) \cap V_{S}\right)=\emptyset$. A similar case arises for the variable $V_{S}^{4}$ in Fig. 5.3.

## semBnet-Based Prediction

Once the inferred probability distribution for the target (query) variable is obtained, it is further processed to generate the predicted value of the variable. Considering the same rainfall prediction example as in the previous section, let $\text{infer}_{R}^{\text{semBnet}}$ be the semBnet-inferred rainfall range with the highest probability estimate, and let $\text{infer}_{R}^{\text{standardBN}}$ be the standard-Bayesian-network-inferred rainfall range with the highest probability estimate. Then $P^{\dagger}\left(\text{infer}_{R}^{\text{semBnet}} \mid LULC, Elev, Lat\right)=\max_{i}\left\{P^{\dagger}\left(R_{i} \mid LULC, Elev, Lat\right)\right\}$ and $P\left(\text{infer}_{R}^{\text{standardBN}} \mid LULC, Elev, Lat\right)=\max_{i}\left\{P\left(R_{i} \mid LULC, Elev, Lat\right)\right\}$, where $\text{infer}_{R}^{\text{semBnet}}=\left[LB_{R}^{\text{semBnet}}, UB_{R}^{\text{semBnet}}\right]$ and $\text{infer}_{R}^{\text{standardBN}}=\left[LB_{R}^{\text{standardBN}}, UB_{R}^{\text{standardBN}}\right]$ (since the inferred values are in the form of ranges). Here $LB$ and $UB$ denote the lower and upper bounds of the range, respectively.
Then, the predicted value of Rainfall ($\text{pred}_{R}$) is estimated as follows:
$$\text{pred}_{R}=\left[\frac{LB_{R}^{\text{semBnet}}+LB_{R}^{\text{standardBN}}}{2}, \frac{UB_{R}^{\text{semBnet}}+UB_{R}^{\text{standardBN}}}{2}\right]=\left[LB_{R}^{\text{pred}}, UB_{R}^{\text{pred}}\right]$$
In order to obtain a single value for Rainfall, one may use the mean of the predicted range: $\frac{LB_{R}^{\text{pred}}+UB_{R}^{\text{pred}}}{2}$.
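The prediction step can be illustrated with a short Python sketch. The two inferred distributions below are made-up numbers over hypothetical rainfall ranges, and the function names are assumptions. Only the arithmetic (pick the most probable range from each model, average the bounds, then take the midpoint) follows the text.

```python
def most_probable_range(dist):
    """Return the range (LB, UB) with the highest inferred probability."""
    return max(dist, key=dist.get)


def combine_ranges(sem_range, std_range):
    """Average the lower bounds and the upper bounds of the two ranges."""
    lb = (sem_range[0] + std_range[0]) / 2
    ub = (sem_range[1] + std_range[1]) / 2
    return lb, ub


def point_prediction(pred_range):
    """Midpoint of the predicted range."""
    return (pred_range[0] + pred_range[1]) / 2


# Hypothetical inferred distributions over rainfall ranges.
dist_sembnet = {(0, 50): 0.2, (50, 100): 0.5, (100, 150): 0.3}
dist_standard = {(0, 50): 0.1, (50, 100): 0.35, (100, 150): 0.55}

infer_sem = most_probable_range(dist_sembnet)    # (50, 100)
infer_std = most_probable_range(dist_standard)   # (100, 150)
pred_range = combine_ranges(infer_sem, infer_std)  # (75.0, 125.0)
pred_value = point_prediction(pred_range)          # 100.0
```

When both models select the same range, the combined range is simply that range and the point prediction is its midpoint.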

The remainder of the chapter presents a case study to validate the effectiveness of the semBnet-based prediction model when domain knowledge is available for the variables.

