Entropies for Multivariate Joint Distributions

Let $\left\{p\left(x_{1}, \ldots, x_{n}\right)\right\}$ be a probability distribution on $X_{1} \times \ldots \times X_{n}$ for finite sets $X_{i}$. Let $S$ be a subset of $\left(X_{1} \times \ldots \times X_{n}\right)^{2}$ consisting of certain ordered pairs of ordered $n$-tuples $\left(\left(x_{1}, \ldots, x_{n}\right),\left(x_{1}^{\prime}, \ldots, x_{n}^{\prime}\right)\right)$. The product probability measure of $S$ is:
$$\mu(S)=\sum\left\{p\left(x_{1}, \ldots, x_{n}\right) p\left(x_{1}^{\prime}, \ldots, x_{n}^{\prime}\right):\left(\left(x_{1}, \ldots, x_{n}\right),\left(x_{1}^{\prime}, \ldots, x_{n}^{\prime}\right)\right) \in S\right\}$$

Then all the logical entropies for this $n$-variable case are given as the product measure of certain infosets $S$. Let $I, J \subseteq N$ be subsets of the set of all variables $N=\left\{X_{1}, \ldots, X_{n}\right\}$ and let $x=\left(x_{1}, \ldots, x_{n}\right)$ and $x^{\prime}=\left(x_{1}^{\prime}, \ldots, x_{n}^{\prime}\right)$.

Since two ordered $n$-tuples are different if they differ in some coordinate, the joint logical entropy of all the variables is $h\left(X_{1}, \ldots, X_{n}\right)=\mu\left(S_{\vee N}\right)$ where:
$$\begin{gathered} S_{\vee N}=\left\{\left(x, x^{\prime}\right): \vee_{i=1}^{n}\left(x_{i} \neq x_{i}^{\prime}\right)\right\}=\cup\left\{S_{X_{i}}: X_{i} \in N\right\} \text { where } \\ S_{X_{i}}=S_{x_{i} \neq x_{i}^{\prime}}=\left\{\left(x, x^{\prime}\right): x_{i} \neq x_{i}^{\prime}\right\} \end{gathered}$$
(where $\vee$ represents the disjunction of statements). For a non-empty $I \subseteq N$, the joint logical entropy of the variables in $I$ could be represented as $h(I)=\mu\left(S_{\vee I}\right)$ where:
$$S_{\vee I}=\left\{\left(x, x^{\prime}\right): \vee\left(x_{i} \neq x_{i}^{\prime}\right) \text { for } X_{i} \in I\right\}=\cup\left\{S_{X_{i}}: X_{i} \in I\right\}$$
so that $h\left(X_{1}, \ldots, X_{n}\right)=h(N)$.
As before, the information algebra $\mathcal{I}\left(X_{1} \times \ldots \times X_{n}\right)$ is the Boolean subalgebra of $\wp\left(\left(X_{1} \times \ldots \times X_{n}\right)^{2}\right)$ generated by the basic infosets $S_{X_{i}}$ for the variables and their complements $S_{\neg X_{i}}$.
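The definition $h(I)=\mu\left(S_{\vee I}\right)$ can be checked by brute force on a small case. The following is a minimal sketch, not a reference implementation: the function name `joint_logical_entropy`, the representation of $p$ as a dictionary keyed by tuples, and the three-point example distribution are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def joint_logical_entropy(p, I):
    """Return h(I) = mu(S_{vI}) for a finite joint distribution.

    p : dict mapping n-tuples x to probabilities p(x)
    I : iterable of coordinate indices (0-based) standing for the variables in I
    """
    I = list(I)
    total = Fraction(0)
    # Sum p(x) p(x') over all ordered pairs (x, x') that differ
    # in at least one coordinate belonging to I.
    for (x, px), (xp, pxp) in product(p.items(), repeat=2):
        if any(x[i] != xp[i] for i in I):
            total += px * pxp
    return total


# Hypothetical distribution on X1 x X2 (illustration only).
p = {
    (0, 0): Fraction(1, 2),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 4),
}

h_N = joint_logical_entropy(p, [0, 1])   # all variables
h_X1 = joint_logical_entropy(p, [0])     # first variable only
print(h_N, h_X1)

# With I = N, the pairs that differ somewhere are exactly the pairs with x != x',
# so h(N) must agree with 1 - sum of squared probabilities.
print(h_N == 1 - sum(q * q for q in p.values()))
```

Exact `Fraction` arithmetic is used so that the comparison with $1-\sum p(x)^{2}$ is not affected by floating-point rounding.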

For the conditional logical entropies, let $I, J \subseteq N$ be two non-empty disjoint subsets of $N$. The idea for the conditional entropy $h(I \mid J)$ is to represent the information in the variables $I$ given by the defining condition: $\vee\left(x_{i} \neq x_{i}^{\prime}\right)$ for $X_{i} \in I$, after taking away the information in the variables $J$ which is defined by the condition: $\vee\left(x_{j} \neq x_{j}^{\prime}\right)$ for $X_{j} \in J$. “After the bar $\mid$ ” means “negate” so we negate that condition $\vee\left(x_{j} \neq x_{j}^{\prime}\right)$ for $X_{j} \in J$ and add it to the condition for $I$ to obtain the conditional logical entropy as $h(I \mid J)=h(\vee I \mid \vee J)=\mu\left(S_{\vee I \mid \vee J}\right)$ (where $\wedge$ represents the conjunction of statements):
$$\begin{aligned} S_{\vee I \mid \vee J} &=\left\{\left(x, x^{\prime}\right): \vee\left(x_{i} \neq x_{i}^{\prime}\right) \text { for } X_{i} \in I \text { and } \wedge\left(x_{j}=x_{j}^{\prime}\right) \text { for } X_{j} \in J\right\} \\ &=\cup\left\{S_{X_{i}}: X_{i} \in I\right\}-\cup\left\{S_{X_{j}}: X_{j} \in J\right\}=S_{\vee I}-S_{\vee J} \end{aligned}$$
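The conditional infoset can be enumerated in the same brute-force way. The sketch below is illustrative only: the helper names `mu`, `differs`, and `conditional_logical_entropy`, and the example distribution, are assumptions and do not come from the text. It also checks the set identity $S_{\vee I}-S_{\vee J}=S_{\vee(I \cup J)}-S_{\vee J}$, which gives $h(I \mid J)=h(I \cup J)-h(J)$ because $S_{\vee J} \subseteq S_{\vee(I \cup J)}$.

```python
from fractions import Fraction
from itertools import product


def mu(p, in_infoset):
    """Product measure of the set of pairs (x, x') selected by in_infoset."""
    return sum(
        (px * pxp
         for (x, px), (xp, pxp) in product(p.items(), repeat=2)
         if in_infoset(x, xp)),
        Fraction(0),
    )


def differs(x, xp, idx):
    """True when x and x' differ in at least one coordinate listed in idx."""
    return any(x[i] != xp[i] for i in idx)


def conditional_logical_entropy(p, I, J):
    """h(I | J): pairs that differ somewhere in I and agree everywhere in J."""
    return mu(p, lambda x, xp: differs(x, xp, I) and not differs(x, xp, J))


# Hypothetical distribution on X1 x X2 (illustration only).
p = {
    (0, 0): Fraction(1, 2),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 4),
}

h_cond = conditional_logical_entropy(p, I=[0], J=[1])
h_joint = mu(p, lambda x, xp: differs(x, xp, [0, 1]))
h_J = mu(p, lambda x, xp: differs(x, xp, [1]))

print(h_cond, h_joint - h_J)  # the two values should coincide
```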

An Example of Negative Mutual Information

Norman Abramson gives an example [1, pp. 130-131] where the Shannon mutual information of three variables is negative. William Feller gives a similar concrete example that we will use [11, Exercise 26, p. 143]. Any probability textbook example showing that pairwise independence does not imply mutual independence for three or more random variables would do as well.

One fair die is thrown first and the result is recorded as 1 if odd or 0 if even (the number on the upturned face mod 2). Then the same is done with a second fair die, so the outcome space is $U=\{(0,0),(0,1),(1,0),(1,1)\}=\{0,1\} \times\{0,1\}$ (first die on the left and second die on the right). Let $X$ be the random variable for the outcome (0 or 1) of the first throw, $Y$ for the second throw, and $Z$ for the sum $X+Y \bmod 2$. Since $Z$ is a function of $X$ and $Y$, the two-draw outcome space is $U \times U=(X \times Y)^{2}$. So many Venn diagrams are illustrated rather symbolically, e.g., with circles for $h(X)$, $h(Y)$, and $h(Z)$, that it will be useful to give the actual Venn/box diagrams for this example.

The two-draw outcome space $U \times U$ can be represented as a $4 \times 4$ square with each square representing a point $\left((x, y),\left(x^{\prime}, y^{\prime}\right)\right)$ for the two trials with the dice. The points included in $h(X)$ (indicated with shading) are the pairs $\left((x, y),\left(x^{\prime}, y^{\prime}\right)\right)$ where $x \neq x^{\prime}$ and symmetrically for $h(Y)$. Since $Z=X+Y \bmod 2$, the shaded squares for $h(Z)$ are the squares $\left((x, y),\left(x^{\prime}, y^{\prime}\right)\right)$ where $x+y \neq x^{\prime}+y^{\prime} \bmod 2$ as shown in Fig. 4.2.

Since each point in $U \times U$ has the product probability $\frac{1}{4} \times \frac{1}{4}=\frac{1}{16}$ and the logical entropies are just the sum of the probabilities of the shaded squares, we see that $h(X)=h(Y)=h(Z)=\frac{8}{16}=\frac{1}{2}$. The joint logical entropies are the sum of the probabilities for the squares that are shaded in one or the other (or both), as shown in Fig. 4.3.
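The count of eight shaded squares per variable can be reproduced by enumerating the $4 \times 4$ grid directly. This is a small sketch under the setup described above; the names `U`, `CELL`, `variables`, and `shaded_cells` are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# Outcome space U = {0,1} x {0,1}; each point (x, y) has probability 1/4.
U = list(product([0, 1], repeat=2))
CELL = Fraction(1, 4) * Fraction(1, 4)  # probability of one cell of U x U

# The three random variables as functions of a single outcome (x, y).
variables = {
    "X": lambda x, y: x,
    "Y": lambda x, y: y,
    "Z": lambda x, y: (x + y) % 2,
}


def shaded_cells(f):
    """Cells ((x, y), (x', y')) of U x U on which the variable f takes different values."""
    return [(u, v) for u, v in product(U, repeat=2) if f(*u) != f(*v)]


# Logical entropy = number of shaded cells times the probability of a cell.
h = {name: len(shaded_cells(f)) * CELL for name, f in variables.items()}

for name, value in h.items():
    print(name, len(shaded_cells(variables[name])), value)
```

The same `shaded_cells` idea extends to joint entropies by shading a cell when either of two variables differs, which corresponds to taking the union of the shaded regions.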

Entropies for Countable Probability Distributions

The logical entropies for any discrete probability distributions, finite or countably infinite, can be illustrated with a logical entropy box diagram. A square with unit length sides can have the probabilities $p_{1}, p_{2}, \ldots$ marked off along the width and height so that squares $p_{i}^{2}$ and products $p_{i} p_{j}$ and $p_{j} p_{i}$ all correspond to the area of rectangles in the square. The sum of the areas of the squares along the diagonal is $\sum_{i} p_{i}^{2}$, so the logical entropy $h(p)=1-\sum_{i} p_{i}^{2}$ is the sum of the two equal areas of rectangles on either side of the diagonal. Figure 4.7 gives the logical entropy box diagram for the countable distribution $p_{1}=\frac{1}{2}, p_{2}=\left(\frac{1}{2}\right)^{2}, \ldots, p_{n}=\left(\frac{1}{2}\right)^{n}, \ldots$, which sums to 1.

Since $\sum_{i} p_{i}=1$, the logical entropy $h(p)=1-\sum_{i} p_{i}^{2}$ for countable distributions is always well-defined, strictly less than one, and interpretable as the two-draw probability of getting different indices $i \neq j$. However, the Shannon entropy for countable distributions $H(p)=\sum_{i} p_{i} \log_{2}\left(\frac{1}{p_{i}}\right)$ may blow up (not have a finite sum); examples are given in [23, Example 2.46, p. 30] and [6, p. 48].
For the example at hand, the sum of the probabilities squared is $\frac{1}{4}+\frac{1}{16}+\frac{1}{64}+\ldots=\left(\frac{1}{4}\right)^{1}+\left(\frac{1}{4}\right)^{2}+\left(\frac{1}{4}\right)^{3}+\ldots=\frac{1 / 4}{1-1 / 4}=\frac{1}{3}$ (the area of the diagonal squares in the box diagram of Fig. 4.7) so the logical entropy is $h(p)=1-\frac{1}{3}=\frac{2}{3}$. In the box diagram with a uniform distribution over the unit area, the logical entropy is the probability that a random point in the square lies off the boxed diagonal.
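The geometric-series value can be checked numerically by truncating the sum of squares after finitely many terms. The sketch below is only an illustration; `truncated_logical_entropy` is a name introduced here, and it returns one minus the partial sum of squares, which decreases toward $\frac{2}{3}$ as more terms are included.

```python
from fractions import Fraction


def truncated_logical_entropy(n_terms):
    """1 minus the sum of p_n^2 for n = 1..n_terms, with p_n = (1/2)^n.

    This is an upper approximation of h(p): each extra term of the
    series removes one more diagonal square from the unit box.
    """
    sum_of_squares = sum(Fraction(1, 2**n) ** 2 for n in range(1, n_terms + 1))
    return 1 - sum_of_squares


for n_terms in (1, 2, 5, 10, 20):
    approx = truncated_logical_entropy(n_terms)
    gap = approx - Fraction(2, 3)  # distance from the limiting value 2/3
    print(n_terms, float(approx), float(gap))
```

Since the partial sum of the geometric series is $\frac{1}{3}\left(1-4^{-N}\right)$ after $N$ terms, the gap printed in the last column equals $\frac{1}{3 \cdot 4^{N}}$, so the approximation converges very quickly.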

