### The Compound Notions for Logical

## The Dit-Bit Transform

The logical entropy formulas for the compound notions (e.g., conditional entropy, mutual information, and joint entropy) stand in Venn diagram relationships because logical entropy is a measure. The Shannon entropy formulas for these compound notions, e.g., $H(\alpha, \beta)=H(\alpha)+H(\beta)-I(\alpha, \beta)$ [1], are defined so as to satisfy the Venn diagram relationships. A deeper link, the dit-bit transform, helps to explain the connection between the two entropies. This transform can be heuristically motivated by considering two ways to treat the standard set $U_{n}$ of $n$ elements with the equal probabilities $p_{0}=\frac{1}{n}$. In that basic case of an equiprobable set, we can derive the dit-bit connection, and then, by taking a probabilistic average, we can develop the Shannon entropy, expressed in bits, from the logical entropy, expressed in (normalized) dits.

Given $U_{n}$ with $n$ equiprobable elements, the number of dits (of the discrete partition on $U_{n}$ ) is $n^{2}-n$ so the normalized dit count is:
$$h\left(p_{0}\right)=h\left(\frac{1}{n}\right)=\frac{n^{2}-n}{n^{2}}=1-\frac{1}{n}=1-p_{0} \text { normalized dits. }$$

That is the (normalized) dit-count or logical measure of the information in a set of $n$ distinct elements (think of it as the logical entropy of the discrete partition on $U_{n}$ with equiprobable elements).
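
The count above can be checked by brute force. The following is a minimal sketch, assuming $U_{n}$ is modeled as `range(n)` and a dit is an ordered pair of distinct elements; the function name `normalized_dit_count` is illustrative and not from the text.

```python
from fractions import Fraction
from itertools import product


def normalized_dit_count(n: int) -> Fraction:
    """Count ordered pairs (u, v) with u != v in U_n, divided by n^2."""
    universe = range(n)
    dits = sum(1 for u, v in product(universe, repeat=2) if u != v)
    return Fraction(dits, n * n)


if __name__ == "__main__":
    for n in (2, 4, 8):
        # Both columns should agree: enumerated count vs. 1 - 1/n.
        print(n, normalized_dit_count(n), 1 - Fraction(1, n))
```

Exact fractions are used so that the comparison with $1-\frac{1}{n}$ is not blurred by floating-point rounding.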

But we can also measure the information in the set by the number of binary partitions it takes (on average) to distinguish the elements, and that bit-count is [5]:
$$H\left(p_{0}\right)=H\left(\frac{1}{n}\right)=\log (n)=\log \left(\frac{1}{p_{0}}\right) \text { bits. }$$
Shannon-Hartley entropy for an equiprobable set $U_{n}$ of $n$ elements.

The dit-bit connection is that the Shannon-Hartley entropy $H\left(p_{0}\right)=\log \left(\frac{1}{p_{0}}\right)$ plays the same role in the Shannon formulas that $h\left(p_{0}\right)=1-p_{0}$ plays in the logical entropy formulas, when both are formulated as probabilistic averages or expectations.

The common thing being measured is an equiprobable $U_{n}$ where $n=\frac{1}{p_{0}}$. The dit-count for $U_{n}$ is $h\left(p_{0}\right)=1-p_{0}$ and the bit-count for $U_{n}$ is $H\left(p_{0}\right)=\log \left(\frac{1}{p_{0}}\right)$, and the dit-bit transform converts one count into the other. The transform should not be read as a mere change of units, like converting a length from centimeters to inches, since it is highly nonlinear.
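
To make the nonlinearity concrete, here is a small sketch that tabulates both counts for a few values of $n$. It assumes base-2 logarithms (so that the unit is bits); the helper names `dit_count`, `bit_count`, and `dits_to_bits` are illustrative only.

```python
import math


def dit_count(p0: float) -> float:
    """Normalized dit count h(p0) = 1 - p0."""
    return 1.0 - p0


def bit_count(p0: float) -> float:
    """Bit count H(p0) = log2(1/p0); base 2 is an assumption."""
    return math.log2(1.0 / p0)


def dits_to_bits(h: float) -> float:
    """Dit-bit transform on the equiprobable case: h = 1 - p0 -> log2(1/p0).

    Requires 0 <= h < 1, since p0 = 1 - h must be positive.
    """
    return math.log2(1.0 / (1.0 - h))


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        p0 = 1.0 / n
        print(n, dit_count(p0), bit_count(p0), dits_to_bits(dit_count(p0)))
```

Each doubling of $n$ adds one bit, while the dit count rises by ever smaller steps toward 1, so no constant conversion factor relates the two.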

We start with the logical entropy of a probability distribution $p=\left\{p_{1}, \ldots, p_{n}\right\}$:
$$h(p)=\sum_{i=1}^{n} p_{i} h\left(p_{i}\right)=\sum_{i} p_{i}\left(1-p_{i}\right)$$
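
The averaging form of $h(p)$ can be sketched in code. The helper `average` below is a hypothetical name; it takes the per-probability term as a parameter, so swapping $1-p_{i}$ for $\log \left(\frac{1}{p_{i}}\right)$ (base 2 assumed) gives the familiar Shannon formula $\sum_{i} p_{i} \log \left(\frac{1}{p_{i}}\right)$.

```python
import math
from typing import Callable, Sequence


def average(p: Sequence[float], term: Callable[[float], float]) -> float:
    """Probabilistic average sum_i p_i * term(p_i), skipping zero entries."""
    return sum(pi * term(pi) for pi in p if pi > 0)


def logical_entropy(p: Sequence[float]) -> float:
    """h(p) = sum_i p_i (1 - p_i)."""
    return average(p, lambda q: 1.0 - q)


def shannon_entropy(p: Sequence[float]) -> float:
    """H(p) = sum_i p_i log2(1/p_i), the termwise dit-bit transform of h(p)."""
    return average(p, lambda q: math.log2(1.0 / q))


if __name__ == "__main__":
    p = [0.5, 0.25, 0.25]
    print(logical_entropy(p), shannon_entropy(p))
```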

## Conditional Logical Entropy

All the compound notions for Shannon and logical entropy could be developed using either partitions (with point probabilities) or probability distributions of random variables as the given data. Since the treatment of Shannon entropy is most often in terms of probability distributions, we will stick to that case for both types of entropy. The formula for the compound notion of logical entropy will be developed first, and then the formula for the corresponding Shannon compound entropy will be obtained by the dit-bit transform.

The general idea of a conditional entropy of a random variable $X$ given a random variable $Y$ is to measure the information in $X$ when we take away the information contained in $Y$, i.e., the set difference operation in terms of information sets.

For the definition of the conditional entropy $h(X \mid Y)$, we simply take the product measure of the set of pairs $(x, y)$ and $\left(x^{\prime}, y^{\prime}\right)$ that give an $X$-distinction but not a $Y$-distinction. Hence we use the inequation $x \neq x^{\prime}$ for the $X$-distinction and negate the $Y$-distinction $y \neq y^{\prime}$ to get the infoset that is the difference of the infosets for $X$ and $Y$ :
$$\begin{gathered} S_{X \wedge \neg Y}=\left\{\left((x, y),\left(x^{\prime}, y^{\prime}\right)\right): x \neq x^{\prime} \wedge y=y^{\prime}\right\}=S_{X}-S_{Y} \text { so } \\ h(X \mid Y)=\mu\left(S_{X \wedge \neg Y}\right)=\mu\left(S_{X}-S_{Y}\right) \end{gathered}$$

Since $S_{X \vee Y}$ can be expressed as the disjoint union $S_{X \vee Y}=S_{X \wedge \neg Y} \uplus S_{Y}$, we have for the measure $\mu$ :
$$h(X, Y)=\mu\left(S_{X \vee Y}\right)=\mu\left(S_{X \wedge \neg Y}\right)+\mu\left(S_{Y}\right)=h(X \mid Y)+h(Y),$$
which is illustrated in the Venn diagram Fig. 3.1.
In terms of the probabilities:
$$\begin{aligned} h(X \mid Y) &=h(X, Y)-h(Y)=\sum_{x, y} p(x, y)(1-p(x, y))-\sum_{y} p(y)(1-p(y)) \\ &=\sum_{x, y} p(x, y)[(1-p(x, y))-(1-p(y))] \end{aligned}$$
Logical conditional entropy of $X$ given $Y$.
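
The two routes to $h(X \mid Y)$, the product measure of the infoset $S_{X}-S_{Y}$ and the probability formula, can be compared on a toy example. This is a sketch under stated assumptions: the joint distribution `joint` is made up for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Illustrative joint distribution p(x, y); not taken from the text.
joint = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 4)}


def measure(joint, predicate):
    """Product measure of the set of pairs ((x,y),(x',y')) satisfying predicate."""
    return sum(
        p1 * p2
        for (a, p1), (b, p2) in product(joint.items(), repeat=2)
        if predicate(a, b)
    )


def h_cond_by_infoset(joint):
    """mu(S_X - S_Y): pairs with x != x' and y == y'."""
    return measure(joint, lambda a, b: a[0] != b[0] and a[1] == b[1])


def marginal_y(joint):
    py = {}
    for (_, y), p in joint.items():
        py[y] = py.get(y, 0) + p
    return py


def h_cond_by_formula(joint):
    """sum_{x,y} p(x,y) [(1 - p(x,y)) - (1 - p(y))]."""
    py = marginal_y(joint)
    return sum(p * ((1 - p) - (1 - py[y])) for (_, y), p in joint.items())


if __name__ == "__main__":
    h_joint = measure(joint, lambda a, b: a[0] != b[0] or a[1] != b[1])
    h_y = measure(joint, lambda a, b: a[1] != b[1])
    print(h_cond_by_infoset(joint), h_cond_by_formula(joint), h_joint - h_y)
```

For this distribution all three printed values coincide, which is the additivity $h(X, Y)=h(X \mid Y)+h(Y)$ in numerical form.
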
Also of interest is the:
$$d(X, Y)=h(X \mid Y)+h(Y \mid X)=\mu\left(S_{X} \Delta S_{Y}\right)$$
Logical distance between random variables
where $\Delta$ is the symmetric difference (or inequivalence) operation on sets. This logical distance is a Hamming-style distance function [4, p. 66] based on the difference between the random variables.
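
As a rough check of the distance formula, the sketch below computes $d(X, Y)$ both as $h(X \mid Y)+h(Y \mid X)$ and as the product measure of the symmetric difference, on an illustrative joint distribution chosen for this example (the data and helper names are assumptions, not from the text).

```python
from fractions import Fraction
from itertools import product

# Illustrative joint distribution p(x, y); not taken from the text.
joint = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 4)}


def measure(joint, predicate):
    """Product measure of the set of pairs ((x,y),(x',y')) satisfying predicate."""
    return sum(
        p1 * p2
        for (a, p1), (b, p2) in product(joint.items(), repeat=2)
        if predicate(a, b)
    )


def h_x_given_y(joint):
    """mu(S_X - S_Y): an X-distinction without a Y-distinction."""
    return measure(joint, lambda a, b: a[0] != b[0] and a[1] == b[1])


def h_y_given_x(joint):
    """mu(S_Y - S_X): a Y-distinction without an X-distinction."""
    return measure(joint, lambda a, b: a[1] != b[1] and a[0] == b[0])


def logical_distance(joint):
    """mu(S_X symmetric-difference S_Y): exactly one of the two distinctions holds."""
    return measure(joint, lambda a, b: (a[0] != b[0]) != (a[1] != b[1]))


if __name__ == "__main__":
    print(h_x_given_y(joint) + h_y_given_x(joint), logical_distance(joint))
```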

## Shannon Conditional Entropy

Given the joint distribution $\{p(x, y)\}$ on $X \times Y$, the conditional probability distribution for a specific $y_{0} \in Y$ is $p\left(x \mid y_{0}\right)=\frac{p\left(x, y_{0}\right)}{p\left(y_{0}\right)}$, which has the Shannon entropy $H\left(X \mid y_{0}\right)=\sum_{x} p\left(x \mid y_{0}\right) \log \left(\frac{1}{p\left(x \mid y_{0}\right)}\right)$. Then the Shannon conditional entropy $H(X \mid Y)$ is usually defined as the average of these entropies:
$$H(X \mid Y)=\sum_{y} p(y) \sum_{x} \frac{p(x, y)}{p(y)} \log \left(\frac{p(y)}{p(x, y)}\right)=\sum_{x, y} p(x, y) \log \left(\frac{p(y)}{p(x, y)}\right)$$
Shannon conditional entropy of $X$ given $Y$.
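
The definition as an average of the entropies $H\left(X \mid y_{0}\right)$ can be sketched as follows, alongside the collapsed single-sum form. The joint distribution is an illustrative assumption, base-2 logarithms are assumed, and the function names are hypothetical.

```python
import math

# Illustrative joint distribution p(x, y); not taken from the text.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}


def marginal_y(joint):
    py = {}
    for (_, y), p in joint.items():
        py[y] = py.get(y, 0.0) + p
    return py


def entropy_given_y(joint, y0):
    """H(X | y0) = sum_x p(x|y0) log2(1 / p(x|y0))."""
    py0 = marginal_y(joint)[y0]
    total = 0.0
    for (_, y), p in joint.items():
        if y == y0 and p > 0:
            cond = p / py0
            total += cond * math.log2(1.0 / cond)
    return total


def conditional_as_average(joint):
    """H(X|Y) = sum_y p(y) H(X | y)."""
    py = marginal_y(joint)
    return sum(py[y] * entropy_given_y(joint, y) for y in py)


def conditional_direct(joint):
    """H(X|Y) = sum_{x,y} p(x,y) log2(p(y) / p(x,y))."""
    py = marginal_y(joint)
    return sum(p * math.log2(py[y] / p) for (_, y), p in joint.items() if p > 0)


if __name__ == "__main__":
    print(conditional_as_average(joint), conditional_direct(joint))
```
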

All the Shannon notions can be obtained by the dit-bit transform of the corresponding logical notions. Applying the transform $1-p \leadsto \log \left(\frac{1}{p}\right)$ to the logical conditional entropy expressed as an average of "$1-p$" expressions, $h(X \mid Y)=\sum_{x, y} p(x, y)[(1-p(x, y))-(1-p(y))]$, yields the Shannon conditional entropy:
$$\begin{aligned} H(X \mid Y) &=\sum_{x, y} p(x, y)\left[\log \left(\frac{1}{p(x, y)}\right)-\log \left(\frac{1}{p(y)}\right)\right] \\ &=\sum_{x, y} p(x, y) \log \left(\frac{p(y)}{p(x, y)}\right) \end{aligned}$$

Since the dit-bit transform preserves sums and differences, we will have the same sort of Venn diagram formula for the Shannon entropies and this can be illustrated in the analogous “mnemonic” Venn diagram Fig. 3.2.
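
One way to see why the Venn diagram formula carries over is to write the compound formula once, with the per-probability term left as a parameter. The sketch below does this under stated assumptions: the joint distribution is illustrative, base-2 logarithms are assumed, and `conditional`, `plain`, `dit_term`, and `bit_term` are hypothetical names.

```python
import math

# Illustrative joint distribution p(x, y); not taken from the text.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}


def marginal_y(joint):
    py = {}
    for (_, y), p in joint.items():
        py[y] = py.get(y, 0.0) + p
    return py


def plain(probabilities, term):
    """Entropy as an average: sum_i p_i * term(p_i)."""
    return sum(p * term(p) for p in probabilities if p > 0)


def conditional(joint, term):
    """sum_{x,y} p(x,y) [term(p(x,y)) - term(p(y))]."""
    py = marginal_y(joint)
    return sum(p * (term(p) - term(py[y])) for (_, y), p in joint.items() if p > 0)


def dit_term(p):
    return 1.0 - p  # logical entropy term


def bit_term(p):
    return math.log2(1.0 / p)  # Shannon term after the dit-bit transform


if __name__ == "__main__":
    for name, term in (("logical", dit_term), ("Shannon", bit_term)):
        joint_entropy = plain(joint.values(), term)
        y_entropy = plain(marginal_y(joint).values(), term)
        # conditional == joint - marginal holds for either choice of term
        print(name, conditional(joint, term), joint_entropy - y_entropy)
```

The identity holds for any term function because $\sum_{x} p(x, y)=p(y)$, which is the sense in which the transform preserves sums and differences.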

