## CSE208 information theory课程简介

Information theory is the branch of mathematics that deals with the quantification, storage, and communication of information. It was developed by Claude Shannon in the 1940s, and has applications in a wide range of fields, including communications, computer science, physics, and biology.

The basic idea in information theory is that information can be thought of as a reduction in uncertainty. The more uncertain we are about something, the more information we gain when we learn about it. For example, if I tell you that the weather tomorrow will be either sunny or rainy, and you don’t know which one, then your uncertainty about the weather is high. If I then tell you that the weather will be sunny, then your uncertainty is reduced, and you gain some information.

## PREREQUISITES

Information theory is the branch of mathematics that deals with the quantification, storage, and communication of information. It was developed by Claude Shannon in the 1940s, and has applications in a wide range of fields, including communications, computer science, physics, and biology.

The basic idea in information theory is that information can be thought of as a reduction in uncertainty. The more uncertain we are about something, the more information we gain when we learn about it. For example, if I tell you that the weather tomorrow will be either sunny or rainy, and you don’t know which one, then your uncertainty about the weather is high. If I then tell you that the weather will be sunny, then your uncertainty is reduced, and you gain some information.

## CSE208 information theory HELP（EXAM HELP， ONLINE TUTOR）

1. Coin flips. A fair coin is flipped until the first head occurs. Let $X$ denote the number of flips required.
(a) Find the entropy $H(X)$ in bits. The following expressions may be useful:
$$\sum_{n=1}^{\infty} r^n=\frac{r}{1-r}, \quad \sum_{n=1}^{\infty} n r^n=\frac{r}{(1-r)^2} .$$
(b) A random variable $X$ is drawn according to this distribution. Find an “efficient” sequence of yes-no questions of the form, “Is $X$ contained in the set $S$ ?” Compare $H(X)$ to the expected number of questions required to determine $X$.

(a) The number $X$ of tosses till the first head appears has the geometric distribution with parameter $p=1 / 2$, where $P(X=n)=p q^{n-1}, n \in{1,2, \ldots}$. Hence the entropy of $X$ is
\begin{aligned} H(X) & =-\sum_{n=1}^{\infty} p q^{n-1} \log \left(p q^{n-1}\right) \ & =-\left[\sum_{n=0}^{\infty} p q^n \log p+\sum_{n=0}^{\infty} n p q^n \log q\right] \ & =\frac{-p \log p}{1-q}-\frac{p q \log q}{p^2} \ & =\frac{-p \log p-q \log q}{p} \ & =H(p) / p \text { bits. } \end{aligned}
If $p=1 / 2$, then $H(X)=2$ bits.

(b) Intuitively, it seems clear that the best questions are those that have equally likely chances of receiving a yes or a no answer. Consequently, one possible guess is that the most “efficient” series of questions is: Is $X=1$ ? If not, is $X=2$ ? If not, is $X=3$ ? … with a resulting expected number of questions equal to $\sum_{n=1}^{\infty} n\left(1 / 2^n\right)=2$. This should reinforce the intuition that $H(X)$ is a measure of the uncertainty of $X$. Indeed in this case, the entropy is exactly the same as the average number of questions needed to define $X$, and in general $E$ (# of questions) $\geq H(X)$. This problem has an interpretation as a source coding problem. Let $0=$ no, $1=$ yes, $X=$ Source, and $Y=$ Encoded Source. Then the set of questions in the above procedure can be written as a collection of $(X, Y)$ pairs: $(1,1),(2,01),(3,001)$, etc. . In fact, this intuitively derived code is the optimal (Huftman) code minimizing the expected number of questions.

1. Entropy of functions. Let $X$ be a random variable taking on a finite number of values. What is the (general) inequality relationship of $H(X)$ and $H(Y)$ if
(a) $Y=2^X$ ?
(b) $Y=\cos X$ ?

Solution: Let $y=g(x)$. Then
$$p(y)=\sum_{x: y=g(x)} p(x)$$
Consider any set of $x$ ‘s that map onto a single $y$. For this set
$$\sum_{x: y=g(x)} p(x) \log p(x) \leq \sum_{x: y=g(x)} p(x) \log p(y)=p(y) \log p(y),$$
since $\log$ is a monotone increasing function and $p(x) \leq \sum_{x: y=g(x)} p(x)=p(y)$. Extending this argument to the entire range of $X$ (and $Y$ ), we obtain
\begin{aligned} H(X) & =-\sum_x p(x) \log p(x) \ & =-\sum_y \sum_{x: y=g(x)} p(x) \log p(x) \ & \geq-\sum_y p(y) \log p(y) \ & =H(Y) \end{aligned}
with equality iff $g$ is one-to-one with probability one.
(a) $Y=2^X$ is one-to-one and hence the entropy, which is just a function of the probabilities (and not the values of a random variable) does not change, i.e., $H(X)=H(Y)$
(b) $Y=\cos (X)$ is not necessarily one-to-one. Hence all that we can say is that $H(X) \geq H(Y)$, with equality if cosine is one-to-one on the range of $X$.

## Textbooks

• An Introduction to Stochastic Modeling, Fourth Edition by Pinsky and Karlin (freely
available through the university library here)
• Essentials of Stochastic Processes, Third Edition by Durrett (freely available through
the university library here)
To reiterate, the textbooks are freely available through the university library. Note that
you must be connected to the university Wi-Fi or VPN to access the ebooks from the library
links. Furthermore, the library links take some time to populate, so do not be alarmed if
the webpage looks bare for a few seconds.

Statistics-lab™可以为您提供ucsc.edu CSE208 information theory信息论的代写代考和辅导服务！ 请认准Statistics-lab™. Statistics-lab™为您的留学生涯保驾护航。

## CSE208 information theory课程简介

An introduction to information theory including topics such as entropy, relative entropy, mutual information, asymptotic equipartition property, channel capacity, differential entropy, rate distortion theory, and universal source coding. Prerequisite(s): Computer Engineering 107, or Applied Mathematics and Statistics 131 or equivalent course, or permission of instructor.

Information theory is a branch of mathematics that deals with the quantification, storage, and communication of information. It was first introduced by Claude Shannon in 1948 and has since found numerous applications in areas such as communication engineering, computer science, statistics, and physics.

## PREREQUISITES

Some of the key concepts and topics covered in an introduction to information theory course are:

1. Entropy: Entropy is a measure of the uncertainty or randomness in a random variable. It is a fundamental concept in information theory and has applications in fields such as cryptography, data compression, and statistical mechanics.
2. Relative entropy: Also known as Kullback-Leibler divergence, relative entropy measures the difference between two probability distributions. It is widely used in machine learning, data science, and statistical inference.
3. Mutual information: Mutual information measures the amount of information that two random variables share. It is a key concept in signal processing, machine learning, and neuroscience.
4. Asymptotic equipartition property (AEP): AEP states that for a large number of independent and identically distributed random variables, the probability of the sequence being close to its expected value approaches one.
5. Channel capacity: Channel capacity is the maximum rate at which information can be reliably transmitted through a communication channel. It is a fundamental concept in communication engineering and has applications in wireless communication, data transmission, and coding theory.
6. Differential entropy: Differential entropy is a measure of the amount of information in a continuous random variable. It is closely related to entropy and has applications in fields such as signal processing, machine learning, and image processing.
7. Rate distortion theory: Rate distortion theory studies the problem of lossy data compression, where the goal is to compress data while minimizing the distortion between the original and compressed signals.
8. Universal source coding: Universal source coding studies the problem of lossless data compression, where the goal is to compress data without losing any information.

Overall, an introduction to information theory provides students with a solid foundation in the mathematical theory of information and its applications in a wide range of fields.

## CSE208 information theory HELP（EXAM HELP， ONLINE TUTOR）

This is an 80 minute exam. You may have an additional 100 minutes to answer the following five questions in the notebooks provided. You are permitted 2 double-sided sheet of notes. Make sure that you have included your name, ID number (last 4 digits only) and signature in each book used ( 5 points). Read each question carefully. All statements must be justified. Computations should be simplified as much as possible.

1. 20 points Let $Y=g(X)$ be deterministic function of discrete random variable $X$.
(a) 5 points Give an example of a random variable $X$ and a function $Y=g(X)$ such that $H(Y)<$ $H(X)$.
(b) 5 points Give an example of a random variable $X$ and a function $Y=g(X)$ such that $H(Y)=$ $H(X)$
(c) 10 points Either give an example of a random variable $X$ and a function $Y=g(X)$ such that $H(Y) \geq H(X)$ or prove that $H(Y) \leq H(X)$. Hint: look at $H(X, Y)$.

(a) Let $X$ be a random variable that takes values in ${0,1}$ with equal probabilities and let $Y=X$. Then, $H(X)=H(Y)=1$, but $H(Y)=0$ since $Y$ is a constant function of $X$.

(b) Let $X$ be a random variable that takes values in ${0,1}$ with equal probabilities and let $Y=X \oplus 1$ (where $\oplus$ denotes the bitwise exclusive or operation). Then, $H(X)=H(Y)=1$, since $X$ and $Y$ are equally likely to take on either value. Furthermore, since $Y$ is a one-to-one function of $X$, $H(Y \mid X) = 0$, so by the chain rule of entropy, \begin{align*} H(X,Y) &= H(Y \mid X) + H(X) \ &= H(Y) \ &= H(X). \end{align*} Thus, $H(Y)=H(X)$.

(c) Let $X$ be a random variable that takes values in ${0,1,2}$ with probabilities $\mathbb{P}(X=0)=1/2$, $\mathbb{P}(X=1)=1/4$, and $\mathbb{P}(X=2)=1/4$. Define $Y$ by $Y=0$ if $X=0$, and $Y=1$ otherwise. Then, $H(X)=1.5$ and $H(Y)=1$, since $Y$ is a Bernoulli random variable with parameter $1/2$. However, $H(X,Y)=H(Y)+H(X\mid Y)=H(Y)+\mathbb{P}(Y=0)H(X\mid Y=0)+\mathbb{P}(Y=1)H(X\mid Y=1)=1.25+0.5+0=1.75$. Therefore, $H(Y)<H(X,Y)$, which implies $H(Y)<H(X)$.

1. 20 points Let be random variables such that
$$X \rightarrow Y \rightarrow Z, \quad Y \rightarrow Z \rightarrow X, \quad Z \rightarrow X \rightarrow Y$$
If $I(X ; Y)=3$, find $I(X ; Z)$ and $I(Y ; Z)$. Or, if these quantities cannot be known, find tight upper and lower bounds. Make sure to justify your answers.

By the chain rule of mutual information, we have \begin{align*} I(X;Y,Z) &= I(X;Z) + I(X;Y|Z) \ I(Y;X,Z) &= I(Y;Z) + I(Y;X|Z) \ I(Z;X,Y) &= I(Z;X) + I(Z;Y|X). \end{align*} Since $X\to Y\to Z$, we have $I(Y;Z|X)=0$. Also, since $Y\to Z\to X$, we have $I(Y;X|Z)=0$. Hence, we have \begin{align*} 3 &= I(X;Y) \ &= I(X;Y,Z) – I(X;Z|Y) \ &= I(X;Z) + I(X;Y|Z) – I(X;Z|Y) \ &= I(X;Z) + I(X;Y|Z,Y) \ &= I(X;Z) + I(X;Y|Z) \ &= I(X;Z) + I(X;Y,Z) – I(X;Z|Y) \ &= I(X;Z) + I(X;Y) – I(X;Z|Y). \end{align*} Thus, $I(X;Z) = 3 – I(X;Y) + I(X;Z|Y)$. Similarly, we can derive that

I(Y;Z) = 3 – I(Y;X) + I(Y;Z|X).I(Y;Z)=3−I(Y;X)+I(Y;Z∣X).

Since $I(Y;X,Z) = I(Y;X) = I(X;Y)$, we have $I(Y;Z|X) \leq I(X;Z)$. Therefore, we have

I(Y;Z) \geq 3 – I(X;Y) + I(X;Z|Y).I(Y;Z)≥3−I(X;Y)+I(X;Z∣Y).

For a lower bound of $I(X;Z)$, note that $I(X;Z|Y) \geq 0$, and hence $I(X;Z) \geq 3 – I(X;Y)$. For an upper bound of $I(X;Z)$, note that $I(X;Z) \leq \min{H(X),H(Z)}$. To see this, note that

H(X) = H(X|Z) + I(X;Z) \geq I(X;Z),H(X)=H(X∣Z)+I(X;Z)≥I(X;Z),

and similarly $H(Z) \geq I(X;Z)$. Therefore, we have

3 – I(X;Y) \leq I(X;Z) \leq \min\{H(X),H(Z)\}.3−I(X;Y)≤I(X;Z)≤min{H(X),H(Z)}.

For $I(Y;Z)$, we can use similar arguments to derive that

3 – I(X;Y) \leq I(Y;Z) \leq \min\{H(Y),H(Z)\}.3−I(X;Y)≤I(Y;Z)≤min{H(Y),H(Z)}.

## Textbooks

• An Introduction to Stochastic Modeling, Fourth Edition by Pinsky and Karlin (freely
available through the university library here)
• Essentials of Stochastic Processes, Third Edition by Durrett (freely available through
the university library here)
To reiterate, the textbooks are freely available through the university library. Note that
you must be connected to the university Wi-Fi or VPN to access the ebooks from the library
links. Furthermore, the library links take some time to populate, so do not be alarmed if
the webpage looks bare for a few seconds.

Statistics-lab™可以为您提供soe.ucsc.edu CSE208 information theory信息论课程的代写代考辅导服务！ 请认准Statistics-lab™. Statistics-lab™为您的留学生涯保驾护航。