## AM221 Convex optimization: Course Overview

This is a graduate-level course on optimization. The course covers mathematical programming and combinatorial optimization from the perspective of convex optimization, which is a central tool for solving large-scale problems. In recent years convex optimization has had a profound impact on statistical machine learning, data analysis, mathematical finance, signal processing, control, theoretical computer science, and many other areas. The first part will be dedicated to the theory of convex optimization and its direct applications. The second part will focus on advanced techniques in combinatorial optimization using machinery developed in the first part.

## PREREQUISITES

Instructor: Yaron Singer (OH: Wednesdays 4-5pm, MD239)
Teaching fellows:

• Thibaut Horel (OH: Tuesdays 4:30-5:30pm, MD’s 2nd floor lounge)
• Rajko Radovanovic (OH: Mondays 5:30-6:30pm, MD’s 2nd floor lounge)
Time: Monday & Wednesday, 2:30pm-4:00pm
Room: MD119
Sections: Wednesdays 5:30-7:00pm in MD221

## AM221 Convex optimization HELP (EXAM HELP, ONLINE TUTOR)

2.7 Voronoi description of halfspace. Let $a$ and $b$ be distinct points in $\mathbf{R}^n$. Show that the set of all points that are closer (in Euclidean norm) to $a$ than to $b$, i.e., $\left\{x \mid \|x-a\|_2 \leq \|x-b\|_2\right\}$, is a halfspace. Describe it explicitly as an inequality of the form $c^T x \leq d$. Draw a picture.

Solution. Since a norm is always nonnegative, we have $\|x-a\|_2 \leq \|x-b\|_2$ if and only if $\|x-a\|_2^2 \leq \|x-b\|_2^2$, so
$$\begin{aligned} \|x-a\|_2^2 \leq \|x-b\|_2^2 & \Longleftrightarrow (x-a)^T(x-a) \leq (x-b)^T(x-b) \\ & \Longleftrightarrow x^T x-2 a^T x+a^T a \leq x^T x-2 b^T x+b^T b \\ & \Longleftrightarrow 2(b-a)^T x \leq b^T b-a^T a . \end{aligned}$$
Therefore, the set is indeed a halfspace. We can take $c=2(b-a)$ and $d=b^T b-a^T a$. This makes good geometric sense: the points that are equidistant to $a$ and $b$ are given by a hyperplane whose normal is in the direction $b-a$.

To illustrate this, we can consider the case $n=2$. Let $a=(a_1,a_2)$ and $b=(b_1,b_2)$ be two distinct points in $\mathbf{R}^2$. Then the set of all points that are closer to $a$ than to $b$ is given by

$$\left\{(x_1,x_2)\in\mathbf{R}^2 \mid (x_1-a_1)^2+(x_2-a_2)^2\leq(x_1-b_1)^2+(x_2-b_2)^2\right\}.$$

Expanding the inequality, we get

$$2(b_1-a_1)x_1+2(b_2-a_2)x_2\leq b_1^2+b_2^2-a_1^2-a_2^2.$$

Dividing both sides by $2\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}$, which is positive because $a \neq b$, we obtain

$$\frac{b_1-a_1}{\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}}x_1+\frac{b_2-a_2}{\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}}x_2\leq\frac{b_1^2+b_2^2-a_1^2-a_2^2}{2\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}},$$

which describes a halfplane whose boundary is the perpendicular bisector of the segment connecting $a$ and $b$.

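For example, with $a=(0,0)$ and $b=(1,2)$ the halfplane is $2x_1+4x_2 \leq 5$. Its boundary line passes through the midpoint $(1/2, 1)$ of the segment from $a$ to $b$ and has normal direction $b-a=(1,2)$.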
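The equivalence between the distance condition and the linear inequality can also be checked numerically. The following is a minimal sketch in plain Python for $a=(0,0)$, $b=(1,2)$; the helper names (`halfspace_coeffs`, `closer_to_a`, `in_halfspace`) and the random sampling setup are illustrative assumptions, not part of the original solution.

```python
import math
import random

def halfspace_coeffs(a, b):
    """Return (c, d) with c = 2(b - a) and d = b^T b - a^T a."""
    c = [2.0 * (bi - ai) for ai, bi in zip(a, b)]
    d = sum(bi * bi for bi in b) - sum(ai * ai for ai in a)
    return c, d

def closer_to_a(x, a, b):
    """Distance test: ||x - a||_2 <= ||x - b||_2."""
    return math.dist(x, a) <= math.dist(x, b)

def in_halfspace(x, c, d):
    """Linear test: c^T x <= d."""
    return sum(ci * xi for ci, xi in zip(c, x)) <= d

a, b = (0.0, 0.0), (1.0, 2.0)
c, d = halfspace_coeffs(a, b)

# Compare both tests on random points, skipping points that sit
# numerically on the boundary, where rounding could flip either test.
rng = random.Random(0)
mismatches = 0
for _ in range(10_000):
    x = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
    margin = abs(sum(ci * xi for ci, xi in zip(c, x)) - d)
    if margin > 1e-9 and closer_to_a(x, a, b) != in_halfspace(x, c, d):
        mismatches += 1

print("c =", c, "d =", d, "mismatches =", mismatches)
```

The two tests should agree on every sampled point away from the boundary, which is what the chain of equivalences in the solution asserts.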

Which of the following sets $S$ are polyhedra? If possible, express $S$ in the form $S=\{x \mid A x \preceq b,\ F x=g\}$.
(a) $S=\left\{y_1 a_1+y_2 a_2 \mid -1 \leq y_1 \leq 1,\ -1 \leq y_2 \leq 1\right\}$, where $a_1, a_2 \in \mathbf{R}^n$.
(b) $S=\left\{x \in \mathbf{R}^n \mid x \succeq 0,\ \mathbf{1}^T x=1,\ \sum_{i=1}^n x_i a_i=b_1,\ \sum_{i=1}^n x_i a_i^2=b_2\right\}$, where $a_1, \ldots, a_n \in \mathbf{R}$ and $b_1, b_2 \in \mathbf{R}$.
(c) $S=\left\{x \in \mathbf{R}^n \mid x \succeq 0,\ x^T y \leq 1 \text{ for all } y \text{ with } \|y\|_2=1\right\}$.
(d) $S=\left\{x \in \mathbf{R}^n \mid x \succeq 0,\ x^T y \leq 1 \text{ for all } y \text{ with } \sum_{i=1}^n\left|y_i\right|=1\right\}$.

Solution.
(a) $S$ is a polyhedron. It is the parallelogram with corners $a_1+a_2, a_1-a_2,-a_1+a_2$, $-a_1-a_2$, as shown below for an example in $\mathbf{R}^2$.

For simplicity we assume that $a_1$ and $a_2$ are independent. We can express $S$ as the intersection of three sets:

• $S_1$ : the plane defined by $a_1$ and $a_2$
• $S_2=\left\{z+y_1 a_1+y_2 a_2 \mid a_1^T z=a_2^T z=0,\ -1 \leq y_1 \leq 1\right\}$. This is a slab parallel to $a_2$ and orthogonal to $S_1$
• $S_3=\left\{z+y_1 a_1+y_2 a_2 \mid a_1^T z=a_2^T z=0,\ -1 \leq y_2 \leq 1\right\}$. This is a slab parallel to $a_1$ and orthogonal to $S_1$
Each of these sets can be described with linear inequalities.
• $S_1$ can be described as
$$v_k^T x=0, k=1, \ldots, n-2$$
where $v_k$ are $n-2$ independent vectors that are orthogonal to $a_1$ and $a_2$ (which form a basis for the nullspace of the matrix $\left[\begin{array}{ll}a_1 & a_2\end{array}\right]^T$ ).
• Let $c_1$ be a vector in the plane defined by $a_1$ and $a_2$, and orthogonal to $a_2$. For example, we can take
$$c_1=a_1-\frac{a_1^T a_2}{\left\|a_2\right\|_2^2} a_2 .$$
Then $x \in S_2$ if and only if
$$-\left|c_1^T a_1\right| \leq c_1^T x \leq\left|c_1^T a_1\right|$$
• Similarly, let $c_2$ be a vector in the plane defined by $a_1$ and $a_2$, and orthogonal to $a_1$, e.g.,
$$c_2=a_2-\frac{a_2^T a_1}{\left\|a_1\right\|_2^2} a_1 .$$
Then $x \in S_3$ if and only if
$$-\left|c_2^T a_2\right| \leq c_2^T x \leq\left|c_2^T a_2\right|$$
Putting it all together, we can describe $S$ as the solution set of $2 n$ linear inequalities
$$\begin{aligned} v_k^T x & \leq 0, \quad k=1, \ldots, n-2 \\ -v_k^T x & \leq 0, \quad k=1, \ldots, n-2 \\ c_1^T x & \leq\left|c_1^T a_1\right| \\ -c_1^T x & \leq\left|c_1^T a_1\right| \\ c_2^T x & \leq\left|c_2^T a_2\right| \\ -c_2^T x & \leq\left|c_2^T a_2\right| . \end{aligned}$$
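The inequality description above can be tried out on a small instance. The sketch below works in $\mathbf{R}^3$, where $n-2=1$ and the single vector $v$ can be taken as the cross product of $a_1$ and $a_2$. The specific vectors, the tolerance `tol`, and the helper names are assumptions chosen for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Cross product in R^3; the result is orthogonal to both u and v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

# Illustrative (assumed) independent vectors in R^3.
a1 = [1.0, 0.0, 1.0]
a2 = [1.0, 2.0, 0.0]

# v spans the orthogonal complement of span{a1, a2}.
v = cross(a1, a2)

# c1 lies in the plane and is orthogonal to a2; c2 is orthogonal to a1.
c1 = [p - dot(a1, a2) / dot(a2, a2) * q for p, q in zip(a1, a2)]
c2 = [q - dot(a2, a1) / dot(a1, a1) * p for p, q in zip(a1, a2)]

def in_S(x, tol=1e-9):
    """Test the 2n = 6 inequalities (the equality v^T x = 0 is two of them)."""
    return (abs(dot(v, x)) <= tol
            and abs(dot(c1, x)) <= abs(dot(c1, a1)) + tol
            and abs(dot(c2, x)) <= abs(dot(c2, a2)) + tol)

def combo(y1, y2):
    """The point y1*a1 + y2*a2."""
    return [y1 * p + y2 * q for p, q in zip(a1, a2)]

print(in_S(combo(1.0, -1.0)))   # a corner of the parallelogram
print(in_S(combo(1.5, 0.0)))    # violates -1 <= y1 <= 1
print(in_S([0.1 * vi for vi in v]))  # off the plane spanned by a1, a2
```

Points of the form $y_1 a_1 + y_2 a_2$ pass the test exactly when both coefficients lie in $[-1,1]$, and any point with a component along $v$ is rejected by the equality constraints.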

## Textbooks

• An Introduction to Stochastic Modeling, Fourth Edition by Pinsky and Karlin (freely
available through the university library here)
• Essentials of Stochastic Processes, Third Edition by Durrett (freely available through
the university library here)
To reiterate, the textbooks are freely available through the university library. Note that
you must be connected to the university Wi-Fi or VPN to access the ebooks from the library
links. Furthermore, the library links take some time to populate, so do not be alarmed if
the webpage looks bare for a few seconds.

## AM221 Convex optimization HELP (EXAM HELP, ONLINE TUTOR)

Bi-criterion optimization. Figure 4.11 shows the optimal trade-off curve and the set of achievable values for the bi-criterion optimization problem
$$\text{minimize (w.r.t. } \mathbf{R}_{+}^2\text{)} \quad \left(\|A x-b\|_2^2,\ \|x\|_2^2\right)$$
for some $A \in \mathbf{R}^{100 \times 10}$, $b \in \mathbf{R}^{100}$. Answer the following questions using information from the plot. We denote by $x_{\mathrm{ls}}$ the solution of the least-squares problem
$$\text{minimize } \|A x-b\|_2^2$$
(a) What is $\left\|x_{\mathrm{ls}}\right\|_2$?
(b) What is $\left\|A x_{\mathrm{ls}}-b\right\|_2$?
(c) What is $\|b\|_2$?

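(a) The least-squares solution $x_{\mathrm{ls}}$ minimizes $\|A x-b\|_2^2$, so it corresponds to the endpoint of the trade-off curve with the smallest value of $\|A x-b\|_2^2$. From the plot, the value of $\|x\|_2^2$ at this point is approximately 1.0. Therefore, we have $\left\|x_{\mathrm{ls}}\right\|_2 \approx \sqrt{1.0} = 1.0$.

(b) At the same point, the value of $\|A x-b\|_2^2$ is approximately 5.0. Therefore, we have $\left\|A x_{\mathrm{ls}}-b\right\|_2 \approx \sqrt{5.0} \approx 2.236$.

(c) The point $x=0$ is Pareto optimal, since it is the unique minimizer of $\|x\|_2^2$, and its objective value is $\left(\|b\|_2^2, 0\right)$. So $\|b\|_2^2$ is the value of $\|A x-b\|_2^2$ at the endpoint of the trade-off curve where $\|x\|_2^2=0$, and $\|b\|_2$ is its square root.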

Self-concordance and negative entropy.
(a) Show that the negative entropy function $x \log x$ (on $\mathbf{R}_{++}$) is not self-concordant.
(b) Show that for any $t>0$, $t x \log x-\log x$ is self-concordant (on $\mathbf{R}_{++}$).

(a) To show that the negative entropy function $x \log x$ is not self-concordant, we show that it violates the self-concordance inequality

$$\left|f'''(x)\right| \leq 2 f''(x)^{3/2} \quad \text{for all } x>0.$$

Let $f(x)=x \log x$. Then $f'(x)=\log x+1$ and $f''(x)=\frac{1}{x}$. Taking the third derivative, we have:

$$f'''(x)=-\frac{1}{x^2}$$

Substituting this into the self-concordance inequality, we get:

$$\frac{1}{x^2} \leq 2\left(\frac{1}{x}\right)^{3/2}=\frac{2}{x^{3/2}}$$

Multiplying both sides by $x^2$, we get:

$$1 \leq 2 \sqrt{x}$$

Squaring both sides, we get:

$$1 \leq 4 x$$

This inequality fails for every $x$ with $0<x<1/4$. Equivalently, the ratio $\left|f'''(x)\right| / f''(x)^{3/2}=1/\sqrt{x}$ is unbounded as $x \rightarrow 0$. Therefore, the negative entropy function $x \log x$ is not self-concordant.
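As a numerical sanity check on the two functions in this exercise, one can evaluate the ratio $|f'''(x)| / \left(2 f''(x)^{3/2}\right)$, which must be at most 1 everywhere for a self-concordant function. The sketch below uses hand-computed derivatives: $f''=1/x$, $f'''=-1/x^2$ for $x\log x$, and $f''=t/x+1/x^2$, $f'''=-t/x^2-2/x^3$ for $tx\log x-\log x$. The grid and the values of $t$ are arbitrary choices for illustration.

```python
def sc_ratio(d2, d3, x):
    """|f'''(x)| / (2 f''(x)^(3/2)); self-concordance requires this <= 1."""
    return abs(d3(x)) / (2.0 * d2(x) ** 1.5)

# Derivatives of f(x) = x log x.
def entropy_d2(x):
    return 1.0 / x

def entropy_d3(x):
    return -1.0 / x ** 2

# Derivatives of f(x) = t x log x - log x for a given t > 0.
def combined_derivs(t):
    def d2(x):
        return t / x + 1.0 / x ** 2
    def d3(x):
        return -t / x ** 2 - 2.0 / x ** 3
    return d2, d3

grid = [10.0 ** k for k in range(-3, 4)]

# For x log x the ratio equals 1 / (2 sqrt(x)), which exceeds 1 when x < 1/4.
entropy_ratios = [sc_ratio(entropy_d2, entropy_d3, x) for x in grid]

# For t x log x - log x the ratio stays at or below 1 on the whole grid.
combined_max = max(
    sc_ratio(*combined_derivs(t), x) for t in (0.1, 1.0, 10.0) for x in grid
)

print(entropy_ratios)
print(combined_max)
```

A finite grid cannot prove self-concordance, but it does exhibit the violation for $x \log x$ near zero and is consistent with the claim in part (b).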

(b) To show that $t x \log x-\log x$ is self-concordant, we need to show that it satisfies the self-concordance inequality. Let $f(x)=t x \log x-\log x$ with $t>0$. Then

$$f'(x)=t(\log x+1)-\frac{1}{x}, \qquad f''(x)=\frac{t}{x}+\frac{1}{x^2}, \qquad f'''(x)=-\frac{t}{x^2}-\frac{2}{x^3}.$$

Since $f''(x)>0$ for all $x>0$, $f$ is convex. The self-concordance inequality is

$$\left|f'''(x)\right| \leq 2 f''(x)^{3/2} \quad \text{for all } x>0.$$

Substituting the derivatives, we get:

$$\frac{t x+2}{x^3} \leq 2\left(\frac{t x+1}{x^2}\right)^{3/2}=\frac{2(t x+1)^{3/2}}{x^3}$$

Multiplying both sides by $x^3>0$, we get:

$$t x+2 \leq 2(t x+1)^{3/2}$$

Writing $s=\sqrt{t x+1}$, so that $s \geq 1$ and $t x+2=s^2+1$, the inequality becomes $s^2+1 \leq 2 s^3$. This holds because

$$2 s^3-s^2-1=(s-1)\left(2 s^2+s+1\right) \geq 0 \quad \text{for } s \geq 1.$$

Therefore, $t x \log x-\log x$ is self-concordant on $\mathbf{R}_{++}$ for any $t>0$.

## AM221 Convex optimization HELP (EXAM HELP, ONLINE TUTOR)

Problem 1 (20pts). Let $A \in \mathbb{R}^{m \times n}$ be given. We are interested in finding a vector $x \in \mathbb{R}_{+}^n$ such that $A x=\mathbf{0}$ and the number of positive components of $x$ is maximized. Formulate this problem as a linear program. Justify your answer.

We want to find a vector $x\in\mathbb{R}_+^n$ such that $Ax=\mathbf{0}$ and the number of positive components of $x$ is maximized. The number of positive components is not a linear function of $x$, so we introduce auxiliary variables $z \in \mathbb{R}^n$ and solve the linear program

$$\begin{aligned} \text{maximize} \quad & \sum_{i=1}^{n} z_i \\ \text{subject to} \quad & A x = \mathbf{0} \\ & x_i \geq z_i \quad (i=1,\dots,n) \\ & 0 \leq z_i \leq 1 \quad (i=1,\dots,n) \end{aligned}$$

Here, $z_i$ is a continuous variable in $[0,1]$ that can be positive only if $x_i$ is positive. The objective and all constraints are linear, so this is a linear program. No binary constraint $z_i \in \{0,1\}$ is needed; such a constraint would turn the problem into an integer program. Note that $x \geq z \geq 0$ already implies $x \geq 0$.

To see that this is a valid formulation, let $p^\star$ be the maximum number of positive components over all $x \geq 0$ with $Ax=\mathbf{0}$. First, the optimal value of the linear program is at most $p^\star$: for any feasible $(x, z)$, $z_i>0$ implies $x_i>0$, and $z_i \leq 1$, so $\sum_i z_i$ is at most the number of positive components of $x$, which is at most $p^\star$. Second, the value $p^\star$ is attained: take $\bar{x} \geq 0$ with $A\bar{x}=\mathbf{0}$ and $p^\star$ positive components. The set $\{x \geq 0 \mid Ax=\mathbf{0}\}$ is closed under multiplication by positive scalars, so we can scale $\bar{x}$ until every positive component is at least $1$, and then set $z_i=1$ if $\bar{x}_i>0$ and $z_i=0$ otherwise. This pair is feasible with objective value $p^\star$. Hence the optimal value is $p^\star$, and for any optimal $(x, z)$ the vector $x$ has at least $\sum_i z_i=p^\star$ positive components, so it solves the original problem.
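One way to see why a purely linear formulation with constraints $Ax=\mathbf{0}$, $x \geq z$, $0 \leq z \leq 1$ can count positive components is a scaling argument: any $x \geq 0$ with $Ax=\mathbf{0}$ can be rescaled so that its positive entries are at least 1, after which the 0/1 indicator of its support is a feasible $z$. The sketch below carries out that construction on a small example. The function name `scaled_certificate`, the tolerance, and the example data are hypothetical, and this is not an LP solver.

```python
def matvec(A, x):
    """Plain-Python matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def scaled_certificate(x, tol=1e-12):
    """Given x >= 0, return (x_scaled, z) where x_scaled is a positive
    multiple of x whose positive entries are all >= 1, and z is the
    0/1 indicator of the support of x."""
    positives = [xi for xi in x if xi > tol]
    if not positives:
        return list(x), [0.0] * len(x)
    smallest = min(positives)
    x_scaled = [xi / smallest for xi in x]          # scaling keeps A x = 0
    z = [1.0 if xi > tol else 0.0 for xi in x]      # z_i = 1 iff x_i > 0
    return x_scaled, z

# Assumed example: A x = 0 with x >= 0 and two positive components.
A = [[1.0, -2.0, 0.0]]
x = [0.5, 0.25, 0.0]

x_scaled, z = scaled_certificate(x)
residual = matvec(A, x_scaled)
feasible = all(xs >= zi for xs, zi in zip(x_scaled, z)) and all(
    0.0 <= zi <= 1.0 for zi in z
)
print(x_scaled, z, residual, feasible, sum(z))
```

The objective value `sum(z)` equals the number of positive components of the original `x`, which is the quantity the linear program maximizes.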

Problem 2 (25pts).
(a) (15pts). Let $S=\left\{x \in \mathbb{R}^n: x^T A x+b^T x+c \leq 0\right\}$, where $A \in \mathcal{S}^n, b \in \mathbb{R}^n$, and $c \in \mathbb{R}$ are given. Show that $S$ is convex if $A \succeq \mathbf{0}$. Is the converse true? Explain.
(b) (10pts). Is the set $S=\left\{X \in \mathcal{S}^n: \lambda_{\max }(X) \geq 1\right\}$ convex? Justify your answer.

(a) Let $f(x)=x^T A x+b^T x+c$, so that $S=\{x \in \mathbb{R}^n: f(x) \leq 0\}$. To show that $S$ is convex, we need to show that for any $x, y \in S$ and $\lambda \in [0,1]$, we have $\lambda x+(1-\lambda) y \in S$. That is, we need to show that $f(\lambda x+(1-\lambda) y) \leq 0$.

Expanding the quadratic term, we have
$$\begin{aligned} (\lambda x+(1-\lambda) y)^T A (\lambda x+(1-\lambda) y) &= \lambda^2 x^T A x + (1-\lambda)^2 y^T A y + 2\lambda (1-\lambda) x^T A y \\ &= \lambda x^T A x + (1-\lambda) y^T A y - \lambda(1-\lambda)(x-y)^T A (x-y). \end{aligned}$$
The remaining part $b^T x+c$ is affine, so it satisfies $b^T(\lambda x+(1-\lambda) y)+c=\lambda(b^T x+c)+(1-\lambda)(b^T y+c)$. Adding the two gives
$$\begin{aligned} f(\lambda x+(1-\lambda) y) &= \lambda f(x)+(1-\lambda) f(y)-\lambda(1-\lambda)(x-y)^T A (x-y) \\ &\leq \lambda f(x)+(1-\lambda) f(y) \\ &\leq 0, \end{aligned}$$
where the first inequality uses $A \succeq \mathbf{0}$, which gives $(x-y)^T A (x-y) \geq 0$, and the second uses $x, y \in S$, i.e., $f(x) \leq 0$ and $f(y) \leq 0$.

The converse is not true: $S$ can be convex even when $A$ is not positive semidefinite. For example, with $n=1$, $A=-1$, $b=0$, and $c=0$ we get $S=\{x \in \mathbb{R}: -x^2 \leq 0\}=\mathbb{R}$, which is convex.
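The key algebraic fact behind convexity of $S$ when $A \succeq \mathbf{0}$ is the identity $f(\lambda x+(1-\lambda)y)=\lambda f(x)+(1-\lambda)f(y)-\lambda(1-\lambda)(x-y)^TA(x-y)$ for $f(x)=x^TAx+b^Tx+c$. The following sketch checks this identity, and the resulting membership of convex combinations in $S$, on random data. The particular $A$, $b$, $c$ and the sampling ranges are assumed for illustration only.

```python
import random

def quad_form(A, u):
    """u^T A u for a square matrix A given as nested lists."""
    n = len(u)
    return sum(u[i] * A[i][j] * u[j] for i in range(n) for j in range(n))

def f(A, b, c, x):
    """f(x) = x^T A x + b^T x + c."""
    return quad_form(A, x) + sum(bi * xi for bi, xi in zip(b, x)) + c

# Assumed data: A is symmetric with trace 5 > 0 and determinant 5 > 0,
# hence positive definite.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [-1.0, 0.5]
c = -4.0

rng = random.Random(1)
max_gap = 0.0          # largest deviation from the identity
convex_ok = True       # no convex combination of points in S left S

for _ in range(2000):
    x = [rng.uniform(-3.0, 3.0) for _ in range(2)]
    y = [rng.uniform(-3.0, 3.0) for _ in range(2)]
    lam = rng.random()
    mid = [lam * xi + (1.0 - lam) * yi for xi, yi in zip(x, y)]
    diff = [xi - yi for xi, yi in zip(x, y)]

    lhs = f(A, b, c, mid)
    rhs = (lam * f(A, b, c, x) + (1.0 - lam) * f(A, b, c, y)
           - lam * (1.0 - lam) * quad_form(A, diff))
    max_gap = max(max_gap, abs(lhs - rhs))

    if f(A, b, c, x) <= 0.0 and f(A, b, c, y) <= 0.0 and lhs > 1e-9:
        convex_ok = False

print(max_gap, convex_ok)
```

The identity holds for any symmetric $A$; positive semidefiniteness is only used to drop the last term, which is why the converse can fail.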
