Category: AM221 Convex optimization

Math assignment help | AM221 Convex optimization

Statistics-lab™ can provide homework, exam, and tutoring support for Harvard's AM221 Convex optimization course (harvard.edu)!

AM221 Convex optimization course overview

This is a graduate-level course on optimization. The course covers mathematical programming and combinatorial optimization from the perspective of convex optimization, which is a central tool for solving large-scale problems. In recent years convex optimization has had a profound impact on statistical machine learning, data analysis, mathematical finance, signal processing, control, theoretical computer science, and many other areas. The first part will be dedicated to the theory of convex optimization and its direct applications. The second part will focus on advanced techniques in combinatorial optimization using machinery developed in the first part.

PREREQUISITES 

Instructor: Yaron Singer (OH: Wednesdays 4-5pm, MD239)
Teaching fellows:

  • Thibaut Horel (OH: Tuesdays 4:30-5:30pm, MD’s 2nd floor lounge)
  • Rajko Radovanovic (OH: Mondays 5:30-6:30pm, MD’s 2nd floor lounge)
    Time: Monday & Wednesday, 2:30pm-4:00pm
    Room: MD119
    Sections: Wednesdays 5:30-7:00pm in MD221

AM221 Convex optimization HELP(EXAM HELP, ONLINE TUTOR)

Problem 1.

2.7 Voronoi description of halfspace. Let $a$ and $b$ be distinct points in $\mathbf{R}^n$. Show that the set of all points that are closer (in Euclidean norm) to $a$ than to $b$, i.e., $\left\{x \mid \|x-a\|_2 \leq \|x-b\|_2\right\}$, is a halfspace. Describe it explicitly as an inequality of the form $c^T x \leq d$. Draw a picture.

Solution. Since a norm is always nonnegative, we have $\|x-a\|_2 \leq \|x-b\|_2$ if and only if $\|x-a\|_2^2 \leq \|x-b\|_2^2$, so
$$
\begin{aligned}
\|x-a\|_2^2 \leq \|x-b\|_2^2 & \Longleftrightarrow (x-a)^T(x-a) \leq (x-b)^T(x-b) \\
& \Longleftrightarrow x^T x-2 a^T x+a^T a \leq x^T x-2 b^T x+b^T b \\
& \Longleftrightarrow 2(b-a)^T x \leq b^T b-a^T a .
\end{aligned}
$$
Therefore, the set is indeed a halfspace. We can take $c=2(b-a)$ and $d=b^T b-a^T a$. This makes good geometric sense: the points that are equidistant to $a$ and $b$ are given by a hyperplane whose normal is in the direction $b-a$.

To illustrate this, we can consider the case of $n=2$. Let $a=(a_1,a_2)$ and $b=(b_1,b_2)$ be two distinct points in $\mathbf{R}^2$. Then the set of all points that are closer to $a$ than $b$ is given by

$$
\left\{(x_1,x_2)\in\mathbf{R}^2 \mid (x_1-a_1)^2+(x_2-a_2)^2 \leq (x_1-b_1)^2+(x_2-b_2)^2\right\}.
$$

Expanding the inequality, we get

$$
2(b_1-a_1)x_1+2(b_2-a_2)x_2 \leq b_1^2+b_2^2-a_1^2-a_2^2 .
$$

Dividing both sides by $2\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}$, we obtain

$$
\frac{b_1-a_1}{\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}}\,x_1+\frac{b_2-a_2}{\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}}\,x_2 \leq \frac{b_1^2+b_2^2-a_1^2-a_2^2}{2\sqrt{(b_1-a_1)^2+(b_2-a_2)^2}},
$$

which is the equation of a halfplane whose boundary is the perpendicular bisector of the segment connecting $a$ and $b$.

For a concrete picture, consider $a=(0,0)$ and $b=(1,2)$: the halfspace is $2x_1+4x_2 \leq 5$, and its boundary passes through the midpoint $(0.5, 1)$ perpendicular to $b-a$.
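The original plot is not reproduced here, but it can be regenerated with a few lines of matplotlib. The following sketch (added for illustration; not part of the original assignment) shades the halfspace $c^T x \leq d$ and marks $a$ and $b$:

```python
import numpy as np
import matplotlib.pyplot as plt

# Halfspace {x : c^T x <= d} of points closer to a than to b,
# with c = 2(b - a) and d = b^T b - a^T a.  Example: a = (0, 0), b = (1, 2).
a = np.array([0.0, 0.0])
b = np.array([1.0, 2.0])
c = 2 * (b - a)
d = b @ b - a @ a

# Shade the halfspace on a grid; its boundary is the perpendicular
# bisector of the segment from a to b.
xs, ys = np.meshgrid(np.linspace(-2, 3, 400), np.linspace(-2, 3, 400))
inside = (c[0] * xs + c[1] * ys <= d).astype(float)
plt.contourf(xs, ys, inside, levels=[0.5, 1.5], alpha=0.3)
plt.plot(*a, "o", label="a")
plt.plot(*b, "s", label="b")
plt.gca().set_aspect("equal")
plt.legend()
plt.show()
```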

Problem 2.

Which of the following sets $S$ are polyhedra? If possible, express $S$ in the form $S=\{x \mid A x \preceq b,\ F x=g\}$.
(a) $S=\left\{y_1 a_1+y_2 a_2 \mid -1 \leq y_1 \leq 1,\ -1 \leq y_2 \leq 1\right\}$, where $a_1, a_2 \in \mathbf{R}^n$.
(b) $S=\left\{x \in \mathbf{R}^n \mid x \succeq 0,\ 1^T x=1,\ \sum_{i=1}^n x_i a_i=b_1,\ \sum_{i=1}^n x_i a_i^2=b_2\right\}$, where $a_1, \ldots, a_n \in \mathbf{R}$ and $b_1, b_2 \in \mathbf{R}$.
(c) $S=\left\{x \in \mathbf{R}^n \mid x \succeq 0,\ x^T y \leq 1 \text{ for all } y \text{ with } \|y\|_2=1\right\}$.
(d) $S=\left\{x \in \mathbf{R}^n \mid x \succeq 0,\ x^T y \leq 1 \text{ for all } y \text{ with } \sum_{i=1}^n\left|y_i\right|=1\right\}$.

Solution.
(a) $S$ is a polyhedron. It is the parallelogram with corners $a_1+a_2$, $a_1-a_2$, $-a_1+a_2$, $-a_1-a_2$ (this is easy to see by sketching an example in $\mathbf{R}^2$).

For simplicity we assume that $a_1$ and $a_2$ are independent. We can express $S$ as the intersection of three sets:

  • $S_1$ : the plane defined by $a_1$ and $a_2$
  • $S_2=\left\{z+y_1 a_1+y_2 a_2 \mid a_1^T z=a_2^T z=0,\ -1 \leq y_1 \leq 1\right\}$. This is a slab parallel to $a_2$ and orthogonal to $S_1$
  • $S_3=\left\{z+y_1 a_1+y_2 a_2 \mid a_1^T z=a_2^T z=0,\ -1 \leq y_2 \leq 1\right\}$. This is a slab parallel to $a_1$ and orthogonal to $S_1$
    Each of these sets can be described with linear inequalities.
  • $S_1$ can be described as
    $$
    v_k^T x=0, k=1, \ldots, n-2
    $$
    where $v_k$ are $n-2$ independent vectors that are orthogonal to $a_1$ and $a_2$ (which form a basis for the nullspace of the matrix $\left[\begin{array}{ll}a_1 & a_2\end{array}\right]^T$ ).
  • Let $c_1$ be a vector in the plane defined by $a_1$ and $a_2$, and orthogonal to $a_2$. For example, we can take
    $$
    c_1=a_1-\frac{a_1^T a_2}{\left\|a_2\right\|_2^2} a_2 .
    $$
    Then $x \in S_2$ if and only if
    $$
    -\left|c_1^T a_1\right| \leq c_1^T x \leq\left|c_1^T a_1\right|
    $$
  • Similarly, let $c_2$ be a vector in the plane defined by $a_1$ and $a_2$, and orthogonal to $a_1$, e.g.,
    $$
    c_2=a_2-\frac{a_2^T a_1}{\left\|a_1\right\|_2^2} a_1 .
    $$
    Then $x \in S_3$ if and only if
    $$
    -\left|c_2^T a_2\right| \leq c_2^T x \leq\left|c_2^T a_2\right|
    $$
    Putting it all together, we can describe $S$ as the solution set of $2 n$ linear inequalities
    $$
    \begin{aligned}
    v_k^T x & \leq 0, \quad k=1, \ldots, n-2 \\
    -v_k^T x & \leq 0, \quad k=1, \ldots, n-2 \\
    c_1^T x & \leq\left|c_1^T a_1\right| \\
    -c_1^T x & \leq\left|c_1^T a_1\right| \\
    c_2^T x & \leq\left|c_2^T a_2\right| \\
    -c_2^T x & \leq\left|c_2^T a_2\right| .
    \end{aligned}
    $$
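As a numerical sanity check of this description (an added illustration with random $a_1, a_2 \in \mathbf{R}^3$, not data from the course), the sketch below builds the vectors $v_k$, $c_1$, $c_2$ and verifies that sampled points $y_1 a_1 + y_2 a_2$ with $|y_1|, |y_2| \leq 1$ satisfy all of the inequalities:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
a1, a2 = rng.standard_normal(n), rng.standard_normal(n)

# The v_k: a basis of the nullspace of [a1 a2]^T (vectors orthogonal to a1 and a2).
_, _, Vt = np.linalg.svd(np.vstack([a1, a2]))
V = Vt[2:]                                   # n - 2 rows

c1 = a1 - (a1 @ a2) / (a2 @ a2) * a2         # in span{a1, a2}, orthogonal to a2
c2 = a2 - (a2 @ a1) / (a1 @ a1) * a1         # in span{a1, a2}, orthogonal to a1

def in_S(x, tol=1e-9):
    """Check the 2n inequalities describing S."""
    return (np.all(np.abs(V @ x) <= tol)
            and abs(c1 @ x) <= abs(c1 @ a1) + tol
            and abs(c2 @ x) <= abs(c2 @ a2) + tol)

for _ in range(1000):
    y1, y2 = rng.uniform(-1.0, 1.0, 2)
    assert in_S(y1 * a1 + y2 * a2)
print("all sampled points of S satisfy the inequality description")
```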

AM221 Convex optimization HELP(EXAM HELP, ONLINE TUTOR)

Problem 1.

Bi-criterion optimization. Figure 4.11 shows the optimal trade-off curve and the set of achievable values for the bi-criterion optimization problem
$$
\text{minimize (w.r.t. } \mathbf{R}_{+}^2 \text{)} \quad \left(\|A x-b\|_2^2,\ \|x\|_2^2\right)
$$
for some $A \in \mathbf{R}^{100 \times 10}$, $b \in \mathbf{R}^{100}$. Answer the following questions using information from the plot. We denote by $x_{\mathrm{ls}}$ the solution of the least-squares problem
$$
\text{minimize } \|A x-b\|_2^2 .
$$
(a) What is $\|x_{\mathrm{ls}}\|_2$?
(b) What is $\|A x_{\mathrm{ls}}-b\|_2$?
(c) What is $\|b\|_2$?

(a) From the plot, the point on the trade-off curve corresponding to the least-squares solution has $\|x\|_2^2 \approx 1.0$. Therefore $\|x_{\mathrm{ls}}\|_2 \approx \sqrt{1.0} = 1.0$.

(b) From the plot, the least-squares point has $\|Ax-b\|_2^2 \approx 5.0$. Therefore $\|A x_{\mathrm{ls}}-b\|_2 \approx \sqrt{5.0} \approx 2.236$.

(c) Note that $\|b\|_2$ equals $\|Ax-b\|_2$ at $x=0$, so in principle it is the square root of the value of $\|Ax-b\|_2^2$ where the trade-off curve meets the axis $\|x\|_2^2=0$; if that endpoint is not visible, the plot does not determine $\|b\|_2$.
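The numbers above come from reading Figure 4.11. A trade-off curve of the same kind can be traced numerically by sweeping the scalarization $\text{minimize } \|Ax-b\|_2^2 + \gamma \|x\|_2^2$; the sketch below uses a random $A$ and $b$ (not the instance behind the figure), so its values will differ:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 10))
b = rng.standard_normal(100)

gammas = np.logspace(-4, 4, 200)
obj1, obj2 = [], []
for g in gammas:
    # Regularized least squares via the normal equations (A^T A + g I) x = A^T b.
    x = np.linalg.solve(A.T @ A + g * np.eye(10), A.T @ b)
    obj1.append(np.linalg.norm(A @ x - b) ** 2)
    obj2.append(np.linalg.norm(x) ** 2)

plt.plot(obj2, obj1)
plt.xlabel(r"$\|x\|_2^2$")
plt.ylabel(r"$\|Ax-b\|_2^2$")
plt.show()
```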

Problem 2.

Self-concordance and negative entropy.
(a) Show that the negative entropy function $x \log x$ (on $\mathbf{R}_{++}$) is not self-concordant.
(b) Show that for any $t>0$, $t x \log x-\log x$ is self-concordant (on $\mathbf{R}_{++}$).

(a) To show that the negative entropy function $f(x) = x \log x$ is not self-concordant, we check whether it satisfies the self-concordance inequality
$$
\left|f'''(x)\right| \leq 2\left(f''(x)\right)^{3/2} \quad \text{for all } x > 0 .
$$
The derivatives are
$$
f'(x) = \log x + 1, \qquad f''(x) = \frac{1}{x}, \qquad f'''(x) = -\frac{1}{x^2} .
$$
Substituting into the inequality gives
$$
\frac{1}{x^2} \leq 2\left(\frac{1}{x}\right)^{3/2} .
$$
Multiplying both sides by $x^2$, this is equivalent to
$$
1 \leq 2\sqrt{x},
$$
and squaring both sides gives
$$
1 \leq 4x ,
$$
i.e., $x \geq 1/4$. The inequality therefore fails for $0 < x < 1/4$ (for example, at $x=1/9$ we have $|f'''(x)| = 81$ while $2\left(f''(x)\right)^{3/2} = 54$), so the negative entropy function $x \log x$ is not self-concordant on $\mathbf{R}_{++}$.

(b) Now let $f(x) = t x\log x - \log x$ with $t>0$, on $\mathbf{R}_{++}$. Its derivatives are
$$
f'(x) = t(\log x + 1) - \frac{1}{x}, \qquad
f''(x) = \frac{t}{x} + \frac{1}{x^2} = \frac{tx+1}{x^2}, \qquad
f'''(x) = -\frac{t}{x^2} - \frac{2}{x^3} = -\frac{tx+2}{x^3} .
$$
Since $t > 0$ and $x > 0$ we have $f''(x) > 0$, and the self-concordance inequality $|f'''(x)| \leq 2\left(f''(x)\right)^{3/2}$ becomes
$$
\frac{tx+2}{x^3} \leq 2\,\frac{(tx+1)^{3/2}}{x^3}
\quad\Longleftrightarrow\quad
tx + 2 \leq 2\,(tx+1)^{3/2} .
$$
Setting $u = tx \geq 0$ and $g(u) = 2(1+u)^{3/2} - (u+2)$, we have $g(0) = 0$ and $g'(u) = 3\sqrt{1+u} - 1 \geq 2 > 0$, so $g(u) \geq 0$ for all $u \geq 0$. The inequality therefore holds for every $x>0$, which shows that $t x \log x - \log x$ is self-concordant on $\mathbf{R}_{++}$.
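As a quick numerical sanity check of part (b) (an added illustration, not required for the proof), one can evaluate both sides of the self-concordance inequality on a grid of $t$ and $x$ values:

```python
import numpy as np

# Verify |f'''(x)| <= 2 * f''(x)^(3/2) for f(x) = t*x*log(x) - log(x) on a grid.
ts = np.logspace(-3, 3, 50)
xs = np.logspace(-3, 3, 400)
for t in ts:
    f2 = (t * xs + 1) / xs**2          # f''(x)  = (t x + 1) / x^2
    f3 = -(t * xs + 2) / xs**3         # f'''(x) = -(t x + 2) / x^3
    assert np.all(np.abs(f3) <= 2 * f2**1.5 * (1 + 1e-9))
print("self-concordance inequality holds on the sampled grid")
```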

AM221 Convex optimization HELP(EXAM HELP, ONLINE TUTOR)

Problem 1.

Problem 1 (20pts). Let $A \in \mathbb{R}^{m \times n}$ be given. We are interested in finding a vector $x \in \mathbb{R}_{+}^n$ such that $A x=\mathbf{0}$ and the number of positive components of $x$ is maximized. Formulate this problem as a linear program. Justify your answer.

We want a vector $x\in\mathbb{R}_+^n$ with $Ax=\mathbf{0}$ whose number of positive components is as large as possible. The key observation is that the feasible set $\{x \succeq 0 \mid Ax = \mathbf{0}\}$ is a cone: if $x$ is feasible, then so is $\alpha x$ for every $\alpha \geq 0$, and scaling by $\alpha > 0$ does not change which components are positive. This lets us count positive components with auxiliary continuous variables $s \in \mathbb{R}^n$ and the linear program (in the variables $x$ and $s$)

$$
\begin{aligned}
\text{maximize} \quad & \mathbf{1}^T s \\
\text{subject to} \quad & A x = \mathbf{0}, \quad x \succeq 0, \\
& 0 \preceq s \preceq \mathbf{1}, \quad s \preceq x .
\end{aligned}
$$

Justification. If $x^\star$ is feasible for the original problem and has $p$ positive components, we may rescale it (using the cone property) so that every positive component is at least $1$; taking $s_i = 1$ on the support of $x^\star$ and $s_i = 0$ elsewhere gives an LP-feasible pair $(x^\star, s)$ with objective value $p$. Conversely, every LP-feasible $(x, s)$ satisfies $s_i \leq \min(x_i, 1)$, so $s_i$ can be positive only where $x_i$ is positive, and each term is at most $1$; hence $\mathbf{1}^T s$ never exceeds the number of positive components of $x$. The optimal value of the LP therefore equals the maximum number of positive components, and an optimal $x$ attains it.
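A small numerical illustration of the formulation (a sketch using scipy.optimize.linprog on a made-up matrix $A$, not part of the assignment):

```python
import numpy as np
from scipy.optimize import linprog

# Example data: Ax = 0 forces x1 = x2 and x3 = x4, so all four components
# can be positive simultaneously and the optimal value should be 4.
A = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
m, n = A.shape

# Variables are (x, s) in R^{2n}: maximize 1^T s  <=>  minimize -1^T s.
c = np.concatenate([np.zeros(n), -np.ones(n)])
A_eq = np.hstack([A, np.zeros((m, n))])          # A x = 0
b_eq = np.zeros(m)
A_ub = np.hstack([-np.eye(n), np.eye(n)])        # s - x <= 0, i.e. s <= x
b_ub = np.zeros(n)
bounds = [(0, None)] * n + [(0, 1)] * n          # x >= 0, 0 <= s <= 1

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("max number of positive components:", round(-res.fun))
print("witness x:", res.x[:n])
```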

Problem 2.

Problem 2 (25pts).
(a) (15pts). Let $S=\left\{x \in \mathbb{R}^n: x^T A x+b^T x+c \leq 0\right\}$, where $A \in \mathcal{S}^n$, $b \in \mathbb{R}^n$, and $c \in \mathbb{R}$ are given. Show that $S$ is convex if $A \succeq \mathbf{0}$. Is the converse true? Explain.
(b) (10pts). Is the set $S=\left\{X \in \mathcal{S}^n: \lambda_{\max }(X) \geq 1\right\}$ convex? Justify your answer.

(a) Let $f(x) = x^T A x + b^T x + c$, so that $S = \{x \in \mathbb{R}^n \mid f(x) \leq 0\}$ is the $0$-sublevel set of $f$.

If $A \succeq \mathbf{0}$, then $f$ is convex, since its Hessian $\nabla^2 f(x) = 2A$ is positive semidefinite everywhere. Sublevel sets of convex functions are convex: for any $x, y \in S$ and $\theta \in [0,1]$,
$$
f(\theta x + (1-\theta) y) \leq \theta f(x) + (1-\theta) f(y) \leq \theta \cdot 0 + (1-\theta) \cdot 0 = 0,
$$
so $\theta x + (1-\theta) y \in S$. Hence $S$ is convex whenever $A \succeq \mathbf{0}$.

The converse is not true. To see this, consider $n=1$, $A=-1$, $b=0$, $c=0$: then $S=\{x \in \mathbb{R} \mid -x^2 \leq 0\} = \mathbb{R}$, which is convex even though $A \prec 0$. So convexity of $S$ does not imply $A \succeq \mathbf{0}$.

(b) $S$ is not convex for $n \geq 2$. The function $\lambda_{\max}$ is convex, so $S$ is a superlevel set of a convex function, which need not be convex, and indeed it is not: for $n=2$ take $X_1 = \operatorname{diag}(1,-3)$ and $X_2 = \operatorname{diag}(-3,1)$. Both have $\lambda_{\max} = 1$, so $X_1, X_2 \in S$, but their midpoint $\tfrac{1}{2}(X_1+X_2) = \operatorname{diag}(-1,-1)$ has $\lambda_{\max} = -1 < 1$ and is not in $S$. (The same example padded with zeros works for any $n \geq 2$; for $n=1$ the set is the interval $[1,\infty)$, which is convex.)
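A quick numerical check of the counterexample in part (b) (illustrative sketch):

```python
import numpy as np

# X1 and X2 have lambda_max = 1 and so lie in S = {X : lambda_max(X) >= 1},
# but their midpoint has lambda_max = -1 < 1, so S is not convex (n = 2 here).
X1 = np.diag([1.0, -3.0])
X2 = np.diag([-3.0, 1.0])
mid = 0.5 * (X1 + X2)
for name, X in [("X1", X1), ("X2", X2), ("midpoint", mid)]:
    print(name, "lambda_max =", np.linalg.eigvalsh(X).max())
```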

AM221 Convex optimization HELP(EXAM HELP, ONLINE TUTOR)

Problem 1.

2.5 What is the distance between two parallel hyperplanes $\left\{x \in \mathbf{R}^n \mid a^T x=b_1\right\}$ and $\{x \in$ $\left.\mathbf{R}^n \mid a^T x=b_2\right\}$ ? Solution. The distance between the two hyperplanes is $\left|b_1-b_2\right| /\|a\|_2$. To see this, consider the construction in the figure below. The distance between the two hyperplanes is also the distance between the two points $x_1$ and $x_2$ where the hyperplane intersects the line through the origin and parallel to the normal vector $a$. These points are given by $$ x_1=\left(b_1 /\|a\|_2^2\right) a, \quad x_2=\left(b_2 /\|a\|_2^2\right) a, $$ and the distance is $$ \left\|x_1-x_2\right\|_2=\left|b_1-b_2\right| /\|a\|_2 . $$

To verify this in coordinates, take any point $x_1$ in the first hyperplane, so that $a^T x_1 = b_1$. Its orthogonal projection onto the second hyperplane is obtained by moving along the normal direction $a$:

$$x_2 = x_1 - \frac{a^T x_1 - b_2}{\|a\|_2^2}\, a ,$$

which indeed satisfies $a^T x_2 = a^T x_1 - (a^T x_1 - b_2) = b_2$. Since $x_1 - x_2$ is a multiple of the normal vector $a$, its length is exactly the distance between the two hyperplanes:

$$\left\|x_1 - x_2\right\|_2 = \left\|\frac{a^T x_1 - b_2}{\|a\|_2^2}\, a\right\|_2 = \frac{\left|a^T x_1 - b_2\right|}{\|a\|_2} = \frac{\left|b_1 - b_2\right|}{\|a\|_2} .$$

Therefore, the distance between the two parallel hyperplanes $\left\{x \in \mathbf{R}^n \mid a^T x=b_1\right\}$ and $\left\{x \in \mathbf{R}^n \mid a^T x=b_2\right\}$ is $\left|b_1-b_2\right| / \|a\|_2$.
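A short numerical confirmation (random $a$ and arbitrary $b_1, b_2$, added purely as an illustration):

```python
import numpy as np

# The distance between {a^T x = b1} and {a^T x = b2} should equal |b1 - b2| / ||a||_2.
rng = np.random.default_rng(2)
a = rng.standard_normal(5)
b1, b2 = 1.3, -0.7

x1 = (b1 / (a @ a)) * a                     # a point on the first hyperplane
x2 = x1 - ((a @ x1 - b2) / (a @ a)) * a     # its orthogonal projection onto the second
print(np.linalg.norm(x1 - x2), abs(b1 - b2) / np.linalg.norm(a))   # the two numbers agree
```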

Problem 2.

When does one halfspace contain another? Give conditions under which
$$
\left\{x \mid a^T x \leq b\right\} \subseteq \left\{x \mid \tilde{a}^T x \leq \tilde{b}\right\}
$$
(where $a \neq 0, \tilde{a} \neq 0$ ). Also find the conditions under which the two halfspaces are equal.

Solution. Let $\mathcal{H}=\left\{x \mid a^T x \leq b\right\}$ and $\tilde{\mathcal{H}}=\left\{x \mid \tilde{a}^T x \leq \tilde{b}\right\}$. The conditions are:

  • $\mathcal{H} \subseteq \tilde{\mathcal{H}}$ if and only if there exists a $\lambda>0$ such that $\tilde{a}=\lambda a$ and $\tilde{b} \geq \lambda b$.
  • $\mathcal{H}=\tilde{\mathcal{H}}$ if and only if there exists a $\lambda>0$ such that $\tilde{a}=\lambda a$ and $\tilde{b}=\lambda b$.
    Let us prove the first condition. The condition is clearly sufficient: if $\tilde{a}=\lambda a$ and $\tilde{b} \geq \lambda b$ for some $\lambda>0$, then
    $$
    a^T x \leq b \Longrightarrow \lambda a^T x \leq \lambda b \Longrightarrow \tilde{a}^T x \leq \tilde{b}
    $$
    i.e., $\mathcal{H} \subseteq \tilde{\mathcal{H}}$.
    To prove necessity, we distinguish three cases. First suppose $a$ and $\tilde{a}$ are not parallel. Then we can find a $v$ with $a^T v=0$ and $\tilde{a}^T v \neq 0$. Let $\hat{x}$ be any point in $\mathcal{H}$, i.e., $a^T \hat{x} \leq b$. We have $a^T(\hat{x}+t v)=a^T \hat{x} \leq b$ for all $t \in \mathbf{R}$, so $\hat{x}+t v \in \mathcal{H}$ for every $t$. However, $\tilde{a}^T(\hat{x}+t v)=\tilde{a}^T \hat{x}+t\, \tilde{a}^T v$, and since $\tilde{a}^T v \neq 0$, we will have $\tilde{a}^T(\hat{x}+t v)>\tilde{b}$ for sufficiently large $t>0$ or sufficiently small $t<0$. In other words, if $a$ and $\tilde{a}$ are not parallel, we can find a point $\hat{x}+t v \in \mathcal{H}$ that is not in $\tilde{\mathcal{H}}$, i.e., $\mathcal{H} \nsubseteq \tilde{\mathcal{H}}$. Next suppose $a$ and $\tilde{a}$ are parallel but point in opposite directions, i.e., $\tilde{a}=\lambda a$ for some $\lambda<0$. Let $\hat{x}$ be any point in $\mathcal{H}$. Then $\hat{x}-t a \in \mathcal{H}$ for all $t \geq 0$. However, for $t$ large enough we will have $\tilde{a}^T(\hat{x}-t a)=\tilde{a}^T \hat{x}-t \lambda\|a\|_2^2>\tilde{b}$ (because $\lambda<0$), so $\hat{x}-t a \notin \tilde{\mathcal{H}}$. Again, this shows $\mathcal{H} \nsubseteq \tilde{\mathcal{H}}$.
  • Finally, we assume $\tilde{a}=\lambda a$ for some $\lambda>0$ but $\tilde{b}<\lambda b$. Consider any point $\hat{x}$ that satisfies $a^T \hat{x}=b$. Then $\tilde{a}^T \hat{x}=\lambda a^T \hat{x}=\lambda b>\tilde{b}$, so $\hat{x} \notin \tilde{\mathcal{H}}$.
  • The proof for the second part of the problem is similar.
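The sufficient direction of the first condition is also easy to confirm numerically; the sketch below (random data, added for illustration only) samples points of $\mathcal{H}$ and checks that they all lie in $\tilde{\mathcal{H}}$ when $\tilde{a}=\lambda a$ and $\tilde{b} \geq \lambda b$ with $\lambda > 0$:

```python
import numpy as np

# If a~ = lam * a with lam > 0 and b~ >= lam * b, then a^T x <= b implies a~^T x <= b~.
rng = np.random.default_rng(3)
a, b = rng.standard_normal(4), 0.5
lam = 2.0
a_t, b_t = lam * a, lam * b + 1.0            # any b~ >= lam * b works

violations = 0
for _ in range(10000):
    x = 10 * rng.standard_normal(4)
    if a @ x <= b and a_t @ x > b_t:
        violations += 1
print("violations:", violations)             # expected: 0
```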

AM221 Convex optimization HELP(EXAM HELP, ONLINE TUTOR)

Problem 1.

Midpoint convexity. A set $C$ is midpoint convex if whenever two points $a, b$ are in $C$, the average or midpoint $(a+b) / 2$ is in $C$. Obviously a convex set is midpoint convex. It can be proved that under mild conditions midpoint convexity implies convexity. As a simple case, prove that if $C$ is closed and midpoint convex, then $C$ is convex.

Solution. We have to show that $\theta x+(1-\theta) y \in C$ for all $\theta \in[0,1]$ and $x, y \in C$. Let $\theta^{(k)}$ be the binary number of length $k$, i.e., a number of the form
$$
\theta^{(k)}=c_1 2^{-1}+c_2 2^{-2}+\cdots+c_k 2^{-k}
$$
with $c_i \in\{0,1\}$, closest to $\theta$. By midpoint convexity (applied $k$ times, recursively), $\theta^{(k)} x+\left(1-\theta^{(k)}\right) y \in C$. Because $C$ is closed,
$$
\lim _{k \rightarrow \infty}\left(\theta^{(k)} x+\left(1-\theta^{(k)}\right) y\right)=\theta x+(1-\theta) y \in C .
$$

To show that $C$ is convex, we need to show that for any $x, y \in C$ and $\theta \in [0,1]$, the point $\theta x + (1-\theta)y$ is also in $C$. Since $C$ is closed and $\theta$ is a fixed number, we can consider the sequence of points $(\theta^{(k)}x + (1-\theta^{(k)})y)$, where $\theta^{(k)}$ is the binary number of length $k$ closest to $\theta$.

By midpoint convexity, we know that $(\theta^{(k)}x + (1-\theta^{(k)})y) \in C$ for all $k$. Since $C$ is closed, the limit of this sequence exists and is in $C$, that is,

$$\lim_{k \rightarrow \infty} (\theta^{(k)}x + (1-\theta^{(k)})y) = \theta x + (1-\theta)y \in C.$$

Therefore, $C$ is convex.
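The dyadic construction in the proof can also be made concrete. The sketch below (an added illustration) approximates $\theta$ by $k$-bit binary numbers $\theta^{(k)}$ and shows the corresponding combinations converging to $\theta x + (1-\theta) y$:

```python
import numpy as np

def dyadic(theta, k):
    """The largest k-bit dyadic number below theta (it is within 2^-k of theta)."""
    return np.floor(theta * 2**k) / 2**k

x, y, theta = np.array([1.0, 0.0]), np.array([0.0, 2.0]), 1 / 3
target = theta * x + (1 - theta) * y
for k in [1, 4, 8, 16, 24]:
    tk = dyadic(theta, k)
    err = np.linalg.norm(tk * x + (1 - tk) * y - target)
    print(f"k={k:2d}  theta_k={tk:.8f}  error={err:.2e}")
```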


AM221 Convex optimization HELP(EXAM HELP, ONLINE TUTOR)

Problem 1.

2.1 Let $C \subseteq \mathbf{R}^n$ be a convex set, with $x_1, \ldots, x_k \in C$, and let $\theta_1, \ldots, \theta_k \in \mathbf{R}$ satisfy $\theta_i \geq 0$, $\theta_1+\cdots+\theta_k=1$. Show that $\theta_1 x_1+\cdots+\theta_k x_k \in C$. (The definition of convexity is that this holds for $k=2$; you must show it for arbitrary $k$.) Hint. Use induction on $k$.
Solution. This is readily shown by induction from the definition of convex set. We illustrate the idea for $k=3$, leaving the general case to the reader. Suppose that $x_1, x_2, x_3 \in C$, and $\theta_1+\theta_2+\theta_3=1$ with $\theta_1, \theta_2, \theta_3 \geq 0$. We will show that $y=\theta_1 x_1+\theta_2 x_2+\theta_3 x_3 \in C$. At least one of the $\theta_i$ is not equal to one; without loss of generality we can assume that $\theta_1 \neq 1$. Then we can write
$$
y=\theta_1 x_1+\left(1-\theta_1\right)\left(\mu_2 x_2+\mu_3 x_3\right)
$$
where $\mu_2=\theta_2 /\left(1-\theta_1\right)$ and $\mu_3=\theta_3 /\left(1-\theta_1\right)$. Note that $\mu_2, \mu_3 \geq 0$ and
$$
\mu_2+\mu_3=\frac{\theta_2+\theta_3}{1-\theta_1}=\frac{1-\theta_1}{1-\theta_1}=1
$$
Since $C$ is convex and $x_2, x_3 \in C$, we conclude that $\mu_2 x_2+\mu_3 x_3 \in C$. Since this point and $x_1$ are in $C, y \in C$.

The proof by induction for arbitrary $k$ is similar. Suppose $x_1, \ldots, x_k \in C$ and $\theta_1+\cdots+\theta_k=1$ with $\theta_i \geq 0$ for all $i$, and let $y = \theta_1 x_1 + \cdots + \theta_k x_k$. The case $k = 2$ is exactly the definition of convexity. Now assume the result holds for $k-1$ points. If $\theta_k = 1$, then $y = x_k \in C$ and we are done, so assume $\theta_k \neq 1$. Then we can write

$$
y = (1-\theta_k)\left(\frac{\theta_1}{1-\theta_k}\, x_1 + \cdots + \frac{\theta_{k-1}}{1-\theta_k}\, x_{k-1}\right) + \theta_k x_k .
$$

The coefficients $\theta_i/(1-\theta_k)$, $i = 1, \ldots, k-1$, are nonnegative and sum to $\frac{\theta_1 + \cdots + \theta_{k-1}}{1-\theta_k} = \frac{1-\theta_k}{1-\theta_k} = 1$, so by the induction hypothesis the point $z = \sum_{i=1}^{k-1} \frac{\theta_i}{1-\theta_k}\, x_i$ belongs to $C$. Finally, since $C$ is convex with $z, x_k \in C$, the convex combination $y = (1-\theta_k) z + \theta_k x_k$ belongs to $C$. Therefore the result holds for arbitrary $k$.
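A quick numerical illustration of the statement (an added sketch, with the unit Euclidean ball playing the role of the convex set $C$):

```python
import numpy as np

# Random points in the unit ball (a convex set) and a random convex combination:
# the combination must remain in the ball.
rng = np.random.default_rng(4)
k, n = 5, 3
pts = rng.standard_normal((k, n))
pts /= np.maximum(1.0, np.linalg.norm(pts, axis=1, keepdims=True))  # clip into the ball
theta = rng.random(k)
theta /= theta.sum()                                                # theta_i >= 0, sum = 1
y = theta @ pts
print("||y||_2 =", np.linalg.norm(y), " <= 1:", np.linalg.norm(y) <= 1 + 1e-12)
```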

Problem 2.

2.2 Show that a set is convex if and only if its intersection with any line is convex. Show that a set is affine if and only if its intersection with any line is affine. Solution. We prove the first part. The intersection of two convex sets is convex. Therefore if $S$ is a convex set, the intersection of $S$ with a line is convex. Conversely, suppose the intersection of $S$ with any line is convex. Take any two distinct points $x_1$ and $x_2 \in S$. The intersection of $S$ with the line through $x_1$ and $x_2$ is convex. Therefore convex combinations of $x_1$ and $x_2$ belong to the intersection, hence also to $S$.

To prove the second part, we first recall the definition of an affine set. A set $S$ is affine if for any two points $x_1,x_2 \in S$ and any scalar $\theta \in \mathbb{R}$, the point $\theta x_1 + (1-\theta) x_2$ also belongs to $S$.

Now, suppose that $S$ is an affine set. Consider any line $L$ with direction vector $d$ and passing through some point $p \in \mathbb{R}^n$. We can write $L$ parametrically as $\{p + td \mid t \in \mathbb{R}\}$. Let $S_L = S \cap L$ denote the intersection of $S$ with $L$.

We will show that $S_L$ is an affine set. Take any two points $x_1,x_2 \in S_L$ and any scalar $\theta \in \mathbb{R}$. We can write $x_1 = p + t_1 d$ and $x_2 = p + t_2 d$ for some $t_1,t_2 \in \mathbb{R}$. Then
$$\theta x_1 + (1-\theta) x_2 = \theta (p + t_1 d) + (1-\theta)(p + t_2 d) = p + \bigl(\theta t_1 + (1-\theta) t_2\bigr) d ,$$
which lies on $L$. Moreover, since $x_1, x_2 \in S$ and $S$ is affine, $\theta x_1 + (1-\theta) x_2 \in S$. Therefore, $\theta x_1 + (1-\theta) x_2 \in S \cap L = S_L$, and we conclude that $S_L$ is affine.

Conversely, suppose that the intersection of $S$ with any line is affine. Take any two points $x_1,x_2 \in S$ and any scalar $\theta \in \mathbb{R}$. Consider the line $L$ passing through $x_1$ and $x_2$, i.e., $L = \{x_1 + t(x_2-x_1) \mid t \in \mathbb{R}\}$. Then $\theta x_1 + (1-\theta) x_2 \in L$, and since $S \cap L$ is affine and contains both $x_1$ and $x_2$, we have $\theta x_1 + (1-\theta) x_2 \in S \cap L \subseteq S$. Therefore, $S$ is affine.
