## Principal Component Analysis: Truncation Types

Given the rotated PCA loadings $Z$, we introduce the truncation operation of SPCArt, $T_{\lambda}(Z_{i})$, where $T_{\lambda}$ is one of the following four types: T-$\ell_{1}$, soft thresholding $S_{\lambda}$; T-$\ell_{0}$, hard thresholding $H_{\lambda}$; T-sp, truncation by sparsity $P_{\lambda}$; and T-en, truncation by energy $E_{\lambda}$. T-$\ell_{1}$ was introduced in the previous section; we now introduce the remaining three types.

T-$\ell_{0}$: hard thresholding. Set the entries whose magnitude does not exceed the threshold $\lambda$ to zero: $X_{i}^{*}=H_{\lambda}(Z_{i}) /\|H_{\lambda}(Z_{i})\|_{2}$. $H_{\lambda}(\cdot)$ is defined in Table 2. It results from the $\ell_{0}$ penalty:
$$\min_{X, R}\|V-X R\|_{F}^{2}+\lambda^{2} \sum_{i}\|X_{i}\|_{0}, \text { s.t. } R^{T} R=I .$$
The optimization is similar to the $\ell_{1}$ case. Fixing $X$, $R^{*}=\operatorname{Polar}(X^{T} V)$. Fixing $R$, the problem becomes $\min_{X}\|V R^{T}-X\|_{F}^{2}+\lambda^{2}\|X\|_{0}$. Let $Z=V R^{T}$; the problem then decomposes into $p \times r$ entry-wise subproblems, and the solution is apparent: if $|Z_{j i}| \leq \lambda$, then $X_{j i}^{*}=0$, otherwise $X_{j i}^{*}=Z_{j i}$. Hence the solution can be expressed as $X_{i}^{*}=H_{\lambda}(Z_{i})$.
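
To make the entry-wise step explicit, here is a short worked version of one subproblem. The scalar names $z$ and $x$ are shorthand introduced only for this illustration ($z=Z_{ji}$, $x=X_{ji}$), and $\mathbb{1}[\cdot]$ denotes the indicator function.

$$\min_{x}\;(z-x)^{2}+\lambda^{2}\,\mathbb{1}[x \neq 0], \qquad \operatorname{cost}(x=0)=z^{2}, \qquad \operatorname{cost}(x=z)=\lambda^{2} .$$

Among nonzero choices, $x=z$ removes the squared error entirely and pays only the penalty $\lambda^{2}$, so the comparison is between $z^{2}$ and $\lambda^{2}$. Zero is chosen when $z^{2} \leq \lambda^{2}$, that is $|z| \leq \lambda$, which is the hard-thresholding rule stated above (ties are resolved toward zero).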

Unlike the $\ell_{1}$ case, there is no normalization for $X^{*}$. This is because, if the unit-length constraint $\|X_{i}\|_{2}=1$ is added, there is no closed-form solution. In practice, however, SPCArt still uses $X_{i}^{*}=H_{\lambda}(Z_{i}) /\|H_{\lambda}(Z_{i})\|_{2}$ for consistency, since empirically no significant difference is observed.
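
As a rough illustration of the operator described above, the following sketch applies hard thresholding to a single column and then normalizes it. The function names `hard_threshold` and `normalize`, the example vector, and the value of `lam` are assumptions made for this example; they are not taken from the SPCArt implementation.

```python
import math

def hard_threshold(z, lam):
    """H_lambda: zero every entry whose magnitude is at most lam, keep the rest."""
    return [0.0 if abs(v) <= lam else v for v in z]

def normalize(x):
    """Scale x to unit l2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x] if norm > 0 else list(x)

# Illustrative unit-length column Z_i (values chosen by hand, not from the source).
z = [0.7, 0.5, 0.5, 0.1]
lam = 0.3

h = hard_threshold(z, lam)  # closed-form solution of the l0 subproblem
x = normalize(h)            # normalized variant that SPCArt uses in practice

print(h)
print(x)
```

Normalizing only rescales the surviving entries, so the zero pattern of `x` is the same as that of `h`.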

## Performance Analysis

This section discusses the performance bounds for each truncation type. For $X_{i}=T_{\lambda}(Z_{i}) /\|T_{\lambda}(Z_{i})\|_{2}$, the following questions are studied:

1. How much sparsity of $X_{i}$ is guaranteed?
2. How much does $X_{i}$ deviate from $Z_{i}$?
3. What is the degree of orthogonality of $X$?
4. How much variance is explained by $X$?

The derived performance bounds are functions of $\lambda$. The sparsity, orthogonality, and explained variance can be directly or indirectly controlled via $\lambda$. We first introduce some definitions.

Definition 2. For every $x \in \mathbb{R}^{p}$, the sparsity of $x$ is the proportion of zero entries: $s(x)=1-\|x\|_{0} / p$.

Definition 3. For every $z \in \mathbb{R}^{p}$, $z \neq 0$, and $x=T_{\lambda}(z) /\|T_{\lambda}(z)\|_{2}$, the deviation of $x$ from $z$ is $\sin (\theta(x, z))$, where $\theta(x, z)$ is the included angle between $x$ and $z$, $0 \leq \theta(x, z) \leq \pi / 2$. If $x=0$, $\theta(x, z)$ is defined to be $\pi / 2$.

Definition 4. For every $x, y \in \mathbb{R}^{p}$, $x \neq 0$, $y \neq 0$, the nonorthogonality between $x$ and $y$ is $|\cos (\theta(x, y))|=|x^{T} y| /(\|x\|_{2} \cdot\|y\|_{2})$, where $\theta(x, y)$ is the included angle between $x$ and $y$.

Definition 5. Given a data matrix $A \in \mathbb{R}^{n \times p}$ containing $n$ samples of dimension $p$, and any basis $X \in \mathbb{R}^{p \times r}$, $r \leq p$, the explained variance of $X$ is $EV(X)=\operatorname{tr}(X^{T} A^{T} A X)$. Let $U$ be any orthonormal basis of the subspace spanned by $X$; then the cumulative percentage of explained variance is $CPEV(X)=\operatorname{tr}(U^{T} A^{T} A U) / \operatorname{tr}(A^{T} A)$ [21].

Intuitively, a larger $\lambda$ leads to higher sparsity and larger deviation. When two truncated vectors deviate from their originally orthogonal vectors, in the worst case the nonorthogonality degrades to the 'sum' of their deviations. On the other hand, if the deviations of a sparse basis from the rotated loadings are small, we may expect that the sparse basis still represents the data well and that the explained variance stays at a level similar to that of PCA. In short, both the nonorthogonality and the explained variance depend on the deviation, and the deviation and sparsity are in turn controlled by $\lambda$. We now go into details. For the proofs of the results, please refer to [9].
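
Definitions 2 to 4 can be computed directly, as in the minimal sketch below. The helper names and the two example vectors are assumptions made for illustration. The final comparison reflects one reading of the 'sum' remark: for originally orthogonal $z_1, z_2$ whose truncations deviate by angles $\theta_1, \theta_2$ with $\theta_1+\theta_2 \leq \pi/2$, the nonorthogonality of the truncated pair should not exceed $\sin(\theta_1+\theta_2)$.

```python
import math

def norm2(v):
    return math.sqrt(sum(a * a for a in v))

def sparsity(x):
    """Definition 2: proportion of zero entries."""
    return 1.0 - sum(1 for a in x if a != 0) / len(x)

def cos_angle(x, y):
    return sum(a * b for a, b in zip(x, y)) / (norm2(x) * norm2(y))

def deviation(x, z):
    """Definition 3: sine of the angle in [0, pi/2]; 1.0 if x is all zero."""
    if norm2(x) == 0:
        return 1.0
    c = min(1.0, abs(cos_angle(x, z)))  # clamp against rounding error
    return math.sqrt(1.0 - c * c)

def nonorthogonality(x, y):
    """Definition 4: |cos| of the angle between two nonzero vectors."""
    return abs(cos_angle(x, y))

# Two orthogonal unit vectors (hand-picked) and their hard-thresholded
# versions at lambda = 0.4; angles do not depend on scale, so the
# truncated vectors are left unnormalized here.
z1, z2 = [0.8, 0.6, 0.0, 0.0], [0.36, -0.48, 0.8, 0.0]
x1, x2 = [0.8, 0.6, 0.0, 0.0], [0.0, -0.48, 0.8, 0.0]

d1, d2 = deviation(x1, z1), deviation(x2, z2)
worst_case = math.sin(math.asin(d1) + math.asin(d2))

print(sparsity(x1), sparsity(x2))
print(d1, d2)
print(nonorthogonality(x1, x2), worst_case)
```

In this example only $z_2$ loses an entry, so all of the nonorthogonality between `x1` and `x2` comes from the deviation of `x2`.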

## Sparsity and Deviation

The following results concern only a single vector of the basis. For simplicity, we denote $Z_{i}$ by $z$ and $X_{i}$ by $x$, and derive bounds on the sparsity, $s(x)$, and the deviation, $\sin (\theta(x, z))$, for each $T$. They depend on a key value, $1 / \sqrt{p}$, which is the entry value of a uniform unit vector.

Proposition 7. For T-$\ell_{0}$, the sparsity bounds are
$$\begin{cases}0 \leq s(x) \leq 1-\frac{1}{p}, & \lambda<\frac{1}{\sqrt{p}} \\ 1-\frac{1}{p \lambda^{2}}<s(x) \leq 1, & \lambda \geq \frac{1}{\sqrt{p}}\end{cases}$$
The deviation is $\sin (\theta(x, z))=\|\bar{z}\|_{2}$, where $\bar{z}$ is the truncated part: $\bar{z}_{i}=z_{i}$ if $x_{i}=0$, and $\bar{z}_{i}=0$ otherwise. The absolute bounds of the deviation are:
$$0 \leq \sin (\theta(x, z)) \leq \begin{cases}\sqrt{p-1}\, \lambda, & \lambda<\frac{1}{\sqrt{p}} \\ 1, & \lambda \geq \frac{1}{\sqrt{p}}\end{cases}$$
All the above bounds are achievable.
Since there is no sparsity guarantee when $\lambda<1 / \sqrt{p}$, $\lambda$ is usually set to $1 / \sqrt{p}$ in practice. This choice generally works well.
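
The bounds of Proposition 7 can be spot-checked numerically. The sketch below samples random unit vectors and counts how many violate the stated bounds for three values of $\lambda$ around $1/\sqrt{p}$. The function names, the dimension `p = 20`, the trial count, and the small tolerance `tol` are all assumptions chosen for this illustration, and a sampled check of this kind is not a proof.

```python
import math
import random

def hard_threshold(z, lam):
    """H_lambda: zero every entry whose magnitude is at most lam."""
    return [0.0 if abs(v) <= lam else v for v in z]

def random_unit_vector(p, rng):
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    n = math.sqrt(sum(v * v for v in g))
    return [v / n for v in g]

def check_prop7(p, lam, trials, rng, tol=1e-12):
    """Count sampled unit vectors that violate the Proposition 7 bounds."""
    violations = 0
    for _ in range(trials):
        z = random_unit_vector(p, rng)
        h = hard_threshold(z, lam)
        s = 1.0 - sum(1 for v in h if v != 0) / p
        # Deviation equals the norm of the truncated part of z.
        dev = math.sqrt(sum(v * v for v, kept in zip(z, h) if kept == 0))
        if lam < 1.0 / math.sqrt(p):
            ok = s <= 1.0 - 1.0 / p + tol and dev <= math.sqrt(p - 1) * lam + tol
        else:
            ok = s > 1.0 - 1.0 / (p * lam * lam) - tol and dev <= 1.0 + tol
        violations += not ok
    return violations

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 20
# Keys are multiples c of 1/sqrt(p); values are violation counts.
results = {c: check_prop7(p, c / math.sqrt(p), 500, rng) for c in (0.5, 1.0, 2.0)}
print(results)
```

The sparsity bound for $\lambda \geq 1/\sqrt{p}$ has a simple reason: every surviving entry satisfies $z_j^2 > \lambda^2$ and the squares sum to one, so fewer than $1/\lambda^{2}$ entries can survive.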

Proposition 8. For T-$\ell_{1}$, the bounds of $s(x)$ and the lower bound of $\sin (\theta(x, z))$ are the same as those of T-$\ell_{0}$. In addition, there are relative deviation bounds
$$\|\bar{z}\|_{2} \leq \sin (\theta(x, z))<\sqrt{\|\bar{z}\|_{2}^{2}+\lambda^{2}\|x\|_{0}}$$
It is still an open question whether T-$\ell_{1}$ has the same upper bound of deviation as T-$\ell_{0}$. By the relative lower bound, we have:

Corollary 9. The deviation of soft thresholding is at least as large as that of hard thresholding if the same $\lambda$ is applied.
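
A small numerical comparison illustrates Proposition 8 and Corollary 9 on one vector. The soft-thresholding rule used here, $S_{\lambda}(z)_j=\operatorname{sign}(z_j)\max(|z_j|-\lambda, 0)$, is the standard definition and is assumed to match the one from the previous section; the function names, example vector, and $\lambda$ are likewise illustrative.

```python
import math

def hard_threshold(z, lam):
    return [0.0 if abs(v) <= lam else v for v in z]

def soft_threshold(z, lam):
    """Standard soft thresholding: shrink magnitudes by lam, clip at zero."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in z]

def norm2(v):
    return math.sqrt(sum(a * a for a in v))

def deviation(x, z):
    """Sine of the angle between x and z; 1.0 if x is all zero."""
    if norm2(x) == 0:
        return 1.0
    c = sum(a * b for a, b in zip(x, z)) / (norm2(x) * norm2(z))
    c = min(1.0, abs(c))  # clamp against rounding error
    return math.sqrt(1.0 - c * c)

z = [0.7, 0.5, 0.5, 0.1]  # hand-picked unit vector
lam = 0.3

h = hard_threshold(z, lam)
s = soft_threshold(z, lam)

dev_hard = deviation(h, z)
dev_soft = deviation(s, z)

# Norm of the truncated part of z and the number of surviving entries.
z_bar_norm = norm2([v for v, kept in zip(z, h) if kept == 0])
nnz = sum(1 for v in s if v != 0)
upper = math.sqrt(z_bar_norm ** 2 + lam ** 2 * nnz)

print(dev_hard, dev_soft)
print(z_bar_norm, upper)
```

Both operators zero the same entries, so the extra deviation of soft thresholding comes only from shrinking the surviving entries by $\lambda$, which changes their relative proportions.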

