OLET5610 - Statistics Writing, Q&A and Tutoring

Tag: OLET5610

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | MAST90085

Posted on 24 June 2022 by statistics-lab

If you are also running into difficulties with Multivariate Statistical Analysis, feel free to contact our 24/7 writing customer service via the top right corner.

Multivariate statistical analysis is regarded as a useful tool for assessing the significance of geochemical anomalies with respect to any individual variable and to the mutual influence between variables.

statistics-lab™ supports you throughout your studies abroad. It has built a reputation in Multivariate Statistical Analysis writing and guarantees reliable, high-quality and original statistics writing services. Our experts are highly experienced in Multivariate Statistical Analysis writing, so assignments of every kind in this area go without saying.

Our writing services for Multivariate Statistical Analysis and related subjects cover a wide range, including but not limited to:

Statistical Inference
Statistical Computing
Advanced Probability Theory
Advanced Mathematical Statistics
(Generalized) Linear Models
Statistical Machine Learning
Longitudinal Data Analysis
Foundations of Data Science

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | Transformations

Suppose that $X$ has pdf $f_{X}(x)$. What is the pdf of $Y=3 X$? Or if $X=\left(X_{1}, X_{2}, X_{3}\right)^{\top}$, what is the pdf of
$$
Y=\left(\begin{array}{c}
3 X_{1} \\
X_{1}-4 X_{2} \\
X_{3}
\end{array}\right) ?
$$
This is a special case of asking for the pdf of $Y$ when
$$
X=u(Y)
$$
for a one-to-one transformation $u: \mathbb{R}^{p} \rightarrow \mathbb{R}^{p}$. Define the Jacobian of $u$ as
$$
\mathcal{J}=\left(\frac{\partial x_{i}}{\partial y_{j}}\right)=\left(\frac{\partial u_{i}(y)}{\partial y_{j}}\right)
$$
and let $\operatorname{abs}(|\mathcal{J}|)$ be the absolute value of the determinant of this Jacobian. The pdf of $Y$ is given by
$$
f_{Y}(y)=\operatorname{abs}(|\mathcal{J}|) \cdot f_{X}\{u(y)\}
$$
Using this we can answer the introductory questions, namely
$$
\left(x_{1}, \ldots, x_{p}\right)^{\top}=u\left(y_{1}, \ldots, y_{p}\right)=\frac{1}{3}\left(y_{1}, \ldots, y_{p}\right)^{\top}
$$
with
$$
\mathcal{J}=\left(\begin{array}{ccc}
\frac{1}{3} & & 0 \\
& \ddots & \\
0 & & \frac{1}{3}
\end{array}\right)
$$
and hence $\operatorname{abs}(|\mathcal{J}|)=\left(\frac{1}{3}\right)^{p}$. So the pdf of $Y$ is $\frac{1}{3^{p}} f_{X}\left(\frac{y}{3}\right)$.
This introductory example is a special case of
$$
Y=\mathcal{A} X+b \text {, where } \mathcal{A} \text { is nonsingular. }
$$
The inverse transformation is
$$
X=\mathcal{A}^{-1}(Y-b)
$$
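
As a hedged illustration of the change-of-variables formula, the sketch below evaluates $f_{Y}$ for the second introductory example, $Y=\mathcal{A} X$ with $b=0$ and the rows of $\mathcal{A}$ equal to $(3,0,0)$, $(1,-4,0)$ and $(0,0,1)$. It assumes, purely for illustration, that the components of $X$ are independent standard normal variables; the function names are hypothetical, and the Jacobian of the inverse map is $\mathcal{A}^{-1}$, so $\operatorname{abs}(|\mathcal{J}|)=1 / \operatorname{abs}(|\mathcal{A}|)$.

```python
import math

# Matrix of the introductory example: Y = (3*X1, X1 - 4*X2, X3)^T, with b = 0.
A = [[3.0, 0.0, 0.0],
     [1.0, -4.0, 0.0],
     [0.0, 0.0, 1.0]]


def det_lower_triangular(m):
    """Determinant of a lower-triangular matrix: the product of its diagonal."""
    d = 1.0
    for i in range(len(m)):
        d *= m[i][i]
    return d


def solve_lower_triangular(m, y):
    """Solve m x = y by forward substitution, i.e. x = u(y) = A^{-1} y."""
    x = []
    for i in range(len(m)):
        s = sum(m[i][j] * x[j] for j in range(i))
        x.append((y[i] - s) / m[i][i])
    return x


def f_X(x):
    """Assumed density of X: independent N(0, 1) components."""
    return math.prod(math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi) for v in x)


def f_Y(y):
    """Density of Y = A X via f_Y(y) = abs(|J|) * f_X(u(y)), with |J| = 1 / |A|."""
    abs_det_J = 1.0 / abs(det_lower_triangular(A))
    return abs_det_J * f_X(solve_lower_triangular(A, y))


x0 = [0.5, -1.0, 2.0]
y0 = [sum(A[i][j] * x0[j] for j in range(3)) for i in range(3)]  # y0 = A x0
print(det_lower_triangular(A))   # determinant of A
print(f_Y(y0), f_X(x0) / 12.0)   # the two values should agree
```

Since $|\mathcal{A}|=-12$ here, the density of $Y$ at $y=\mathcal{A} x$ is the density of $X$ at $x$ scaled by $1/12$, which is what the last line compares.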

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | The Multinormal Distribution

The multinormal distribution with mean $\mu$ and covariance $\Sigma>0$ has the density
$$
f(x)=|2 \pi \Sigma|^{-1 / 2} \exp \left\{-\frac{1}{2}(x-\mu)^{\top} \Sigma^{-1}(x-\mu)\right\}
$$
We write $X \sim N_{p}(\mu, \Sigma)$.
How is this multinormal distribution with mean $\mu$ and covariance $\Sigma$ related to the multivariate standard normal $N_{p}\left(0, \mathcal{I}_{p}\right)$ ? Through a linear transformation using the results of Sect. $4.3$, as shown in the next theorem.

Theorem 4.5 Let $X \sim N_{p}(\mu, \Sigma)$ and $Y=\Sigma^{-1 / 2}(X-\mu)$ (Mahalanobis transformation). Then
$$
Y \sim N_{p}\left(0, \mathcal{I}_{p}\right),
$$
i.e. the elements $Y_{j} \in \mathbb{R}$ are independent, one-dimensional $N(0,1)$ variables.
Proof Note that $(X-\mu)^{\top} \Sigma^{-1}(X-\mu)=Y^{\top} Y$. Application of (4.45) gives $\mathcal{J}=\Sigma^{1 / 2}$, hence
$$
f_{Y}(y)=(2 \pi)^{-p / 2} \exp \left(-\frac{1}{2} y^{\top} y\right)
$$
which is by $(4.47)$ the pdf of a $N_{p}\left(0, \mathcal{I}_{p}\right)$.
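
The Mahalanobis transformation can be checked numerically in a small case. The following is a minimal sketch for $p=2$, assuming hypothetical values of $\mu$ and $\Sigma$ and using a closed-form square root that holds only for $2 \times 2$ positive definite matrices; the helper names are illustrative and not part of the text.

```python
import math


def sqrt_spd_2x2(m):
    """Symmetric square root of a 2x2 positive definite matrix.

    Uses the closed form sqrt(M) = (M + s*I) / t with s = sqrt(det M) and
    t = sqrt(trace M + 2*s), which follows from the Cayley-Hamilton theorem.
    """
    (a, b), (c, d) = m
    s = math.sqrt(a * d - b * c)
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]


def inv_2x2(m):
    """Inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


# Hypothetical parameters, chosen only for illustration.
mu = [1.0, -2.0]
Sigma = [[4.0, 1.2], [1.2, 1.0]]

Sigma_inv_half = inv_2x2(sqrt_spd_2x2(Sigma))


def mahalanobis_transform(x):
    """Y = Sigma^{-1/2} (x - mu)."""
    centred = [[x[0] - mu[0]], [x[1] - mu[1]]]
    y = matmul(Sigma_inv_half, centred)
    return [y[0][0], y[1][0]]


# Var(Y) = Sigma^{-1/2} Sigma Sigma^{-1/2} should be the identity matrix.
var_Y = matmul(matmul(Sigma_inv_half, Sigma), Sigma_inv_half)
print(var_Y)
print(mahalanobis_transform(mu))  # the mean is mapped to the origin
```

For general $p$ one would normally obtain $\Sigma^{-1/2}$ from a spectral decomposition with a linear algebra library; the closed form is used here only to keep the sketch dependency-free.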

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | Sampling Distributions and Limit Theorems

In multivariate statistics, we observe the values of a multivariate random variable $X$ and obtain a sample $\left\{x_{i}\right\}_{i=1}^{n}$, as described in Chap. 3. Under random sampling, these observations are considered to be realisations of a sequence of i.i.d. random variables $X_{1}, \ldots, X_{n}$, where each $X_{i}$ is a $p$-variate random variable which replicates the parent or population random variable $X$. Some notational confusion is hard to avoid: $X_{i}$ is not the $i$th component of $X$, but rather the $i$th replicate of the $p$-variate random variable $X$ which provides the $i$th observation $x_{i}$ of our sample.

For a given random sample $X_{1}, \ldots, X_{n}$, the idea of statistical inference is to analyse the properties of the population variable $X$. This is typically done by analysing some characteristic $\theta$ of its distribution, like the mean, covariance matrix, etc. Statistical inference in a multivariate setup is considered in more detail in Chaps. 6 and 7.

Inference can often be performed using some observable function of the sample $X_{1}, \ldots, X_{n}$, i.e. a statistic. Examples of such statistics were given in Chap. 3: the sample mean $\bar{x}$ and the sample covariance matrix $\mathcal{S}$. To get an idea of the relationship between a statistic and the corresponding population characteristic, one has to derive the sampling distribution of the statistic. The next example gives some insight into the relation of $(\bar{x}, \mathcal{S})$ to $(\mu, \Sigma)$.

Example 4.15 Consider an i.i.d. sample of $n$ random vectors $X_{i} \in \mathbb{R}^{p}$ where $\mathrm{E}\left(X_{i}\right)=\mu$ and $\operatorname{Var}\left(X_{i}\right)=\Sigma$. The sample mean $\bar{x}$ and the covariance matrix $\mathcal{S}$ have already been defined in Sect. 3.3. It is easy to prove the following results:
$$
\begin{aligned}
&\mathrm{E}(\bar{x})=n^{-1} \sum_{i=1}^{n} \mathrm{E}\left(X_{i}\right)=\mu \\
&\operatorname{Var}(\bar{x})=n^{-2} \sum_{i=1}^{n} \operatorname{Var}\left(X_{i}\right)=n^{-1} \Sigma=\mathrm{E}\left(\bar{x} \bar{x}^{\top}\right)-\mu \mu^{\top}
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{E}(\mathcal{S}) &=n^{-1} \mathrm{E}\left\{\sum_{i=1}^{n}\left(X_{i}-\bar{x}\right)\left(X_{i}-\bar{x}\right)^{\top}\right\} \\
&=n^{-1} \mathrm{E}\left\{\sum_{i=1}^{n} X_{i} X_{i}^{\top}-n \bar{x} \bar{x}^{\top}\right\} \\
&=n^{-1}\left\{n\left(\Sigma+\mu \mu^{\top}\right)-n\left(n^{-1} \Sigma+\mu \mu^{\top}\right)\right\} \\
&=\frac{n-1}{n} \Sigma .
\end{aligned}
$$
This shows in particular that $\mathcal{S}$ is a biased estimator of $\Sigma$. By contrast, $\mathcal{S}_{u t}=\frac{n}{n-1} \mathcal{S}$ is an unbiased estimator of $\Sigma$.
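
The bias factor $(n-1)/n$ can be verified exactly in a toy case. The sketch below assumes a hypothetical one-dimensional population ($p=1$) taking the values 0 and 1 with equal probability, and enumerates every possible sample of size $n=3$ instead of simulating; the names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: X takes the values 0 and 1 with probability 1/2 each,
# so mu = 1/2 and Sigma = 1/4 (here p = 1, so Sigma is a scalar).
values = [Fraction(0), Fraction(1)]
prob = Fraction(1, 2)
n = 3
Sigma = Fraction(1, 4)


def sample_cov(xs):
    """S = n^{-1} * sum (x_i - xbar)^2, the divisor-n sample variance."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / len(xs)


# Exact expectations: weighted sum over all 2^n equally likely samples.
E_S = sum(sample_cov(xs) * prob ** n for xs in product(values, repeat=n))
E_xbar = sum(sum(xs) / n * prob ** n for xs in product(values, repeat=n))

print(E_xbar)                 # equals mu
print(E_S)                    # equals (n - 1) / n * Sigma
print(E_S * n / (n - 1))      # the rescaled estimator is unbiased
```

Exact rational arithmetic is used so that the identity $\mathrm{E}(\mathcal{S})=\frac{n-1}{n} \Sigma$ holds without rounding error.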

Statistical inference often requires more than just the mean and/or the variance of a statistic. We need the sampling distribution of the statistic to derive confidence intervals or to define rejection regions in hypothesis testing for a given significance level. Theorem 4.9 gives the distribution of the sample mean for a multinormal population.


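For statistics writing and Multivariate Statistical Analysis writing and exam services, look for statistics-lab™

For statistics writing, look for statistics-lab™. statistics-lab™ supports you throughout your studies abroad.

Financial Engineering Writing

Financial engineering is the use of mathematical techniques to solve financial problems. It draws on tools and knowledge from computer science, statistics, economics and applied mathematics to address current financial problems and to design new and innovative financial products.

Nonparametric Statistics Writing

Nonparametric statistics refers to statistical methods that do not assume the data come from a prescribed model determined by a small number of parameters; examples of such models include the normal distribution model and the linear regression model.

Generalized Linear Model Exam Service

The generalized linear model (GLM) belongs to the field of statistics and is a flexible generalization of linear regression. It allows the error distribution of the response variable to be something other than a normal distribution.

The term general linear model (GLM) usually refers to conventional linear regression models for a continuous response variable given continuous and/or categorical predictors. It includes multiple linear regression, as well as ANOVA and ANCOVA (with fixed effects only).

Finite Element Method Writing

The finite element method (FEM) is a popular method for numerically solving differential equations arising in engineering and mathematical modelling. Typical problem areas include the traditional fields of structural analysis, heat transfer, fluid flow, mass transport and electromagnetic potential.

The FEM is a general numerical method for solving partial differential equations in two or three space variables (i.e. some boundary value problems). To solve a problem, the FEM subdivides a large system into smaller, simpler parts called finite elements. This is achieved by a particular space discretization in the space dimensions, implemented by constructing a mesh of the object: the numerical domain for the solution, which has a finite number of points. The finite element formulation of a boundary value problem finally results in a system of algebraic equations. The method approximates the unknown function over the domain. The simple equations that model these finite elements are then assembled into a larger system of equations that models the entire problem. The FEM then approximates a solution by minimizing an associated error function via the calculus of variations.

As a professional service provider for international students, statistics-lab has for many years offered academic services to students in popular study destinations such as the United States, the United Kingdom, Canada and Australia, including but not limited to essay writing, assignment writing, dissertation writing, report writing, group assignment writing, proposal writing, paper writing, presentation writing, computer science assignment writing, paper revision and polishing, online course assistance and exam assistance. The writing scope covers all stages of overseas study, including high school, undergraduate and postgraduate, and spans finance, economics, accounting, auditing, management and 99% of professional subjects worldwide. The writing team includes both professional native English writers and master's and doctoral students from prestigious overseas universities; every writer has strong language skills, a relevant disciplinary background and academic writing experience. We promise 100% original, 100% professional, 100% on time and 100% satisfaction.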

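Stochastic Analysis Writing

Stochastic calculus is a branch of mathematics that operates on stochastic processes. It allows a consistent theory of integration to be defined for integrals of stochastic processes with respect to stochastic processes. The field was created and started by the Japanese mathematician Kiyosi Itô during World War II.

Time Series Analysis Writing

A stochastic process is the collection of a set of random variables that depend on a parameter, which is usually time. A random variable is the quantitative expression of a random phenomenon, and its time series is a sequence of data points ordered by time of occurrence. The time interval of a time series is usually constant (e.g. 1 second, 5 minutes, 12 hours, 7 days, 1 year), so a time series can be analysed as discrete-time data. The point of studying time series data is that, in practice, one often needs to study how something changes over time. This requires studying the historical record of its past development in order to find the regularities in that development.

Regression Analysis Writing

Multiple regression analysis asymptotics belongs to the field of econometrics. It is primarily a mathematical method of statistical analysis that can analyse the mathematical relationships among influencing factors in complex situations, and it is widely applied in the natural sciences, social sciences and economics.

MATLAB Writing

MATLAB is a high-performance language for technical computing. It integrates computation, visualization and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include: math and computation; algorithm development; modelling, simulation and prototyping; data analysis, exploration and visualization; scientific and engineering graphics; and application development, including graphical user interface building. MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar noninteractive language such as C or Fortran. The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide easy access to matrix software developed by the LINPACK and EISPACK projects, which together represent the state of the art in software for matrix computation. MATLAB has evolved over a period of years with input from many users. In university environments, it is the standard instructional tool for introductory and advanced courses in mathematics, engineering and science. In industry, MATLAB is the tool of choice for high-productivity research, development and analysis. MATLAB features a family of application-specific solutions called toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn and apply specialized technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are available include signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation and many others.

R Writing	Questionnaire Design and Analysis Writing
Python Writing	Regression Analysis and Linear Models Writing
MATLAB Writing	ANOVA and Experimental Design Writing
Stata Writing	Machine Learning / Statistical Learning Writing
SPSS Writing	Econometrics Writing
EViews Writing	Time Series Analysis Writing
Excel Writing	Deep Learning Writing
SQL Writing	Data Modelling and Visualisation Writing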

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | OLET5610

Posted on 24 June 2022 by statistics-lab


Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | Distribution and Density Function

Let $X=\left(X_{1}, X_{2}, \ldots, X_{p}\right)^{\top}$ be a random vector. The cumulative distribution function (cdf) of $X$ is defined by
$$
F(x)=\mathrm{P}(X \leq x)=\mathrm{P}\left(X_{1} \leq x_{1}, X_{2} \leq x_{2}, \ldots, X_{p} \leq x_{p}\right)
$$
For continuous $X$, a nonnegative probability density function (pdf) $f$ exists such that
$$
F(x)=\int_{-\infty}^{x} f(u) d u
$$
Note that
$$
\int_{-\infty}^{\infty} f(u) d u=1
$$
Most of the integrals appearing below are multidimensional. For instance, $\int_{-\infty}^{x} f(u) d u$ means $\int_{-\infty}^{x_{p}} \ldots \int_{-\infty}^{x_{1}} f\left(u_{1}, \ldots, u_{p}\right) d u_{1} \ldots d u_{p}$. Note also that the cdf $F$ is differentiable with
$$
f(x)=\frac{\partial^{p} F(x)}{\partial x_{1} \cdots \partial x_{p}}
$$
For discrete $X$, the values of this random variable are concentrated on a countable or finite set of points $\left\{c_{j}\right\}_{j \in J}$; the probability of events of the form $\{X \in D\}$ can then be computed as
$$
\mathrm{P}(X \in D)=\sum_{\left\{j: c_{j} \in D\right\}} \mathrm{P}\left(X=c_{j}\right)
$$
If we partition $X$ as $X=\left(X_{1}, X_{2}\right)^{\top}$ with $X_{1} \in \mathbb{R}^{k}$ and $X_{2} \in \mathbb{R}^{p-k}$, then the function
$$
F_{X_{1}}\left(x_{1}\right)=\mathrm{P}\left(X_{1} \leq x_{1}\right)=F\left(x_{11}, \ldots, x_{1 k}, \infty, \ldots, \infty\right)
$$
is called the marginal cdf. $F=F(x)$ is called the joint cdf. For continuous $X$ the marginal pdf can be computed from the joint density by “integrating out” the variables not of interest:
$$
f_{X_{1}}\left(x_{1}\right)=\int_{-\infty}^{\infty} f\left(x_{1}, x_{2}\right) d x_{2}
$$
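
To make “integrating out” concrete, here is a minimal numerical sketch. It assumes a hypothetical joint density $f\left(x_{1}, x_{2}\right)=x_{1}+x_{2}$ on the unit square (not taken from the text), whose marginal is $f_{X_{1}}\left(x_{1}\right)=x_{1}+\frac{1}{2}$, and approximates the integral over $x_{2}$ with a midpoint rule.

```python
def joint_pdf(x1, x2):
    """Hypothetical joint density: f(x1, x2) = x1 + x2 on the unit square."""
    if 0.0 <= x1 <= 1.0 and 0.0 <= x2 <= 1.0:
        return x1 + x2
    return 0.0


def integrate(g, lo, hi, steps=1000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))


def marginal_pdf_x1(x1):
    """f_{X1}(x1): integrate the joint density over x2 (support is [0, 1])."""
    return integrate(lambda x2: joint_pdf(x1, x2), 0.0, 1.0)


total = integrate(marginal_pdf_x1, 0.0, 1.0, steps=200)
print(marginal_pdf_x1(0.3))  # analytic value is 0.3 + 1/2
print(total)                 # a density integrates to one
```

The integration limits are restricted to the support of the assumed density; for a density on all of $\mathbb{R}^{p}$ the infinite range would need truncation or a change of variable.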

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | Moments and Characteristic Functions

Moments: Expectation and Covariance Matrix
If $X$ is a random vector with density $f(x)$ then the expectation of $X$ is
$$
\mathrm{E} X=\left(\begin{array}{c}
\mathrm{E} X_{1} \\
\vdots \\
\mathrm{E} X_{p}
\end{array}\right)=\int x f(x) d x=\left(\begin{array}{c}
\int x_{1} f(x) d x \\
\vdots \\
\int x_{p} f(x) d x
\end{array}\right)=\mu
$$

Accordingly, the expectation of a matrix of random elements has to be understood component by component. The operation of forming expectations is linear:
$$
\mathrm{E}(\alpha X+\beta Y)=\alpha \mathrm{E} X+\beta \mathrm{E} Y
$$
If $\mathcal{A}(q \times p)$ is a matrix of real numbers, we have:
$$
\mathrm{E}(\mathcal{A} X)=\mathcal{A} E X
$$
When $X$ and $Y$ are independent,
$$
E\left(X Y^{\top}\right)=E X E Y^{\top}
$$
The matrix
$$
\operatorname{Var}(X)=\Sigma=\mathrm{E}(X-\mu)(X-\mu)^{\top}
$$
is the (theoretical) covariance matrix. We write for a vector $X$ with mean vector $\mu$ and covariance matrix $\Sigma$,
$$
X \sim(\mu, \Sigma)
$$
The $(p \times q)$ matrix
$$
\Sigma_{X Y}=\operatorname{Cov}(X, Y)=\mathrm{E}(X-\mu)(Y-\nu)^{\top}
$$
is the covariance matrix of $X \sim\left(\mu, \Sigma_{X X}\right)$ and $Y \sim\left(\nu, \Sigma_{Y Y}\right)$. Note that $\Sigma_{X Y}=\Sigma_{Y X}^{\top}$ and that $Z=\left(\begin{array}{l}X \\ Y\end{array}\right)$ has covariance $\Sigma_{Z Z}=\left(\begin{array}{ll}\Sigma_{X X} & \Sigma_{X Y} \\ \Sigma_{Y X} & \Sigma_{Y Y}\end{array}\right)$. From
$$
\operatorname{Cov}(X, Y)=\mathrm{E}\left(X Y^{\top}\right)-\mu \nu^{\top}=\mathrm{E}\left(X Y^{\top}\right)-\mathrm{E} X \mathrm{E} Y^{\top}
$$
it follows that $\operatorname{Cov}(X, Y)=0$ in the case where $X$ and $Y$ are independent. We often say that $\mu=\mathrm{E}(X)$ is the first order moment of $X$ and that $\mathrm{E}\left(X X^{\top}\right)$ provides the second order moments of $X$:
$$
\mathrm{E}\left(X X^{\top}\right)=\left\{\mathrm{E}\left(X_{i} X_{j}\right)\right\}, \text { for } i=1, \ldots, p \text { and } j=1, \ldots, p
$$
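
The identity $\operatorname{Cov}(X, Y)=\mathrm{E}\left(X Y^{\top}\right)-\mathrm{E} X \mathrm{E} Y^{\top}$ and its consequence under independence can be checked exactly in a scalar case. The sketch below assumes two hypothetical discrete variables with made-up probabilities and builds their joint distribution as a product, which is what independence means here.

```python
from fractions import Fraction

# Hypothetical marginal distributions of two independent scalar variables.
px = {0: Fraction(1, 4), 1: Fraction(3, 4)}
py = {-1: Fraction(1, 3), 2: Fraction(2, 3)}

# Under independence the joint probabilities factorise.
joint = {(x, y): px[x] * py[y] for x in px for y in py}

mu = sum(x * q for (x, _), q in joint.items())        # E X
nu = sum(y * q for (_, y), q in joint.items())        # E Y
E_xy = sum(x * y * q for (x, y), q in joint.items())  # E(X Y)

# Covariance from the centred definition and from the moment identity.
cov_centred = sum((x - mu) * (y - nu) * q for (x, y), q in joint.items())
cov_moment = E_xy - mu * nu

print(cov_centred, cov_moment)  # both are zero for independent X and Y
```

Replacing `joint` with a table that does not factorise would generally give a nonzero value, while the two covariance expressions would still agree.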

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | Properties of Conditional Expectations

Since $\mathrm{E}\left(X_{2} \mid X_{1}=x_{1}\right)$ is a function of $x_{1}$, say $h\left(x_{1}\right)$, we can define the random variable $h\left(X_{1}\right)=\mathrm{E}\left(X_{2} \mid X_{1}\right)$. The same can be done when defining the random variable $\operatorname{Var}\left(X_{2} \mid X_{1}\right)$. These two random variables share some interesting properties:
$$
\begin{aligned}
\mathrm{E}\left(X_{2}\right) &=\mathrm{E}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\} \\
\operatorname{Var}\left(X_{2}\right) &=\mathrm{E}\left\{\operatorname{Var}\left(X_{2} \mid X_{1}\right)\right\}+\operatorname{Var}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\}
\end{aligned}
$$
Example 4.8 Consider the following pdf
$$
f\left(x_{1}, x_{2}\right)=2 e^{-\frac{x_{2}}{x_{1}}} ; \quad 0<x_{1}<1, \; x_{2}>0 .
$$
It is easy to show that
$$
\begin{gathered}
f\left(x_{1}\right)=2 x_{1} \text { for } 0<x_{1}<1 ; \quad f\left(x_{2} \mid x_{1}\right)=\frac{1}{x_{1}} e^{-\frac{x_{2}}{x_{1}}} \text { for } x_{2}>0 ; \\
\mathrm{E}\left(X_{2} \mid X_{1}\right)=X_{1} \text { and } \operatorname{Var}\left(X_{2} \mid X_{1}\right)=X_{1}^{2} .
\end{gathered}
$$
Without explicitly computing $f\left(x_{2}\right)$, we can obtain:
$$
\begin{aligned}
\mathrm{E}\left(X_{2}\right) &=\mathrm{E}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\}=\mathrm{E}\left(X_{1}\right)=\frac{2}{3} \\
\operatorname{Var}\left(X_{2}\right) &=\mathrm{E}\left\{\operatorname{Var}\left(X_{2} \mid X_{1}\right)\right\}+\operatorname{Var}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\} \\
&=\mathrm{E}\left(X_{1}^{2}\right)+\operatorname{Var}\left(X_{1}\right)=\frac{2}{4}+\frac{1}{18}=\frac{10}{18}
\end{aligned}
$$
The conditional expectation $\mathrm{E}\left(X_{2} \mid X_{1}\right)$ viewed as a function $h\left(X_{1}\right)$ of $X_{1}$ (known as the regression function of $X_{2}$ on $X_{1}$ ), can be interpreted as a conditional approximation of $X_{2}$ by a function of $X_{1}$. The error term of the approximation is then given by:
$$
U=X_{2}-\mathrm{E}\left(X_{2} \mid X_{1}\right)
$$
Theorem 4.3 Let $X_{1} \in \mathbb{R}^{k}$ and $X_{2} \in \mathbb{R}^{p-k}$ and $U=X_{2}-\mathrm{E}\left(X_{2} \mid X_{1}\right)$. Then we have:

1. $\mathrm{E}(U)=0$
2. $\mathrm{E}\left(X_{2} \mid X_{1}\right)$ is the best approximation of $X_{2}$ by a function $h\left(X_{1}\right)$ of $X_{1}$ where $h: \mathbb{R}^{k} \longrightarrow \mathbb{R}^{p-k}$. “Best” is meant in the minimum mean squared error (MSE) sense, where
$$
\operatorname{MSE}(h)=\mathrm{E}\left[\left\{X_{2}-h\left(X_{1}\right)\right\}^{\top}\left\{X_{2}-h\left(X_{1}\right)\right\}\right] .
$$
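
The two iterated-moment identities of Example 4.8 can be checked by simulation. The sketch below assumes the support $0<x_{1}<1$, $x_{2}>0$ (the range on which $f\left(x_{1}\right)=2 x_{1}$ integrates to one), so that $X_{1}$ can be drawn by inverting its cdf $x_{1}^{2}$ and $X_{2}$ given $X_{1}=x_{1}$ is exponential with mean $x_{1}$. The seed and sample size are arbitrary choices.

```python
import math
import random

random.seed(0)
n = 200_000

x1_draws, x2_draws = [], []
for _ in range(n):
    # X1 has pdf 2*x1 on (0, 1), i.e. cdf x1^2, so X1 = sqrt(U) by inversion.
    x1 = math.sqrt(1.0 - random.random())  # 1 - U lies in (0, 1], avoiding x1 = 0
    # Given X1 = x1, X2 is exponential with mean x1 (rate 1 / x1).
    x2 = random.expovariate(1.0 / x1)
    x1_draws.append(x1)
    x2_draws.append(x2)


def mean(v):
    """Arithmetic mean of a list of numbers."""
    return sum(v) / len(v)


def var(v):
    """Divisor-n variance of a list of numbers."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)


# Law of total variance: E{Var(X2|X1)} + Var{E(X2|X1)} = E(X1^2) + Var(X1).
decomposed = mean([a * a for a in x1_draws]) + var(x1_draws)

print(mean(x2_draws))             # close to E(X2) = 2/3
print(var(x2_draws), decomposed)  # both close to 10/18
```

The agreement is only up to Monte Carlo error, so the printed values fluctuate with the seed and shrink toward the exact values as `n` grows.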


Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | Multiple Linear Model

Posted on 24 June 2022 by statistics-lab


The simple linear model and the analysis of variance model can be viewed as particular cases of a more general linear model in which the variation of one variable $y$ is explained by $p$ explanatory variables $x$. Let $y(n \times 1)$ and $\mathcal{X}(n \times p)$ be a vector of observations on the response variable and a data matrix on the $p$ explanatory variables. An important application of the developed theory is least squares fitting. The idea is to approximate $y$ by a linear combination $\hat{y}$ of columns of $\mathcal{X}$, i.e. $\hat{y} \in C(\mathcal{X})$. The problem is to find $\hat{\beta} \in \mathbb{R}^{p}$ such that $\hat{y}=\mathcal{X} \hat{\beta}$ is the best fit of $y$ in the least-squares sense. The linear model can be written as
$$
y=\mathcal{X} \beta+\varepsilon
$$
where $\varepsilon$ are the errors. The least squares solution is given by $\hat{\beta}$:
$$
\hat{\beta}=\arg \min _{\beta}(y-\mathcal{X} \beta)^{\top}(y-\mathcal{X} \beta)=\arg \min _{\beta} \varepsilon^{\top} \varepsilon
$$

Suppose that $\left(\mathcal{X}^{\top} \mathcal{X}\right)$ is of full rank and thus invertible. Minimising the expression (3.51) with respect to $\beta$ yields:
$$
\hat{\beta}=\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1} \mathcal{X}^{\top} y
$$
The fitted value $\hat{y}=\mathcal{X} \hat{\beta}=\mathcal{X}\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1} \mathcal{X}^{\top} y=\mathcal{P} y$ is the projection of $y$ onto $C(\mathcal{X})$ as computed in (2.47).
The least squares residuals are
$$
e=y-\hat{y}=y-\mathcal{X} \hat{\beta}=\mathcal{Q} y=\left(\mathcal{I}_{n}-\mathcal{P}\right) y
$$
The vector $e$ is the projection of $y$ onto the orthogonal complement of $C(\mathcal{X})$.

Remark 3.5 A linear model with an intercept $\alpha$ can also be written in this framework. The approximating equation is:
$$
y_{i}=\alpha+\beta_{1} x_{i 1}+\cdots+\beta_{p} x_{i p}+\varepsilon_{i} ; \quad i=1, \ldots, n .
$$
This can be written as:
$$
y=\mathcal{X}^{*} \beta^{*}+\varepsilon
$$
where $\mathcal{X}^{*}=\left(1_{n} \; \mathcal{X}\right)$ (we add a column of ones to the data). We have by (3.52):
$$
\hat{\beta}^{*}=\left(\begin{array}{l}
\hat{\alpha} \\
\hat{\beta}
\end{array}\right)=\left(\mathcal{X}^{* \top} \mathcal{X}^{*}\right)^{-1} \mathcal{X}^{* \top} y
$$
Example 3.15 Let us come back to the “classic blue” pullovers example. In Example 3.11, we considered the regression fit of the sales $X_{1}$ on the price $X_{2}$ and concluded that changing the prices had only a small influence on sales. A linear model incorporating all three variables allows us to approximate sales as a linear function of price $\left(X_{2}\right)$, advertisement $\left(X_{3}\right)$ and presence of sales assistants $\left(X_{4}\right)$ simultaneously. Adding a column of ones to the data (in order to estimate the intercept $\alpha$) leads to
$$
\hat{\alpha}=65.670 \text { and } \widehat{\beta}_{1}=-0.216, \widehat{\beta}_{2}=0.485, \widehat{\beta}_{3}=0.844
$$
The coefficient of determination is computed as before in (3.40) and is:
$$
r^{2}=1-\frac{e^{\top} e}{\sum\left(y_{i}-\bar{y}\right)^{2}}=0.907 .
$$
We conclude that the variation of $X_{1}$ is well approximated by the linear relation.
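
The least squares recipe with an intercept can be sketched in a few lines. The example below uses a small hypothetical data set (not the pullover data, which are not reproduced in this text), prepends a column of ones, solves the normal equations $\left(\mathcal{X}^{* \top} \mathcal{X}^{*}\right) \hat{\beta}^{*}=\mathcal{X}^{* \top} y$ by Gaussian elimination, and computes $r^{2}$ from the residuals. All helper names are illustrative.

```python
def transpose(m):
    """Transpose of a nested-list matrix."""
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


# Hypothetical data: n = 5 observations, p = 2 explanatory variables.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y = [3.1, 2.9, 7.2, 6.8, 11.0]

X_star = [[1.0] + row for row in X]           # prepend a column of ones
Xt = transpose(X_star)
XtX = matmul(Xt, X_star)
Xty = [sum(a * b for a, b in zip(col, y)) for col in Xt]

beta_star = solve(XtX, Xty)                   # (alpha_hat, beta_hat_1, beta_hat_2)
y_hat = [sum(c * x for c, x in zip(beta_star, row)) for row in X_star]
e = [yi - fi for yi, fi in zip(y, y_hat)]     # least squares residuals

y_bar = sum(y) / len(y)
r2 = 1.0 - sum(v * v for v in e) / sum((yi - y_bar) ** 2 for yi in y)
print(beta_star, r2)
```

Because $e$ is the projection of $y$ onto the orthogonal complement of the column space, the residuals should be orthogonal to every column of the augmented data matrix, which is a convenient sanity check on any implementation.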

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | The ANOVA Model in Matrix Notation

The simple ANOVA problem (Sect. 3.5) may also be rewritten in matrix terms. Recall the definition of a vector of ones from (2.1) and define a vector of zeros as $0_{n}$. Then construct the following $(n \times p)$ matrix (here $p=3$),
$$
\mathcal{X}=\left(\begin{array}{lll}
1_{m} & 0_{m} & 0_{m} \\
0_{m} & 1_{m} & 0_{m} \\
0_{m} & 0_{m} & 1_{m}
\end{array}\right)
$$
where $m=10$. Equation (3.41) then reads as follows.
The parameter vector is $\beta=\left(\mu_{1}, \mu_{2}, \mu_{3}\right)^{\top}$. The data set from Example 3.14 can therefore be written as a linear model $y=\mathcal{X} \beta+\varepsilon$ where $y \in \mathbb{R}^{n}$ with $n=m \cdot p$ is the stacked vector of the columns of Table 3.1. The projection into the column space $C(\mathcal{X})$ of (3.54) yields the least-squares estimator $\hat{\beta}=\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1} \mathcal{X}^{\top} y$. Note that $\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1}=(1 / 10) \mathcal{I}_{3}$ and that $\mathcal{X}^{\top} y=(106,124,151)^{\top}$ is the sum $\sum_{k=1}^{m} y_{k j}$ for each factor, i.e. the three column sums of Table 3.1. The least squares estimator is therefore the vector $\hat{\beta}_{H_{1}}=\left(\hat{\mu}_{1}, \hat{\mu}_{2}, \hat{\mu}_{3}\right)^{\top}=(10.6,12.4,15.1)^{\top}$ of sample means for each factor level $j=1,2,3$. Under the null hypothesis of equal mean values $\mu_{1}=\mu_{2}=\mu_{3}=\mu$, we estimate the parameters under the same constraints. This can be put into the form of a linear constraint:
$$
\begin{aligned}
&-\mu_{1}+\mu_{2}=0 \\
&-\mu_{1}+\mu_{3}=0
\end{aligned}
$$
This can be written as $\mathcal{A} \beta=a$, where
$$
a=\left(\begin{array}{l}
0 \\
0
\end{array}\right)
$$
and
$$
\mathcal{A}=\left(\begin{array}{lll}
-1 & 1 & 0 \\
-1 & 0 & 1
\end{array}\right)
$$
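
The structure of the ANOVA design matrix can be made explicit with a short sketch. It builds $\mathcal{X}$ for $m=10$ and $p=3$, confirms that $\mathcal{X}^{\top} \mathcal{X}$ is $m$ times the identity, and recovers the group means from the column sums $(106,124,151)$ quoted in the text. The raw observations of Table 3.1 are not available here, so only those sums are used; variable names are illustrative.

```python
m, p = 10, 3

# Design matrix: the rows of group j have a one in column j and zeros elsewhere.
X = [[1.0 if j == group else 0.0 for j in range(p)]
     for group in range(p) for _ in range(m)]

# X^T X counts the observations per group, so it equals m * I_p.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

# Column sums X^T y as quoted in the text.
Xty = [106.0, 124.0, 151.0]

# Because X^T X is diagonal, (X^T X)^{-1} X^T y is an elementwise division.
beta_hat = [Xty[j] / XtX[j][j] for j in range(p)]
print(XtX)
print(beta_hat)

# Constraint matrix for H0: mu_1 = mu_2 = mu_3, written as A beta = a with a = 0.
A = [[-1.0, 1.0, 0.0], [-1.0, 0.0, 1.0]]
print([sum(A[r][j] * beta_hat[j] for j in range(p)) for r in range(2)])
```

The last line evaluates $\mathcal{A} \hat{\beta}$ for the unconstrained estimate; it is not zero in general, which is why the estimate under the null hypothesis has to be computed subject to the constraint.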

Statistics Writing Service | Multivariate Statistical Analysis Writing and Exam Service | Multivariate Distributions

The preceding chapter showed that by using the first two moments of a multivariate distribution (the mean and the covariance matrix), a lot of information on the relationship between the variables can be made available. Only basic statistical theory was used to derive tests of independence or of linear relationships. In this chapter we give an introduction to the basic probability tools useful in statistical multivariate analysis.

Means and covariances share many interesting and useful properties, but they represent only part of the information on a multivariate distribution. Section 4.1 presents the basic probability tools used to describe a multivariate random variable, including marginal and conditional distributions and the concept of independence. In Sect. 4.2, basic properties of means and covariances (marginal and conditional ones) are derived.

Since many statistical procedures rely on transformations of a multivariate random variable, Sect. 4.3 presents the basic techniques needed to derive the distribution of transformations, with a special emphasis on linear transforms. As an important example of a multivariate random variable, Sect. 4.4 defines the multinormal distribution. It will be analysed in more detail in Chap. 5 along with most of its “companion” distributions that are useful in making multivariate statistical inferences.

The normal distribution plays a central role in statistics because it can be viewed as an approximation and limit of many other distributions. The basic justification relies on the central limit theorem presented in Sect. 4.5. We present this central theorem in the framework of sampling theory. A useful extension of this theorem is also given: it provides an approximate distribution for transformations of asymptotically normal variables. The increasing power of computers today makes it possible to consider alternative approximate sampling distributions. These are based on resampling techniques and are suitable for many general situations. Section 4.8 gives an introduction to the ideas behind bootstrap approximations.



统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Moving to Higher Dimensions

Posted on 24 June 2022 by statistics-lab


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Summary Statistics

This section focuses on the representation of basic summary statistics (means, covariances and correlations) in matrix notation, since we often apply linear transformations to data. The matrix notation allows us to derive instantaneously the corresponding characteristics of the transformed variables. The Mahalanobis transformation is a prominent example of such linear transformations.

Assume that we have observed $n$ realisations of a $p$-dimensional random variable; we have a data matrix $\mathcal{X}(n \times p)$:
$$
\mathcal{X}=\left(\begin{array}{ccc}
x_{11} & \cdots & x_{1 p} \\
\vdots & & \vdots \\
\vdots & & \vdots \\
x_{n 1} & \cdots & x_{n p}
\end{array}\right)
$$
The rows $x_{i}=\left(x_{i 1}, \ldots, x_{i p}\right) \in \mathbb{R}^{p}$ denote the $i$-th observation of a $p$-dimensional random variable $X \in \mathbb{R}^{p}$.
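As a rough illustration of what "summary statistics in matrix notation" can look like, the sketch below computes the mean vector, covariance matrix and correlation matrix of a data matrix with plain matrix products, and then shows how a linear transformation carries them over. The data are synthetic, the divisor $n$ in the covariance matrix is an assumption of this sketch (a divisor $n-1$ is equally common), and all variable names are illustrative.

```python
import numpy as np

# Synthetic data matrix X (n x p); the values are illustrative only.
rng = np.random.default_rng(0)
n, p = 8, 3
X = rng.normal(size=(n, p))

ones = np.ones((n, 1))                 # the vector 1_n
x_bar = (X.T @ ones / n).ravel()       # mean vector: n^{-1} X^T 1_n

H = np.eye(n) - ones @ ones.T / n      # centering matrix I_n - n^{-1} 1_n 1_n^T
S = X.T @ H @ X / n                    # covariance matrix, divisor n (assumption)

d = np.sqrt(np.diag(S))                # standard deviations
R = S / np.outer(d, d)                 # correlation matrix D^{-1/2} S D^{-1/2}

# A linear transformation Y = X A^T transforms the summaries directly.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
Y = X @ A.T
y_bar = A @ x_bar                      # mean vector of the transformed data
S_Y = A @ S @ A.T                      # covariance matrix of the transformed data
```

The last two lines are the point of the matrix notation: the summaries of the transformed data follow from those of the original data without touching the observations again.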

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Linear Model for Two Variables

We have looked several times now at downward and upward-sloping scatterplots. What does the eye define here as a slope? Suppose that we can construct a line corresponding to the general direction of the cloud. The sign of the slope of this line would correspond to the upward and downward directions. Call the variable on the vertical axis $Y$ and the one on the horizontal axis $X$. A slope line is a linear relationship between $X$ and $Y$ :
$$
y_{i}=\alpha+\beta x_{i}+\varepsilon_{i}, \quad i=1, \ldots, n \tag{3.27}
$$

Here, $\alpha$ is the intercept and $\beta$ is the slope of the line. The errors (or deviations from the line) are denoted as $\varepsilon_{i}$ and are assumed to have zero mean and finite variance $\sigma^{2}$. The task of finding $(\alpha, \beta)$ in (3.27) is referred to as a linear adjustment.

In Sect. 3.6 we shall derive estimators for $\alpha$ and $\beta$ more formally, and describe precisely what a “good” estimator is. For now, one may try to find a “good” estimator $(\hat{\alpha}, \hat{\beta})$ via graphical techniques. A very common numerical and statistical technique is to use the $\hat{\alpha}$ and $\hat{\beta}$ that minimise the sum of squared deviations:
$$
(\hat{\alpha}, \hat{\beta})=\arg \min _{(\alpha, \beta)} \sum_{i=1}^{n}\left(y_{i}-\alpha-\beta x_{i}\right)^{2} .
$$
The solutions to this task are the estimators:
$$
\begin{aligned}
&\hat{\beta}=\frac{s_{X Y}}{s_{X X}} \\
&\hat{\alpha}=\bar{y}-\hat{\beta} \bar{x}
\end{aligned}
$$
The variance of $\hat{\beta}$ is:
$$
\operatorname{Var}(\hat{\beta})=\frac{\sigma^{2}}{n \cdot s_{X X}} \tag{3.31}
$$
The standard error (SE) of the estimator is the square root of (3.31),
$$
\operatorname{SE}(\hat{\beta})=\{\operatorname{Var}(\hat{\beta})\}^{1 / 2}=\frac{\sigma}{\left(n \cdot s_{X X}\right)^{1 / 2}}
$$
We can use this formula to test the hypothesis that $\beta=0$. In an application the variance $\sigma^{2}$ has to be estimated by an estimator $\hat{\sigma}^{2}$ that will be given below. Under a normality assumption on the errors, the $t$-test for the hypothesis $\beta=0$ works as follows. One computes the statistic
$$
t=\frac{\hat{\beta}}{\operatorname{SE}(\hat{\beta})}
$$
and rejects the hypothesis at the $5 \%$ significance level if $|t| \geq t_{0.975 ; n-2}$, where the $97.5 \%$ quantile of the Student's $t_{n-2}$ distribution is the $95 \%$ critical value for the two-sided test. For $n \geq 30$, this quantile can be replaced by $1.96$, the $97.5 \%$ quantile of the normal distribution.
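The estimators and the test above can be sketched numerically as follows. The data are synthetic, $s_{XX}$ and $s_{XY}$ are computed with divisor $n$ (an assumption chosen so that the variance formula with $n \cdot s_{XX}$ applies), and $\hat{\sigma}^{2}$ is taken to be the residual sum of squares divided by $n-2$, which is a common choice but is not given in the text at this point.

```python
import numpy as np

# Synthetic sample with true intercept 2 and true slope 0.5 (illustrative only).
rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

x_bar, y_bar = x.mean(), y.mean()
s_xx = np.mean((x - x_bar) ** 2)              # divisor n (assumption)
s_xy = np.mean((x - x_bar) * (y - y_bar))     # divisor n (assumption)

beta_hat = s_xy / s_xx                        # slope estimate
alpha_hat = y_bar - beta_hat * x_bar          # intercept estimate

resid = y - alpha_hat - beta_hat * x
sigma2_hat = np.sum(resid ** 2) / (n - 2)     # assumed estimator of sigma^2

se_beta = np.sqrt(sigma2_hat / (n * s_xx))    # estimated SE(beta_hat)
t_stat = beta_hat / se_beta

# n >= 30, so the normal quantile 1.96 replaces t_{0.975; n-2}.
reject_h0 = abs(t_stat) >= 1.96
```

Because the synthetic slope is clearly non-zero, the hypothesis $\beta=0$ should be rejected here; with real data the same steps apply once $x$ and $y$ are replaced.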

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Simple Analysis of Variance

In a simple (i.e. one-factorial) analysis of variance (ANOVA), it is assumed that the average values of the response variable $y$ are induced by one simple factor. Suppose that this factor takes on $p$ values and that for each factor level, we have $m=n / p$ observations. The sample is of the form given in Table 3.1, where all of the observations are independent.
The goal of a simple ANOVA is to analyse the observation structure
$$
y_{k l}=\mu_{l}+\varepsilon_{k l} \text { for } k=1, \ldots, m, \text { and } l=1, \ldots, p \tag{3.41}
$$
Each factor level has a mean value $\mu_{l}$. Each observation $y_{k l}$ is assumed to be the sum of the corresponding factor mean value $\mu_{l}$ and a zero mean random error $\varepsilon_{k l}$. The linear regression model falls into this scheme with $m=1, p=n$ and $\mu_{i}=\alpha+\beta x_{i}$, where $x_{i}$ is the $i$-th level value of the factor.

Example 3.14 The “classic blue” pullover company analyses the effect of three marketing strategies:

1. advertisement in local newspaper,
2. presence of sales assistant,
3. luxury presentation in shop windows.

All of these strategies are tried in ten different shops. The resulting sale observations are given in Table 3.2.

There are $p=3$ factors and $n=m p=30$ observations in the data. The “classic blue” pullover company wants to know whether all three marketing strategies have the same mean effect or whether there are differences. Having the same effect means that all $\mu_{l}$ in (3.41) equal one value, $\mu$. The hypothesis to be tested is therefore
$$
H_{0}: \mu_{l}=\mu \text { for } l=1, \ldots, p
$$
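To make the model and the hypothesis concrete, here is a minimal sketch that estimates the level means $\mu_{l}$ and compares the fit with and without the restriction of $H_{0}$. The table is synthetic (it is not the pullover data), and the classical $F$ statistic used at the end is an assumption of this sketch, since the excerpt stops before a test statistic is introduced.

```python
import numpy as np

rng = np.random.default_rng(2)
m, p = 10, 3                       # m observations for each of p factor levels
n = m * p

# Synthetic m x p table: column l holds the observations y_{kl} at level l.
true_means = np.array([10.0, 12.0, 15.0])
Y = true_means + rng.normal(scale=1.5, size=(m, p))

mu_hat = Y.mean(axis=0)            # estimates of mu_l (one per level)
mu_hat_0 = Y.mean()                # estimate of the common mu under H0

ss_full = np.sum((Y - mu_hat) ** 2)        # residual sum of squares, model (3.41)
ss_reduced = np.sum((Y - mu_hat_0) ** 2)   # residual sum of squares under H0

# Classical one-way ANOVA F statistic (assumed here, not taken from the text).
F = ((ss_reduced - ss_full) / (p - 1)) / (ss_full / (n - p))
```

A large value of `F` indicates that forcing a single common mean fits the data much worse than separate level means.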


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|A Short Excursion into Matrix Algebra

Posted on 24 June 2022 by statistics-lab


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Derivatives

For later sections of this book, it will be useful to introduce matrix notation for derivatives of a scalar function of a vector $x$, i.e. $f(x)$, with respect to $x$. Consider $f: \mathbb{R}^{p} \rightarrow \mathbb{R}$ and a $(p \times 1)$ vector $x$. Then $\frac{\partial f(x)}{\partial x}$ is the column vector of partial derivatives $\left\{\frac{\partial f(x)}{\partial x_{j}}\right\}, j=1, \ldots, p$, and $\frac{\partial f(x)}{\partial x^{\top}}$ is the row vector of the same derivatives ($\frac{\partial f(x)}{\partial x}$ is called the gradient of $f$).

We can also introduce second order derivatives: $\frac{\partial^{2} f(x)}{\partial x \partial x^{\top}}$ is the $(p \times p)$ matrix of elements $\frac{\partial^{2} f(x)}{\partial x_{i} \partial x_{j}}, i=1, \ldots, p$ and $j=1, \ldots, p$ ($\frac{\partial^{2} f(x)}{\partial x \partial x^{\top}}$ is called the Hessian of $f$). Suppose that $a$ is a $(p \times 1)$ vector and that $\mathcal{A}=\mathcal{A}^{\top}$ is a $(p \times p)$ matrix. Then
$$
\frac{\partial a^{\top} x}{\partial x}=\frac{\partial x^{\top} a}{\partial x}=a
$$
and the gradient of the quadratic form $Q(x)=x^{\top} \mathcal{A} x$ is
$$
\frac{\partial x^{\top} \mathcal{A} x}{\partial x}=2 \mathcal{A} x \tag{2.24}
$$
The Hessian of the quadratic form $Q(x)=x^{\top} \mathcal{A} x$ is:
$$
\frac{\partial^{2} x^{\top} \mathcal{A} x}{\partial x \partial x^{\top}}=2 \mathcal{A} \tag{2.25}
$$
Example 2.8 Consider the matrix
$$
\mathcal{A}=\left(\begin{array}{ll}
1 & 2 \\
2 & 3
\end{array}\right)
$$

From formulas (2.24) and (2.25) it immediately follows that the gradient of $Q(x)=x^{\top} \mathcal{A} x$ is
$$
\frac{\partial x^{\top} \mathcal{A} x}{\partial x}=2 \mathcal{A} x=2\left(\begin{array}{ll}
1 & 2 \\
2 & 3
\end{array}\right) x=\left(\begin{array}{c}
2 x_{1}+4 x_{2} \\
4 x_{1}+6 x_{2}
\end{array}\right)
$$
and the Hessian is
$$
\frac{\partial^{2} x^{\top} \mathcal{A} x}{\partial x \partial x^{\top}}=2 \mathcal{A}=2\left(\begin{array}{ll}
1 & 2 \\
2 & 3
\end{array}\right)=\left(\begin{array}{ll}
2 & 4 \\
4 & 6
\end{array}\right)
$$
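The gradient $2 \mathcal{A} x$ and the Hessian $2 \mathcal{A}$ of the quadratic form in this example can be checked numerically. The sketch below uses central finite differences; the helper names `num_gradient` and `num_hessian`, the step sizes and the evaluation point `x0` are illustrative choices, not part of the text.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 3.0]])          # the symmetric matrix of the example

def Q(x):
    """Quadratic form Q(x) = x^T A x."""
    return x @ A @ x

def num_gradient(f, x, h=1e-5):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def num_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    p = x.size
    hess = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            ei = np.zeros(p)
            ej = np.zeros(p)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return hess

x0 = np.array([1.0, -2.0])          # arbitrary evaluation point
grad_exact = 2.0 * A @ x0           # (2*1 + 4*(-2), 4*1 + 6*(-2)) = (-6, -8)
hess_exact = 2.0 * A
grad_num = num_gradient(Q, x0)
hess_num = num_hessian(Q, x0)
```

For a quadratic function the central differences agree with the exact derivatives up to rounding error, so `grad_num` and `hess_num` should match `grad_exact` and `hess_exact` closely.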

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Partitioned Matrices

Very often we will have to consider certain groups of rows and columns of a matrix $\mathcal{A}(n \times p)$. In the case of two groups, we have
$$
\mathcal{A}=\left(\begin{array}{ll}
\mathcal{A}_{11} & \mathcal{A}_{12} \\
\mathcal{A}_{21} & \mathcal{A}_{22}
\end{array}\right)
$$
where $\mathcal{A}_{i j}\left(n_{i} \times p_{j}\right), i, j=1,2$, $n_{1}+n_{2}=n$ and $p_{1}+p_{2}=p$.
If $\mathcal{B}(n \times p)$ is partitioned accordingly, we have:
$$
\begin{aligned}
\mathcal{A}+\mathcal{B} &=\left(\begin{array}{ll}
\mathcal{A}_{11}+\mathcal{B}_{11} & \mathcal{A}_{12}+\mathcal{B}_{12} \\
\mathcal{A}_{21}+\mathcal{B}_{21} & \mathcal{A}_{22}+\mathcal{B}_{22}
\end{array}\right) \\
\mathcal{B}^{\top} &=\left(\begin{array}{ll}
\mathcal{B}_{11}^{\top} & \mathcal{B}_{21}^{\top} \\
\mathcal{B}_{12}^{\top} & \mathcal{B}_{22}^{\top}
\end{array}\right) \\
\mathcal{A} \mathcal{B}^{\top} &=\left(\begin{array}{ll}
\mathcal{A}_{11} \mathcal{B}_{11}^{\top}+\mathcal{A}_{12} \mathcal{B}_{12}^{\top} & \mathcal{A}_{11} \mathcal{B}_{21}^{\top}+\mathcal{A}_{12} \mathcal{B}_{22}^{\top} \\
\mathcal{A}_{21} \mathcal{B}_{11}^{\top}+\mathcal{A}_{22} \mathcal{B}_{12}^{\top} & \mathcal{A}_{21} \mathcal{B}_{21}^{\top}+\mathcal{A}_{22} \mathcal{B}_{22}^{\top}
\end{array}\right)
\end{aligned}
$$
An important particular case is the square matrix $\mathcal{A}(p \times p)$, partitioned in such a way that $\mathcal{A}_{11}$ and $\mathcal{A}_{22}$ are both square matrices (i.e. $n_{j}=p_{j}, j=1,2$). It can be verified that when $\mathcal{A}$ is non-singular ($\mathcal{A} \mathcal{A}^{-1}=\mathcal{I}_{p}$):
$$
\mathcal{A}^{-1}=\left(\begin{array}{ll}
\mathcal{A}^{11} & \mathcal{A}^{12} \\
\mathcal{A}^{21} & \mathcal{A}^{22}
\end{array}\right)
$$
where
$$
\left\{\begin{array}{l}
\mathcal{A}^{11}=\left(\mathcal{A}_{11}-\mathcal{A}_{12} \mathcal{A}_{22}^{-1} \mathcal{A}_{21}\right)^{-1} \stackrel{\text { def }}{=}\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \\
\mathcal{A}^{12}=-\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \mathcal{A}_{12} \mathcal{A}_{22}^{-1} \\
\mathcal{A}^{21}=-\mathcal{A}_{22}^{-1} \mathcal{A}_{21}\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \\
\mathcal{A}^{22}=\mathcal{A}_{22}^{-1}+\mathcal{A}_{22}^{-1} \mathcal{A}_{21}\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \mathcal{A}_{12} \mathcal{A}_{22}^{-1}
\end{array}\right.
$$
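The block formulas for $\mathcal{A}^{-1}$ are easy to check numerically. The following sketch builds a random non-singular matrix, assembles the four blocks $\mathcal{A}^{11}, \mathcal{A}^{12}, \mathcal{A}^{21}, \mathcal{A}^{22}$ from the expressions above and compares the result with a direct inverse. The block sizes and the way the test matrix is generated are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
p1, p2 = 2, 3                                 # sizes of the two groups
M = rng.normal(size=(p1 + p2, p1 + p2))
A = M @ M.T + np.eye(p1 + p2)                 # positive definite, hence non-singular

# Partition A into its four blocks.
A11, A12 = A[:p1, :p1], A[:p1, p1:]
A21, A22 = A[p1:, :p1], A[p1:, p1:]

A22_inv = np.linalg.inv(A22)
A11_2 = A11 - A12 @ A22_inv @ A21             # A_{11.2}
A11_2_inv = np.linalg.inv(A11_2)

# Blocks of the inverse, written with upper indices in the text.
B11 = A11_2_inv
B12 = -A11_2_inv @ A12 @ A22_inv
B21 = -A22_inv @ A21 @ A11_2_inv
B22 = A22_inv + A22_inv @ A21 @ A11_2_inv @ A12 @ A22_inv

A_inv_blocks = np.block([[B11, B12],
                         [B21, B22]])
```

Up to floating-point error, `A_inv_blocks` should coincide with `np.linalg.inv(A)`.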

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Column Space and Null Space of a Matrix

Define for $\mathcal{X}(n \times p)$
$$
\operatorname{Im}(\mathcal{X}) \stackrel{\text { def }}{=} C(\mathcal{X})=\left\{x \in \mathbb{R}^{n} \mid \exists a \in \mathbb{R}^{p} \text { so that } \mathcal{X} a=x\right\},
$$
the space generated by the columns of $\mathcal{X}$ or the column space of $\mathcal{X}$. Note that $C(\mathcal{X}) \subseteq \mathbb{R}^{n}$ and $\operatorname{dim}\{C(\mathcal{X})\}=\operatorname{rank}(\mathcal{X})=r \leq \min (n, p)$.
$$
\operatorname{Ker}(\mathcal{X}) \stackrel{\text { def }}{=} N(\mathcal{X})=\left\{y \in \mathbb{R}^{p} \mid \mathcal{X} y=0\right\}
$$
is the null space of $\mathcal{X}$. Note that $N(\mathcal{X}) \subseteq \mathbb{R}^{p}$ and that $\operatorname{dim}\{N(\mathcal{X})\}=p-r$.
Remark 2.2 $N\left(\mathcal{X}^{\top}\right)$ is the orthogonal complement of $C(\mathcal{X})$ in $\mathbb{R}^{n}$, i.e. given a vector $b \in \mathbb{R}^{n}$ it will hold that $x^{\top} b=0$ for all $x \in C(\mathcal{X})$ if and only if $b \in N\left(\mathcal{X}^{\top}\right)$.
Example 2.12 Let $\mathcal{X}=\left(\begin{array}{lll}2 & 3 & 5 \\ 4 & 6 & 7 \\ 6 & 8 & 6 \\ 8 & 2 & 4\end{array}\right)$. It is easy to show (e.g. by calculating the determinant of the submatrix formed by the first three rows, which equals $-6 \neq 0$) that $\operatorname{rank}(\mathcal{X})=3$. Hence, the column space $C(\mathcal{X})$ is a three-dimensional subspace of $\mathbb{R}^{4}$. The null space of $\mathcal{X}$ contains only the zero vector $(0,0,0)^{\top}$ and its dimension is equal to $p-\operatorname{rank}(\mathcal{X})=3-3=0$.
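The rank, the dimension of the null space and the orthogonal complement mentioned in Remark 2.2 can be computed for the matrix of this example. The sketch below uses the singular value decomposition to obtain orthonormal bases; this is one convenient numerical route and not the method used in the text.

```python
import numpy as np

X = np.array([[2.0, 3.0, 5.0],
              [4.0, 6.0, 7.0],
              [6.0, 8.0, 6.0],
              [8.0, 2.0, 4.0]])      # the (4 x 3) matrix of the example

n, p = X.shape
r = np.linalg.matrix_rank(X)         # rank(X)
dim_null = p - r                     # dim N(X) = p - r

# Full SVD: the columns of U split R^n into C(X) and its orthogonal complement.
U, s, Vt = np.linalg.svd(X)
col_basis = U[:, :r]                 # orthonormal basis of C(X)
left_null_basis = U[:, r:]           # orthonormal basis of N(X^T)

# Determinant of the submatrix formed by the first three rows.
det_top = np.linalg.det(X[:3])
```

Here `left_null_basis` has a single column, since $C(\mathcal{X})$ has dimension 3 inside $\mathbb{R}^{4}$, and `X.T @ left_null_basis` is zero up to rounding.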


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Parallel Coordinates Plots

Posted on 24 June 2022 by statistics-lab


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Parallel Coordinates Plots

PCP is a method for representing high-dimensional data, see Inselberg (1985). Instead of plotting observations in an orthogonal coordinate system, PCP draws coordinates in parallel axes and connects them with straight lines. This method helps in representing data with more than four dimensions.

One first scales all variables to $\max =1$ and $\min =0$. The coordinate index $j$ is drawn onto the horizontal axis, and the scaled value of variable $x_{i j}$ is mapped onto the vertical axis. This way of representation is very useful for high-dimensional data. It is however also sensitive to the order of the variables, since certain trends in the data can be shown more clearly in one ordering than in another.
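The scaling step described above can be sketched in a few lines. The function name `pcp_scale`, the toy data and the treatment of constant columns are assumptions made for this illustration; the actual drawing is left to whatever plotting library is at hand.

```python
import numpy as np

def pcp_scale(X):
    """Scale every column of X so that its minimum is 0 and its maximum is 1."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0   # constant columns are mapped to 0 (a choice made here)
    return (X - col_min) / col_range

# Toy data: 3 observations of 3 variables (made-up numbers).
X = np.array([[1.0, 10.0, 100.0],
              [2.0, 30.0, 300.0],
              [3.0, 20.0, 200.0]])

Z = pcp_scale(X)
j = np.arange(1, X.shape[1] + 1)       # coordinate index on the horizontal axis
polylines = [(j, z_i) for z_i in Z]    # observation i is drawn as the line through (j, Z[i, j-1])
```

Each entry of `polylines` holds the horizontal and vertical coordinates of one observation; permuting the columns of `X` changes the order of the axes and hence the appearance of the plot.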

Example 1.5 Take, once again, the observations 96-105 of the Swiss bank notes. These observations are six-dimensional, so we cannot show them in a six-dimensional Cartesian coordinate system. Using the PCP technique, however, they can be plotted on parallel axes. This is shown in Fig. 1.22.

PCP can also be used for detecting linear dependencies between variables: if, for two variables ($p=2$), all the lines are almost parallel, there is a positive linear dependence between them. In Fig. 1.23 we display the two variables weight and displacement for the car data set in Sect. 22.3. The correlation coefficient $\rho$ introduced in Sect. 3.2 is $0.9$. If all lines intersect visibly in the middle, there is evidence of a negative linear dependence between these two variables, see Fig. 1.24. In fact the correlation between the two variables mileage and weight is $\rho=-0.82$: the greater the weight, the lower the mileage.

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Hexagon Plots

This section closely follows the presentation of Lewin-Koh (2006). In geometry, a hexagon is a polygon with six edges and six vertices. Hexagon binning is a type of bivariate histogram with hexagon borders. It is useful for visualising the structure of data sets entailing a large number of observations $n$. The concept of hexagon binning is as follows:

1. The $xy$ plane over the set $(\operatorname{range}(x), \operatorname{range}(y))$ is tessellated by a regular grid of hexagons.
2. The number of points falling in each hexagon is counted.
3. The hexagons with count $>0$ are plotted by using a colour ramp or varying the radius of the hexagon in proportion to the counts.

This algorithm is extremely fast and effective for displaying the structure of data sets, even for $n \geq 10^{6}$. If the size of the grid and the cuts in the colour ramp are chosen in a clever fashion, then the structure inherent in the data should emerge in the binned plot. The same caveats apply to hexagon binning as to histograms. Variance and bias vary in opposite directions with bin width, so we have to settle for finding the value of the bin width that yields the optimal compromise between variance and bias reduction. Clearly, if we increase the size of the grid, the hexagon plot appears to be smoother, but without some reasonable criterion on hand it remains difficult to say which bin width provides the “optimal” degree of smoothness. The default number of bins suggested by standard software is 30.
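The first two steps of hexagon binning (tessellate, then count) can be sketched without any plotting library. The version below uses a common trick: a regular hexagonal grid is the union of two staggered rectangular lattices of centres, and each point is assigned to the nearer of its two candidate centres. The function name `hexbin_counts` and the parameter `nx` (number of hexagons across the $x$ range, set to 30 to mirror the default mentioned above) are assumptions of this sketch, and the hexagons are regular only if $x$ and $y$ are measured on comparable scales.

```python
import numpy as np

def hexbin_counts(x, y, nx=30):
    """Count points in a hexagonal grid with about nx hexagons across range(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    width = (x.max() - x.min()) / nx          # distance between neighbouring centres
    if width == 0:
        width = 1.0
    u = (x - x.min()) / width                 # lattice coordinates
    v = (y - y.min()) / (width * np.sqrt(3.0))

    # Lattice 1 has centres at integer (u, v); lattice 2 is shifted by (0.5, 0.5).
    u1, v1 = np.round(u), np.round(v)
    u2, v2 = np.floor(u) + 0.5, np.floor(v) + 0.5
    d1 = (u - u1) ** 2 + 3.0 * (v - v1) ** 2  # squared distances in units of width^2
    d2 = (u - u2) ** 2 + 3.0 * (v - v2) ** 2
    cu = np.where(d1 <= d2, u1, u2)
    cv = np.where(d1 <= d2, v1, v2)

    centres, counts = np.unique(np.column_stack([cu, cv]), axis=0, return_counts=True)
    centres[:, 0] = centres[:, 0] * width + x.min()                 # back to data units
    centres[:, 1] = centres[:, 1] * width * np.sqrt(3.0) + y.min()
    return centres, counts

# Synthetic example with n = 1000 points.
rng = np.random.default_rng(5)
xs = rng.normal(size=1000)
ys = 0.5 * xs + rng.normal(scale=0.5, size=1000)
centres, counts = hexbin_counts(xs, ys)
```

Only non-empty hexagons are returned, which matches the third step: the counts would then be mapped to a colour ramp or to hexagon radii by the plotting code.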

Applications to some data sets are shown as follows. The data is taken from ALLBUS (2006) [ZA No.3762]. The number of respondents is 2,946. The following nine variables have been selected to analyse the relation between each pair of variables.

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Quadratic Forms

A quadratic form $Q(x)$ is built from a symmetric matrix $\mathcal{A}(p \times p)$ and a vector $x \in \mathbb{R}^{p}$:
$$
Q(x)=x^{\top} \mathcal{A} x=\sum_{i=1}^{p} \sum_{j=1}^{p} a_{i j} x_{i} x_{j}
$$
Definiteness of Quadratic Forms and Matrices
$$
\begin{array}{ll}
Q(x)>0 \text { for all } x \neq 0 & \text { positive definite } \\
Q(x) \geq 0 \text { for all } x \neq 0 & \text { positive semidefinite }
\end{array}
$$
A matrix $\mathcal{A}$ is called positive definite (semidefinite) if the corresponding quadratic form $Q(.)$ is positive definite (semidefinite). We write $\mathcal{A}>0$ $(\geq 0)$.
Quadratic forms can always be diagonalised, as the following result shows.
Theorem 2.3 If $\mathcal{A}$ is symmetric and $Q(x)=x^{\top} \mathcal{A} x$ is the corresponding quadratic form, then there exists a transformation $x \mapsto \Gamma^{\top} x=y$ such that
$$
x^{\top} \mathcal{A} x=\sum_{i=1}^{p} \lambda_{i} y_{i}^{2}
$$
where $\lambda_{i}$ are the eigenvalues of $\mathcal{A}$.
Proof By Theorem 2.1, $\mathcal{A}=\Gamma \Lambda \Gamma^{\top}$. Setting $y=\Gamma^{\top} x$, we have that $x^{\top} \mathcal{A} x=x^{\top} \Gamma \Lambda \Gamma^{\top} x=y^{\top} \Lambda y=\sum_{i=1}^{p} \lambda_{i} y_{i}^{2}$.

Positive definiteness of quadratic forms can be deduced from positive eigenvalues.
Theorem 2.4 $\mathcal{A}>0$ if and only if all $\lambda_{i}>0, i=1, \ldots, p$.
Proof By Theorem 2.3, $x^{\top} \mathcal{A} x=\lambda_{1} y_{1}^{2}+\cdots+\lambda_{p} y_{p}^{2}$ with $y=\Gamma^{\top} x$. Since $\Gamma$ is orthogonal, $x \neq 0$ if and only if $y \neq 0$, so $0<\lambda_{1} y_{1}^{2}+\cdots+\lambda_{p} y_{p}^{2}=x^{\top} \mathcal{A} x$ for all $x \neq 0$ exactly when all $\lambda_{i}>0$.
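Theorems 2.3 and 2.4 can be illustrated numerically with an eigendecomposition. The matrix below is an arbitrary symmetric example chosen for this sketch (it happens to have one negative eigenvalue, so it is not positive definite), and the vector `x` is likewise arbitrary.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 3.0]])            # symmetric example matrix (illustrative choice)

lam, Gamma = np.linalg.eigh(A)        # A = Gamma diag(lam) Gamma^T, lam in ascending order

x = np.array([1.0, -2.0])
y = Gamma.T @ x                       # the transformation y = Gamma^T x
Q_direct = x @ A @ x                  # x^T A x
Q_diag = np.sum(lam * y ** 2)         # sum_i lambda_i y_i^2

is_positive_definite = bool(np.all(lam > 0))   # criterion of Theorem 2.4

# Along the eigenvector of the smallest eigenvalue, Q equals that eigenvalue.
x_neg = Gamma[:, 0]
Q_neg = x_neg @ A @ x_neg
```

Both ways of evaluating the quadratic form agree. Because the smallest eigenvalue is negative, `Q_neg` is negative as well, which exhibits a vector $x \neq 0$ with $Q(x)<0$.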


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Comparison of Batches

Posted on 24 June 2022 by statistics-lab


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Kernel Densities

The major difficulties of histogram estimation may be summarised in four critiques:

determination of the binwidth $h$, which controls the shape of the histogram,
choice of the bin origin $x_{0}$, which also influences to some extent the shape,
loss of information since observations are replaced by the central point of the interval in which they fall,
the underlying density function is often assumed to be smooth, but the histogram is not smooth.

Rosenblatt (1956), Whittle (1958) and Parzen (1962) developed an approach which avoids the last three difficulties. First, a smooth kernel function rather than a box is used as the basic building block. Second, the smooth function is centred directly over each observation. Let us study this refinement by supposing that $x$ is the centre value of a bin. The histogram can in fact be rewritten as
$$
\hat{f}_{h}(x)=n^{-1} h^{-1} \sum_{i=1}^{n} \boldsymbol{I}\left(\left|x-x_{i}\right| \leq \frac{h}{2}\right) \tag{1.8}
$$
If we define $K(u)=\boldsymbol{I}\left(|u| \leq \frac{1}{2}\right)$, then (1.8) changes to
$$
\hat{f}_{h}(x)=n^{-1} h^{-1} \sum_{i=1}^{n} K\left(\frac{x-x_{i}}{h}\right) .
$$
This is the general form of the kernel estimator. Allowing smoother kernel functions like the quartic kernel,
$$
K(u)=\frac{15}{16}\left(1-u^{2}\right)^{2} \boldsymbol{I}(|u| \leq 1)
$$
and computing $\hat{f}_{h}(x)$ not only at bin centres gives us the kernel density estimator. Kernel estimators can also be derived via weighted averaging of rounded points (WARPing) or by averaging histograms with different origins, see Scott (1985). Table 1.5 introduces some commonly used kernels.
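A direct implementation of the kernel estimator with the uniform and the quartic kernel might look as follows. The sample is synthetic, the bandwidth `h = 0.8` is picked by hand, and the function names are illustrative; no bandwidth selection rule is implied.

```python
import numpy as np

def uniform_kernel(u):
    """K(u) = I(|u| <= 1/2), the kernel that reproduces the histogram form."""
    return (np.abs(u) <= 0.5).astype(float)

def quartic_kernel(u):
    """K(u) = 15/16 (1 - u^2)^2 I(|u| <= 1)."""
    return 15.0 / 16.0 * (1.0 - u ** 2) ** 2 * (np.abs(u) <= 1.0)

def kde(x_grid, data, h, kernel=quartic_kernel):
    """Kernel density estimate n^{-1} h^{-1} sum_i K((x - x_i) / h) on x_grid."""
    x_grid = np.asarray(x_grid, dtype=float)
    data = np.asarray(data, dtype=float)
    u = (x_grid[:, None] - data[None, :]) / h    # one row per grid point
    return kernel(u).sum(axis=1) / (data.size * h)

rng = np.random.default_rng(4)
data = rng.normal(size=200)                      # synthetic sample
h = 0.8                                          # bandwidth chosen by hand
grid = np.linspace(-6.0, 6.0, 1201)
f_hat = kde(grid, data, h)

# Riemann-sum check that the estimate integrates to about one.
area = f_hat.sum() * (grid[1] - grid[0])
```

Passing `kernel=uniform_kernel` gives the box-shaped estimate of (1.8) evaluated at arbitrary points, while the quartic kernel yields a smooth curve.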

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Chernoff-Flury Faces

If we are given data in numerical form, we tend to also display it numerically. This was done in the preceding sections: an observation $x_{1}=(1,2)$ was plotted as the point $(1,2)$ in a two-dimensional coordinate system. In multivariate analysis we want to understand data in low dimensions (e.g. on a 2D computer screen) although the structures are hidden in high dimensions. The numerical display of data structures using coordinates therefore ends at dimensions greater than three.

If we are interested in condensing a structure into 2D elements, we have to consider alternative graphical techniques. The Chernoff-Flury faces, for example, provide such a condensation of high-dimensional information into a simple “face”. In fact faces are a simple way of graphically displaying high-dimensional data. The sizes of face elements such as pupils, eyes, upper and lower hair line, etc. are assigned to certain variables. The idea of using faces goes back to Chernoff (1973) and has been further developed by Bernhard Flury. We follow the design described in Flury and Riedwyl (1988) which uses the following characteristics.

1. right eye size
2. right pupil size
3. position of right pupil
4. right eye slant
5. horizontal position of right eye
6. vertical position of right eye
7. curvature of right eyebrow
8. density of right eyebrow
9. horizontal position of right eyebrow
10. vertical position of right eyebrow
11. right upper hair line
12. right lower hair line
13. right face line
14. darkness of right hair
15. right hair slant
16. right nose line
17. right size of mouth
18. right curvature of mouth
19-36. like 1-18, only for the left side.

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Andrews’ Curves

The basic problem of graphical displays of multivariate data is the dimensionality. Scatterplots work well up to three dimensions (if we use interactive displays). More than three dimensions have to be coded into displayable 2D or 3D structures (e.g. faces). The idea of coding and representing multivariate data by curves was suggested by Andrews (1972). Each multivariate observation $X_{i}=\left(X_{i, 1}, \ldots, X_{i, p}\right)$ is transformed into a curve as follows:
$$
f_{i}(t)=\frac{X_{i, 1}}{\sqrt{2}}+X_{i, 2} \sin (t)+X_{i, 3} \cos (t)+X_{i, 4} \sin (2 t)+X_{i, 5} \cos (2 t)+\cdots
$$
so that the observation represents the coefficients of a so-called Fourier series $(t \in[-\pi, \pi])$.
Suppose that we have three-dimensional observations: $X_{1}=(0,0,1)$, $X_{2}=(1,0,0)$ and $X_{3}=(0,1,0)$. Here $p=3$ and the following representations correspond to the Andrews' curves:
$$
\begin{aligned}
&f_{1}(t)=\cos (t) \\
&f_{2}(t)=\frac{1}{\sqrt{2}} \quad \text { and } \\
&f_{3}(t)=\sin (t)
\end{aligned}
$$
These curves are indeed quite distinct, since the observations $X_{1}, X_{2}$, and $X_{3}$ are the 3D unit vectors: each observation has mass only in one of the three dimensions. The order of the variables plays an important role.
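A small sketch can reproduce the three curves of this example. It assumes the usual form of Andrews' curves, in which the first variable is divided by $\sqrt{2}$ and the following variables alternate between sine and cosine terms of increasing frequency; this form is consistent with $f_{1}, f_{2}, f_{3}$ above. The function name `andrews_curve` is illustrative.

```python
import numpy as np

def andrews_curve(obs, t):
    """Evaluate the Andrews curve of one observation at the points t.

    Assumed form: obs[0]/sqrt(2) + obs[1] sin(t) + obs[2] cos(t)
                  + obs[3] sin(2t) + obs[4] cos(2t) + ...
    """
    obs = np.asarray(obs, dtype=float)
    t = np.asarray(t, dtype=float)
    f = np.full_like(t, obs[0] / np.sqrt(2.0))
    for j in range(1, obs.size):
        k = (j + 1) // 2                  # frequency: 1, 1, 2, 2, 3, ...
        if j % 2 == 1:
            f = f + obs[j] * np.sin(k * t)   # variables 2, 4, ... enter with sine
        else:
            f = f + obs[j] * np.cos(k * t)   # variables 3, 5, ... enter with cosine
    return f

t = np.linspace(-np.pi, np.pi, 101)
f1 = andrews_curve([0, 0, 1], t)          # cos(t)
f2 = andrews_curve([1, 0, 0], t)          # constant 1/sqrt(2)
f3 = andrews_curve([0, 1, 0], t)          # sin(t)
```

Reordering the entries of an observation changes which frequency each variable is attached to, which is why the order of the variables matters for the appearance of the curves.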

多元统计分析代考

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Kernel Densities

直方图估计的主要困难可以概括为四个批评：

binwidth的确定H，它控制直方图的形状，
bin 原点的选择X0，这也在一定程度上影响了形状，
信息丢失，因为观测值被它们所在区间的中心点所取代，
通常假设底层密度函数是平滑的，但直方图并不平滑。

Rosenblatt (1956)、Whittle (1958) 和 Parzen (1962) 开发了一种方法来避免最后三个困难。首先，使用平滑核函数而不是盒子作为基本构建块。其次，平滑函数直接以每个观察为中心。让我们通过假设X是 bin 的中心值。直方图实际上可以重写为

F^H(X)=n−1H−1∑一世=1n我(|X−X一世|≤H2)
如果我们定义ķ(在)=我(|在|≤12), 然后 (1.8) 变为

F^H(X)=n−1H−1∑一世=1nķ(X−X一世H).
这是核估计器的一般形式。允许更平滑的核函数，如四次核，

ķ(在)=1516(1−在2)2我(|在|≤1)
和计算X不仅在 bin 中心为我们提供了核密度估计器。内核估计量也可以通过圆角点的加权平均（WARPing）或通过对不同来源的直方图进行平均得出，参见 Scott (1985)。桌子1.5介绍一些常用的内核。

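Multivariate Statistical Analysis | Chernoff-Flury Faces

If we are given data in numerical form, we tend to display it numerically as well. This was done in the preceding sections: an observation $x_{1}=(1,2)$ was plotted as the point $(1,2)$ in a two-dimensional coordinate system. In multivariate analysis we want to understand data in low dimensions (e.g. on a 2D computer screen) although the structures are hidden in high dimensions. The numerical display of data structures using coordinates therefore ends at dimensions greater than three.

If we are interested in condensing a structure into 2D elements, we have to consider alternative graphical techniques. The Chernoff-Flury faces, for example, condense high-dimensional information into a simple "face". In fact, faces are a simple way to display high-dimensional data graphically. The sizes of face elements such as pupils, eyes, and the upper and lower hair lines are assigned to certain variables. The idea of using faces goes back to Chernoff (1973) and has been further developed by Bernhard Flury. We follow the design described in Flury and Riedwyl (1988), which uses the following characteristics.

1 right eye size
2 right pupil size
3 position of right pupil
4 right eye slant
5 horizontal position of right eye
6 vertical position of right eye
7 curvature of right eyebrow
8 density of right eyebrow
9 horizontal position of right eyebrow
10 vertical position of right eyebrow
11 right upper hair line
12 right lower hair line
13 right face line
14 darkness of right hair
15 right hair slant
16 right nose line
17 right size of mouth
18 right curvature of mouth
19-36 like 1-18, only for the left side.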



Principal Component Analysis | OLET5610

Posted on 14 June 2022 by statistics-lab


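Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest.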


Principal Component Analysis | Generalized Mean

For $p \neq 0$, the generalized mean or power mean $\mathcal{M}_{p}$ of $\left\{a_{i}>0, i=1, \ldots, N\right\}$ [20] is defined as
$$
\mathcal{M}_{p}\left\{a_{1}, \ldots, a_{N}\right\}=\left(\frac{1}{N} \sum_{i=1}^{N} a_{i}^{p}\right)^{1 / p}
$$
Figure 1 [25] shows that $\mathcal{M}_{p}\{1,2, \ldots, 10\}$ varies continuously as $p$ changes from $-10$ to $10$. The arithmetic mean, the geometric mean, and the harmonic mean are special cases of the generalized mean when $p=1$, $p \rightarrow 0$, and $p=-1$, respectively. Furthermore, the maximum and the minimum values of the numbers can also be approximated from the generalized mean by making $p \rightarrow \infty$ and $p \rightarrow-\infty$, respectively. Note that as $p$ decreases (increases), the generalized mean is more affected by the smaller (larger) numbers than the larger (smaller) ones, i.e., controlling $p$ makes it possible to adjust the contribution of each number to the generalized mean. This characteristic is useful in situations where data samples should be handled differently according to their importance, for example, when outliers are contained in the training set.

In [25], it was shown that the generalized mean of a set of positive numbers can be expressed by a nonnegative linear combination of the elements in the set as the following:
$$
\left(\frac{1}{K} \sum_{i=1}^{K} a_{i}^{p}\right)^{1 / p}=c_{1} a_{1}+\cdots+c_{K} a_{K}
$$
Each $c_{i}$ in this equation can be obtained by differentiating the left-hand side with respect to $a_{i}$:
$$
c_{i}=\left(\frac{1}{K} \sum_{j=1}^{K} a_{j}^{p}\right)^{\frac{1}{p}-1} \frac{a_{i}^{p-1}}{K}
$$
where $i=1, \ldots, K$. In this chapter, it is further simplified as the following:
$$
\begin{aligned}
\sum_{i=1}^{K} a_{i}^{p} &=b_{1} a_{1}+\cdots+b_{K} a_{K} \\
b_{i} &=a_{i}^{p-1}, \quad i=1, \ldots, K
\end{aligned}
$$
Note that each weight $b_{i}$ has the same value of 1 if $p=1$, where the generalized mean becomes the arithmetic mean. It is also noted that, if $p$ is less than one, the weight $b_{i}$ increases as $a_{i}$ decreases. This means that, when $p<1$, the generalized mean is more influenced by the small numbers in $\left\{a_{i}\right\}_{i=1}^{K}$, and the extent of the influence increases as $p$ decreases. This equation plays an important role in solving the optimization problems using the generalized mean.
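The special cases and the linear-combination identity can be verified on the numbers $1, \ldots, 10$. This is a small sketch; the helper names `generalized_mean` and `linear_weights` are hypothetical, and the weights use the exponent $\frac{1}{p}-1$ obtained by differentiating the generalized mean with respect to $a_{i}$.

```python
import numpy as np

def generalized_mean(a, p):
    """Power mean M_p of positive numbers a, for p != 0."""
    a = np.asarray(a, dtype=float)
    return np.mean(a ** p) ** (1.0 / p)

def linear_weights(a, p):
    """Weights c_i such that sum_i c_i * a_i equals M_p{a}."""
    a = np.asarray(a, dtype=float)
    return np.mean(a ** p) ** (1.0 / p - 1.0) * a ** (p - 1.0) / a.size

a = np.arange(1.0, 11.0)
arithmetic = generalized_mean(a, 1.0)     # equals the ordinary mean
harmonic = generalized_mean(a, -1.0)      # equals N / sum(1 / a_i)
c = linear_weights(a, 0.5)                # nonnegative weights for p = 0.5
b = a ** (0.5 - 1.0)                      # simplified weights b_i = a_i^(p-1)
```

For $p=0.5$ the weights decrease with $a_{i}$, so the small numbers in the set contribute more than they would in the arithmetic mean.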

Principal Component Analysis | Generalized Sample Mean

Most conventional PCAs assume that training samples have zero mean. To satisfy this assumption, the sample mean is subtracted from every sample, i.e., $\mathbf{x}_{i}-\mathbf{m}_{S}$ for $i=1, \ldots, N$, where $\mathbf{m}_{S}=\frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_{i}$. The conventional sample mean can be considered as the center of the samples in the least-squares sense, i.e.,
$$
\mathbf{m}_{S}=\underset{\mathbf{m}}{\arg \min } \frac{1}{N} \sum_{i=1}^{N}\left\|\mathbf{x}_{i}-\mathbf{m}\right\|_{2}^{2} . \tag{7}
$$
In (7), a small number of outliers in the training samples dominate the objective function because it is constructed from squared distances. To obtain a robust sample mean in the presence of outliers, a new optimization problem is formulated by replacing the arithmetic mean in (7) with the generalized mean:
$$
\mathbf{m}_{G}=\underset{\mathbf{m}}{\arg \min }\left(\frac{1}{N} \sum_{i=1}^{N}\left(\left\|\mathbf{x}_{i}-\mathbf{m}\right\|_{2}^{2}\right)^{p}\right)^{1 / p}
$$
This problem is equivalent to (7) if $p=1$. As mentioned in the previous subsection, the contribution of a large number to the objective function decreases as $p$ decreases. Thus, the negative effect of outliers can be alleviated if $p<1$. From now on, we call $\mathbf{m}_{G}$ the generalized sample mean. Using the fact that $x^{p}$ with $p>0$ is a monotonically increasing function of $x$ for $x>0$, this problem can be converted to
$$
\mathbf{m}_{G}=\underset{\mathbf{m}}{\arg \min } \sum_{i=1}^{N}\left(\left\|\mathbf{x}_{i}-\mathbf{m}\right\|_{2}^{2}\right)^{p} \tag{8}
$$
Although the minimization in (8) should be changed into a maximization when $p<0$, we only consider positive values of $p$ in this paper.

The necessary condition for $\mathbf{m}_{G}$ to be a local minimum is that the gradient of the objective function in (8) with respect to $\mathbf{m}$ is equal to zero, i.e.,
$$
\frac{\partial}{\partial \mathbf{m}} \sum_{i=1}^{N}\left(\left\|\mathbf{x}_{i}-\mathbf{m}\right\|_{2}^{2}\right)^{p}=0
$$
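Carrying out the differentiation shows that a stationary point satisfies $\mathbf{m}=\sum_{i} \alpha_{i} \mathbf{x}_{i} / \sum_{i} \alpha_{i}$ with weights $\alpha_{i}=\left(\left\|\mathbf{x}_{i}-\mathbf{m}\right\|_{2}^{2}\right)^{p-1}$ that themselves depend on $\mathbf{m}$. One simple way to use this is a fixed-point iteration, sketched below. This is an illustrative assumption and not necessarily the algorithm of the source; the function name, the small constant `eps` guarding against zero distances, and the toy data are hypothetical.

```python
import numpy as np

def generalized_sample_mean(X, p=0.5, n_iter=100, tol=1e-10, eps=1e-12):
    """Fixed-point iteration for the stationarity condition of the
    generalized sample mean (rows of X are samples, 0 < p <= 1)."""
    m = X.mean(axis=0)                       # start from the ordinary sample mean
    for _ in range(n_iter):
        d2 = np.sum((X - m) ** 2, axis=1)    # squared distances to the current center
        w = np.maximum(d2, eps) ** (p - 1.0) # weights alpha_i; far points get small weight if p < 1
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < tol:
            m = m_new
            break
        m = m_new
    return m

rng = np.random.default_rng(0)
inliers = rng.normal(size=(50, 2))
X = np.vstack([inliers, [[50.0, 50.0]]])     # one gross outlier (toy data)
m_S = X.mean(axis=0)
m_G = generalized_sample_mean(X, p=0.3)
```

With $p=1$ all weights equal one and the iteration returns the ordinary sample mean after a single step; with $p<1$ the outlier receives a much smaller weight than the inliers.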

Principal Component Analysis | Principal Component Analysis Using Generalized Mean

For a projected sample $\mathbf{W}^{T} \mathbf{x}$, the squared reconstruction error $e(\mathbf{W})$ can be computed as
$$
e(\mathbf{W})=\tilde{\mathbf{x}}^{T} \tilde{\mathbf{x}}-\tilde{\mathbf{x}}^{T} \mathbf{W} \mathbf{W}^{T} \tilde{\mathbf{x}}
$$
where $\tilde{\mathbf{x}}=\mathbf{x}-\mathbf{m}$. We use the generalized sample mean $\mathbf{m}_{G}$ for $\mathbf{m}$. To prevent outliers corresponding to large $e(\mathbf{W})$ from dominating the objective function, we propose to minimize the following objective function:
$$
J_{G}(\mathbf{W})=\left(\frac{1}{N} \sum_{i=1}^{N}\left[e_{i}(\mathbf{W})\right]^{p}\right)^{1 / p}
$$
where $e_{i}(\mathbf{W})=\tilde{\mathbf{x}}_{i}^{T} \tilde{\mathbf{x}}_{i}-\tilde{\mathbf{x}}_{i}^{T} \mathbf{W} \mathbf{W}^{T} \tilde{\mathbf{x}}_{i}$ is the squared reconstruction error of $\mathbf{x}_{i}$ with respect to $\mathbf{W}$. Note that $J_{G}(\mathbf{W})$ is formulated by replacing the arithmetic mean in $J_{L_{2}}(\mathbf{W})$ with the generalized mean while keeping the Euclidean distance, and it is equivalent to $J_{L_{2}}(\mathbf{W})$ if $p=1$. The negative effect of outliers is suppressed in the same way as in (8). Also, the solution that minimizes $J_{G}(\mathbf{W})$ is rotationally invariant because each $e_{i}(\mathbf{W})$ is measured using the Euclidean distance. To obtain $\mathbf{W}_{G}$, we develop an iterative optimization method similar to Algorithm 1.
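To make the objective concrete, the sketch below evaluates $J_{G}(\mathbf{W})$ for a given orthonormal $\mathbf{W}$ and checks that $p=1$ reproduces the mean squared reconstruction error. It only evaluates the objective and does not implement the iterative optimization mentioned in the text; the function names and the random test data are assumptions.

```python
import numpy as np

def reconstruction_errors(X, W, m):
    """e_i(W) = x~_i^T x~_i - x~_i^T W W^T x~_i for each row x_i of X."""
    Xt = X - m
    proj = Xt @ W                      # coordinates W^T x~_i, one row per sample
    e = np.sum(Xt ** 2, axis=1) - np.sum(proj ** 2, axis=1)
    return np.maximum(e, 0.0)          # guard against tiny negative rounding errors

def J_G(X, W, m, p):
    """Generalized mean (power p) of the squared reconstruction errors."""
    e = reconstruction_errors(X, W, m)
    return np.mean(e ** p) ** (1.0 / p)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
m = X.mean(axis=0)
W, _ = np.linalg.qr(rng.normal(size=(4, 2)))   # random matrix with orthonormal columns
value_p1 = J_G(X, W, m, 1.0)
value_p05 = J_G(X, W, m, 0.5)
```

For a fixed $\mathbf{W}$ the value with $p=0.5$ is never larger than the value with $p=1$, since the generalized mean is non-decreasing in $p$.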



Principal Component Analysis | MATH5855

Posted on 14 June 2022 by statistics-lab


Principal Component Analysis | Missing Not at Random

A number of statistical approaches have been developed to cope with nonignorable missing mechanisms and such approaches have been applied in different analytic frameworks. However, the approaches used in PCA have typically focused on assumptions of ignorability $[19,21,42]$. Recently, Geraci and Farcomeni [11] extended Tipping and Bishop’s [42] EM approach to the case in which the vector $\mathbf{y}$ is partially observed and the missing data mechanism is nonignorable. Specifically, they proposed an adaptation of Ibrahim et al.’s [17] methods for missing responses in random-effects models with non-monotone patterns of missing data.

Suppose that $\mathbf{y}_{i}$ contains $s_{i}$, $s_{i}<p$, missing values. The $i$th contribution to the complete data density of $\left(\mathbf{y}_{i}, \mathbf{u}_{i}, \mathbf{m}_{i}\right)$ is given by
$$
f\left(\mathbf{y}_{i}, \mathbf{u}_{i}, \mathbf{m}_{i} \mid \boldsymbol{\theta}, \eta\right)=f\left(\mathbf{y}_{i} \mid \mathbf{u}_{i}, \boldsymbol{\theta}\right) f\left(\mathbf{u}_{i}\right) f\left(\mathbf{m}_{i} \mid \mathbf{y}_{i}, \eta\right), \quad i=1, \ldots, n, \tag{10}
$$
where the additional factor $f\left(\mathbf{m}_{i} \mid \mathbf{y}_{i}, \eta\right)$, indexed by the parameter $\eta$, is the missing data model (MDM), which we assume to be independent of $\mathbf{u}_{i}$. This assumption simplifies the subsequent steps of the estimation algorithm, although it can be relaxed at the cost of increased computational time (see [17, 18] for a discussion).

Estimation of $\boldsymbol{\theta}$ would in general require marginalizing the log-likelihood based on (10) over the unobserved data, which however leads to a rather intractable integral of dimension $s_{i}+q$. Instead, the EM algorithm can be applied. The E-step at the $(t+1)$ th iteration is defined as follows:
$$
Q\left(\lambda \mid \lambda^{(t)}\right)=\mathrm{E}_{\mathbf{z}, \mathbf{u} \mid \mathbf{x}, \mathbf{m}, \lambda^{(t)}}\{l(\lambda ; \mathbf{Y}, \mathbf{U}, \mathbf{M})\} \tag{11}
$$
with $\lambda=(\boldsymbol{\theta}, \eta)$ and $l(\lambda ; \mathbf{Y}, \mathbf{U}, \mathbf{M})=\sum_{i=1}^{n}\left\{\log f\left(\mathbf{y}_{i} \mid \mathbf{u}_{i}, \boldsymbol{\theta}\right)+\log f\left(\mathbf{u}_{i}\right)+\log f\left(\mathbf{m}_{i} \mid \mathbf{y}_{i}, \eta\right)\right\}$, and where the expectation is taken with respect to the conditional distribution of $\mathbf{z}_{i}$ and $\mathbf{u}_{i}$, given the observed data, evaluated at $\lambda^{(t)}$.

The E-step (11), however, does not yet offer a computational advantage since it is not easy to solve analytically. Therefore, Geraci and Farcomeni [11] applied a Monte Carlo E-step [17]. They considered an adaptive rejection Metropolis sampling (ARMS) algorithm [12] and specified the following MDM

$$
f\left(\mathbf{m}_{i} \mid \mathbf{y}_{i}, \eta\right)=\prod_{j=1}^{p} \pi_{i j}^{m_{i j}}\left(1-\pi_{i j}\right)^{1-m_{i j}}
$$
where $\pi_{i j}$ is the probability that $y_{i j}$ is missing, conditional on the response $\mathbf{y}_{i}$. The ARMS algorithm is convenient as, basically, no tuning is needed. In addition, it is run in parallel for each row of the data matrix, which therefore greatly speeds up the computation. Moreover, the PPCA Gaussian model provides scope for further reductions in computational time during the calculation of the E-step. All the technical details are given in the appendix.
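The Bernoulli form of the MDM can be evaluated directly once the probabilities $\pi_{i j}$ are available. The sketch below computes $\log f\left(\mathbf{m}_{i} \mid \mathbf{y}_{i}, \eta\right)$ under an assumed logistic link $\pi_{i j}=1 /\left(1+\exp \left\{-\left(\eta_{0}+\eta_{1} y_{i j}\right)\right\}\right)$. The text does not specify the link, so this choice, the function name and the parameter names `eta0`, `eta1` are illustrative assumptions.

```python
import numpy as np

def mdm_log_density(m_i, y_i, eta0, eta1):
    """log f(m_i | y_i, eta) for independent Bernoulli missingness indicators.

    Assumption: logistic link pi_ij = expit(eta0 + eta1 * y_ij).
    In an E-step the unobserved entries of y_i would be imputed draws.
    """
    m_i = np.asarray(m_i, dtype=float)
    y_i = np.asarray(y_i, dtype=float)
    pi = 1.0 / (1.0 + np.exp(-(eta0 + eta1 * y_i)))
    return float(np.sum(m_i * np.log(pi) + (1.0 - m_i) * np.log1p(-pi)))

# With eta1 = 0 the mechanism ignores y and every pi_ij equals 0.5
log_f_ignorable = mdm_log_density([1, 0, 0], [0.3, -1.0, 2.0], eta0=0.0, eta1=0.0)
```

A nonzero `eta1` is what makes the mechanism nonignorable in this sketch: the probability that an entry is missing then depends on the value of that entry.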

Principal Component Analysis | Appendix – EM Algorithm for PPCA with MNAR Values

In this appendix, we provide additional details on the Monte Carlo EM algorithm introduced in Sect. 3.2 and we derive a simplified E-step where the random effects are integrated out from the complete data log-likelihood.

The Monte Carlo E-step requires sampling from $f\left(\mathbf{z}_{i}, \mathbf{u}_{i} \mid \mathbf{x}_{i}, \mathbf{m}_{i}, \lambda^{(t)}\right)$. This task can be carried out efficiently via ARMS [12] using the full conditionals
$$
\begin{aligned}
&f\left(\mathbf{z}_{i} \mid \mathbf{x}_{i}, \mathbf{u}_{i}, \mathbf{m}_{i}, \lambda^{(t)}\right) \propto f\left(\mathbf{y}_{i} \mid \mathbf{u}_{i}, \lambda^{(t)}\right) f\left(\mathbf{m}_{i} \mid \mathbf{y}_{i}, \lambda^{(t)}\right) \\
&f\left(\mathbf{u}_{i} \mid \mathbf{x}_{i}, \mathbf{z}_{i}, \mathbf{m}_{i}, \lambda^{(t)}\right) \propto f\left(\mathbf{y}_{i} \mid \mathbf{u}_{i}, \lambda^{(t)}\right) f\left(\mathbf{u}_{i}\right)
\end{aligned}
$$
An implementation of ARMS is available in the R package HI [35]. A sample $\boldsymbol{\xi}_{i 1}, \ldots, \boldsymbol{\xi}_{i K}$ for $i=1, \ldots, n$ is obtained at each EM iteration $t$, where the $\left(s_{i}+q\right) \times 1$ vector $\boldsymbol{\xi}_{i k}=\left(\tilde{\mathbf{z}}_{i k}, \tilde{\mathbf{u}}_{i k}\right)$, $k=1, \ldots, K$, contains 'imputed' values for $\mathbf{z}_{i}$ and $\mathbf{u}_{i}$ (with the understanding that $\boldsymbol{\xi}_{i k}=\tilde{\mathbf{u}}_{i k}$ if $s_{i}=0$). Here the Monte Carlo sample size $K$ is kept constant throughout. Alternative strategies with varying $K^{(t)}$ that may increase the speed or the accuracy of the EM algorithm can be considered [2, 17]. The E-step (11) is approximated by
$$
Q\left(\lambda \mid \lambda^{(t)}\right)=\frac{1}{K} \sum_{i=1}^{n} \sum_{k=1}^{K} l\left(\lambda ; \boldsymbol{\xi}_{i k}, \mathbf{x}_{i}, \mathbf{m}_{i}\right) \tag{15}
$$
The maximization of (15) with respect to $\lambda$ is straightforward. Define $\tilde{\mathbf{y}}_{i k}=\left(\tilde{\mathbf{z}}_{i k}, \mathbf{x}_{i}\right)$ if $s_{i}>0$ or $\tilde{\mathbf{y}}_{i k}=\mathbf{y}_{i}$ if $s_{i}=0$, $i=1, \ldots, n$, $k=1, \ldots, K$. The maximum likelihood solution of the M-step at the $(t+1)$th iteration is given by
$$
\begin{aligned}
\hat{\boldsymbol{\mu}}^{(t+1)} &=\frac{1}{n K} \sum_{i=1}^{n} \sum_{k=1}^{K}\left(\tilde{\mathbf{y}}_{i k}-\hat{\mathbf{W}}^{(t)} \tilde{\mathbf{u}}_{i k}\right) \\
\hat{\mathbf{W}}^{(t+1)} &=\left\{\sum_{i=1}^{n} \sum_{k=1}^{K}\left(\tilde{\mathbf{y}}_{i k}-\hat{\boldsymbol{\mu}}^{(t+1)}\right) \tilde{\mathbf{u}}_{i k}^{\top}\right\}\left(\sum_{i=1}^{n} \sum_{k=1}^{K} \tilde{\mathbf{u}}_{i k} \tilde{\mathbf{u}}_{i k}^{\top}\right)^{-1} \\
\hat{\psi}^{(t+1)} &=\frac{1}{n K p} \sum_{i=1}^{n} \sum_{k=1}^{K}\left\|\tilde{\mathbf{y}}_{i k}-\hat{\boldsymbol{\mu}}^{(t+1)}-\hat{\mathbf{W}}^{(t+1)} \tilde{\mathbf{u}}_{i k}\right\|_{2}^{2}
\end{aligned}
$$
Analogously, the MLE of $\eta$ can be easily obtained using standard results for generalized linear models.
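The three M-step updates translate directly into array operations once the completed responses and the imputed scores are stored as arrays. The sketch below assumes arrays `Y_tilde` of shape $(n, K, p)$ and `U_tilde` of shape $(n, K, q)$ produced by some Monte Carlo E-step; the sampler itself (ARMS in the text) is not implemented here, and the function and argument names are hypothetical.

```python
import numpy as np

def m_step(Y_tilde, U_tilde, W_prev):
    """One M-step update (mu, W, psi) from imputed draws.

    Y_tilde: (n, K, p) completed responses, U_tilde: (n, K, q) imputed scores,
    W_prev:  (p, q) loadings from the previous iteration.
    """
    n, K, p = Y_tilde.shape
    Yf = Y_tilde.reshape(n * K, p)               # stack all (i, k) pairs as rows
    Uf = U_tilde.reshape(n * K, -1)
    mu = (Yf - Uf @ W_prev.T).mean(axis=0)       # update of mu uses the previous W
    A = Uf.T @ (Yf - mu)                         # q x p, transpose of sum (y - mu) u^T
    B = Uf.T @ Uf                                # q x q, sum u u^T
    W = np.linalg.solve(B, A).T                  # equals {sum (y - mu) u^T} B^{-1}
    resid = Yf - mu - Uf @ W.T
    psi = np.sum(resid ** 2) / (n * K * p)
    return mu, W, psi
```

As a sanity check, feeding noise-free draws $\tilde{\mathbf{y}}_{i k}=\boldsymbol{\mu}+\mathbf{W} \tilde{\mathbf{u}}_{i k}$ together with the true $\mathbf{W}$ as `W_prev` should return $\boldsymbol{\mu}$ and $\mathbf{W}$ unchanged and a value of $\psi$ that is numerically zero.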

Principal Component Analysis | PCA and Robust PCAs

Let us consider a training set of $N$ $n$-dimensional samples $\left\{\mathbf{x}_{i}\right\}_{i=1}^{N}$. Assuming that the samples have zero mean, PCA finds an orthonormal projection matrix $\mathbf{W} \in \mathbb{R}^{n \times m}$ ($m \ll n$) by which the projected samples $\left\{\mathbf{y}_{i}=\mathbf{W}^{T} \mathbf{x}_{i}\right\}_{i=1}^{N}$ have the maximum variance in the reduced space. It is formulated as the following:
$$
\mathbf{W}_{PCA}=\underset{\mathbf{W}}{\arg \max } \operatorname{tr}\left(\mathbf{W}^{T} \mathbf{S} \mathbf{W}\right)
$$
where $\mathbf{S}=\frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_{i} \mathbf{x}_{i}^{T}$ is the sample covariance matrix and $\operatorname{tr}(\mathbf{A})$ is the trace of a square matrix $\mathbf{A}$. The projection matrix $\mathbf{W}_{PCA}$ can also be found from the viewpoint of projection errors, i.e., it minimizes the average of the squared projection errors or reconstruction errors. Mathematically, it is represented as the optimization problem minimizing the following cost function:
$$
J_{L_{2}}(\mathbf{W})=\frac{1}{N} \sum_{i=1}^{N}\left\|\mathbf{x}_{i}-\mathbf{W} \mathbf{W}^{T} \mathbf{x}_{i}\right\|_{2}^{2}
$$
where $\|\mathbf{x}\|_{2}$ is the $L_{2}$-norm of a vector $\mathbf{x}$. The two optimization problems are equivalent and easily solved by obtaining the $m$ eigenvectors associated with the $m$ largest eigenvalues of $\mathbf{S}$. Although PCA is simple and powerful, it is sensitive to outliers [8, 13] because $J_{L_{2}}(\mathbf{W})$ is based on the mean squared reconstruction error. To learn a subspace robust to outliers, Ke and Kanade [13] proposed to minimize an $L_{1}$-norm based objective function as follows:
$$
J_{L_{1}}(\mathbf{W})=\frac{1}{N} \sum_{i=1}^{N}\left\|\mathbf{x}_{i}-\mathbf{W} \mathbf{W}^{T} \mathbf{x}_{i}\right\|_{1}
$$
where $\|\mathbf{x}\|_{1}$ is the $L_{1}$-norm of a vector $\mathbf{x}$. They also present an iterative method to obtain the solution for minimizing $J_{L_{1}}(\mathbf{W})$.
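The equivalence of the two $L_{2}$ formulations follows from $J_{L_{2}}(\mathbf{W})=\operatorname{tr}(\mathbf{S})-\operatorname{tr}\left(\mathbf{W}^{T} \mathbf{S} \mathbf{W}\right)$ for orthonormal $\mathbf{W}$, so maximizing the projected variance minimizes the reconstruction error. The sketch below checks this numerically with samples stored as rows; it only evaluates $J_{L_{1}}$ and does not implement the iterative $L_{1}$ method of Ke and Kanade. Function names and test data are assumptions.

```python
import numpy as np

def pca_projection(X, m):
    """Top-m eigenvectors of S = (1/N) sum x_i x_i^T (rows of X are zero-mean samples)."""
    S = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :m]            # columns for the m largest eigenvalues
    return W, S

def J_L2(X, W):
    """Mean squared (L2) reconstruction error."""
    R = X - X @ W @ W.T
    return np.mean(np.sum(R ** 2, axis=1))

def J_L1(X, W):
    """Mean L1-norm reconstruction error."""
    R = X - X @ W @ W.T
    return np.mean(np.sum(np.abs(R), axis=1))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])
X -= X.mean(axis=0)                        # enforce the zero-mean assumption
W, S = pca_projection(X, m=2)
Q, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # some other orthonormal basis for comparison
```

Any other orthonormal basis, such as `Q` here, gives an $L_{2}$ reconstruction error at least as large as that of the eigenvector solution.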



Principal Component Analysis | ENVX2001

Posted on 14 June 2022 by statistics-lab


Principal Component Analysis | Missing Not at Random

When the distribution of the missing data indicator depends on the unobserved data, after conditioning on the observed data, i.e. when
$$
\operatorname{Pr}\left(\mathbf{m}_{i} \mid \mathbf{z}_{i}, \mathbf{x}_{i}\right) \neq \operatorname{Pr}\left(\mathbf{m}_{i} \mid \mathbf{x}_{i}\right)
$$
then the missing data are said to be missing not at random (MNAR). This setting is the most general of all.

In the MNAR scenario the missingness indicator is assumed to be related to unmeasured predictors and/or the unobserved response, even conditionally on the observed data. MNAR data are also referred to as informative since the missing values contain information about the MNAR mechanism itself.

It should be stressed that, if the true mechanism is MNAR, simple complete-case (CC) analyses or naïve imputation methods inevitably produce biased results. The extent and direction of this bias are unpredictable, and even relatively small fractions of missing values might lead to a large bias.
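A small simulation illustrates the bias of a complete-case mean under MNAR. This is a toy sketch: the standard normal variable, the sample size and the logistic missingness mechanism are assumptions chosen for the example, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
y = rng.normal(size=n)                    # true mean is 0

# Missing completely at random: every value is missing with probability 0.5
miss_mcar = rng.random(n) < 0.5
# MNAR (assumed logistic mechanism): larger values of y are more likely to be missing
miss_mnar = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * y))

cc_mean_mcar = y[~miss_mcar].mean()       # complete-case mean, roughly unbiased
cc_mean_mnar = y[~miss_mnar].mean()       # complete-case mean, biased downwards
```

About half of the values are missing in both scenarios, yet only the mechanism that depends on the unobserved value itself shifts the complete-case mean away from zero.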

There are different approaches to the treatment of this kind of missing data. Model-based procedures are most commonly adopted. These aim at modeling the joint distribution of the measurement process and the dropout process, by specifying a missing data model (MDM). The MDM must take into account the residual dependence between the missingness indicator and the unobserved response. Below, we summarize the three main approaches.

Pattern-mixture models. The joint distribution of $\mathbf{y}_{i}$ and $\mathbf{m}_{i}$ is factorized as
$$
\operatorname{Pr}\left(\mathbf{y}_{i}, \mathbf{m}_{i}\right)=\operatorname{Pr}\left(\mathbf{y}_{i} \mid \mathbf{m}_{i}\right) \operatorname{Pr}\left(\mathbf{m}_{i}\right)
$$
This approach involves formulating separate submodels $\operatorname{Pr}\left(\mathbf{y}_{i} \mid \mathbf{m}_{i}\right)$ for each possible configuration of $\mathbf{m}_{i}$, or, at least, for each observed configuration. This is appealing for studies where the main objective is to compare the response distribution in subgroups with possibly different missing value patterns. On the other hand, its specification can be cumbersome, while its interpretation at the population level may become difficult.
Selection models. The joint distribution of $\mathbf{y}_{i}$ and $\mathbf{m}_{i}$ is factorized as
$$
\operatorname{Pr}\left(\mathbf{y}_{i}, \mathbf{m}_{i}\right)=\operatorname{Pr}\left(\mathbf{m}_{i} \mid \mathbf{y}_{i}\right) \operatorname{Pr}\left(\mathbf{y}_{i}\right)
$$
This approach involves an explicit model to handle the distribution of the missing data process given the measurement mechanism. If correctly specified, the model for $\mathbf{y}_{i}$ is estimated without bias and its interpretation is not compromised.

Principal Component Analysis | Methods to Handle Missing Data in Principal

In this section, we discuss selected missing data methods in PCA, some of which are relatively recent at the time of writing. PCA, originally introduced by Karl Pearson [34], is arguably one of the most popular multivariate analysis techniques. It is often described as a tool for dimensionality reduction. Some authors consider PCA as a descriptive method, which need not be based on distributional assumptions, whereas others provide probabilistic justifications in relation to sampling errors (see for example [20] or [21] for alternative interpretations of the PCA model). It is not our purpose to get embroiled in this discussion; here we take a probabilistic view as it is an essential framework for a statistical treatment of the missing data problem.
There are basically two main approaches where sampling comes into play: fixed and random effects PCA. (The random-effects approach can be formulated in either a frequentist or a Bayesian framework. We focus on the former, while more details on the latter can be found in [21]). In the fixed-effects approach, individuals are of direct interest. Therefore, individual-specific scores are parameters to be estimated. In symbols, the fixed-effects PCA model is given by
$$
\mathbf{y}_{i}=\boldsymbol{\mu}+\mathbf{W} \mathbf{b}_{i}+\boldsymbol{\varepsilon}_{i}, \quad i=1, \ldots, n \tag{6}
$$
where $\boldsymbol{\mu}=\left(\mu_{1}, \mu_{2}, \ldots, \mu_{p}\right)^{\top}$ is the vector with the mean of each variable, $\mathbf{b}_{i}=\left(b_{i 1}, b_{i 2}, \ldots, b_{i q}\right)^{\top}$ is the $i$th row vector of fixed scores and $\mathbf{W}$ is a $p \times q$ matrix of unknown loadings with elements $w_{j h}$, $j=1, \ldots, p$, $h=1, \ldots, q$, with $q \leq p$. The error is assumed to be zero-centered Gaussian with homoscedastic variance, $\boldsymbol{\varepsilon}_{i} \sim N\left(\mathbf{0}, \psi \mathbf{I}_{p}\right)$, where $\psi$ is a positive scalar. The parameter of model (6), $\boldsymbol{\theta}=\left(\boldsymbol{\mu}, \mathbf{W}, \psi, \mathbf{b}_{1}, \mathbf{b}_{2}, \ldots, \mathbf{b}_{n}\right)$, can be estimated via maximum likelihood (or, equivalently, least squares) estimation (MLE). A downside of this approach is that the dimension of $\boldsymbol{\theta}$ increases with the sample size.

The random-effects specification of the probabilistic representation of PCA [38, 42] is given by the model
$$
\mathbf{y}_{i}=\boldsymbol{\mu}+\mathbf{W} \mathbf{u}_{i}+\boldsymbol{\varepsilon}_{i}, \quad i=1, \ldots, n
$$
where $\mathbf{u}_{i}=\left(u_{i 1}, u_{i 2}, \ldots, u_{i q}\right)^{\top}$ is the $i$th row vector of latent scores and $\mathbf{W}$ is, as above, a matrix of unknown loadings. Furthermore, it is assumed that $\mathbf{u}_{i}$ is stochastically independent of $\boldsymbol{\varepsilon}_{i}$. Conventionally, $\mathbf{u}_{i} \sim N\left(\mathbf{0}, \mathbf{I}_{q}\right)$. If in addition the error is assumed to be zero-centered Gaussian with covariance matrix $\boldsymbol{\Psi}$, $\boldsymbol{\varepsilon}_{i} \sim N(\mathbf{0}, \boldsymbol{\Psi})$, we obtain the multivariate normal distribution $\mathbf{y}_{i} \sim N(\boldsymbol{\mu}, \mathbf{C})$, $\mathbf{C}=\mathbf{W} \mathbf{W}^{\top}+\boldsymbol{\Psi}$. We also assume that $\boldsymbol{\Psi}=\psi \mathbf{I}_{p}$, so that the elements of $\mathbf{y}_{i}$ are conditionally independent, given $\mathbf{u}_{i}$. The parameter $\boldsymbol{\mu}=\left(\mu_{1}, \mu_{2}, \ldots, \mu_{p}\right)^{\top}$ allows for a location-shift fixed effect.
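The marginal covariance $\mathbf{C}=\mathbf{W} \mathbf{W}^{\top}+\psi \mathbf{I}_{p}$ implied by the random-effects model can be checked by simulation. The sketch below draws data from the model with arbitrary parameter values (all numerical values are assumptions for illustration) and compares the sample covariance with $\mathbf{C}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 100_000, 4, 2
mu = np.array([1.0, -2.0, 0.5, 3.0])      # location shift (assumed values)
W = rng.normal(size=(p, q))               # loadings (assumed values)
psi = 0.5                                 # homoscedastic error variance

U = rng.normal(size=(n, q))               # latent scores u_i ~ N(0, I_q)
eps = rng.normal(scale=np.sqrt(psi), size=(n, p))
Y = mu + U @ W.T + eps                    # rows are y_i = mu + W u_i + eps_i

C = W @ W.T + psi * np.eye(p)             # model-implied covariance
S = np.cov(Y, rowvar=False)               # sample covariance of the simulated data
```

With a sample of this size the entries of `S` agree with those of `C` up to simulation noise, and the column means of `Y` are close to `mu`.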

统计代写|主成分分析代写Principal Component Analysis代考|Multiple Imputation

As mentioned before, single imputation methods treat imputed missing values as fixed (known). This means that the uncertainty related to the missing values is ignored, which generally leads to deflated standard errors. Josse et al. [23] proposed to deal with this issue by first performing a residual bootstrap procedure to obtain $B$ estimates of the PPCA parameters and then generate $B$ data matrices, each completed with samples from the predictive distribution of the missing values conditional on the observed values and the corresponding bootstrapped parameter set. More formally, one can proceed as follows:

1. obtain an initial estimate $\hat{\boldsymbol{\mu}}, \hat{\mathbf{W}}, \hat{\mathbf{b}}_{i}, i=1, \ldots, n$, of the parameters in model (6) (e.g., via EM-PCA estimation). Reconstruct the data $\hat{\mathbf{Y}}$ with the first $q$ dimensions and calculate $\mathbf{R}=\mathbf{Y}-\hat{\mathbf{Y}}$, where the $n \times p$ matrix $\mathbf{R}$ of residuals has missing entries corresponding to those of $\mathbf{Y}$;
2. draw $B$ random samples from the non-missing rows of $\mathbf{R}$. Denote each replicate by $\mathbf{R}_{b}^{*}, b=1, \ldots, B$;
3. calculate $\mathbf{Y}_{b}^{*}=\hat{\mathbf{Y}}+\mathbf{R}_{b}^{*}, b=1, \ldots, B$, and estimate the PPCA parameters $\hat{\boldsymbol{\mu}}_{b}^{*}, \hat{\mathbf{W}}_{b}^{*}, \hat{\mathbf{b}}_{b, i}^{*}, i=1, \ldots, n$, from $\mathbf{Y}_{b}^{*}, b=1, \ldots, B$;
4. for $\left\{i: s_{i}>0\right\}$, calculate $\mathbf{y}_{b, i}^{*}=\hat{\boldsymbol{\mu}}_{b}^{*}+\hat{\mathbf{W}}_{b}^{*} \hat{\mathbf{b}}_{b, i}^{*}+\mathbf{r}^{*}$, where $\mathbf{r}^{*}$ is a newly sampled residual from $\mathbf{R}$. Complete the vector $\mathbf{y}_{i}$ with $\mathbf{y}_{b, i}^{*}$ to obtain the $b$th complete data matrix $\mathbf{Y}_{b}, b=1, \ldots, B$.
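
The four steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation of Josse et al. [23] and not the R code referred to in this text. Missing values are assumed to be coded as NaN. A plain least-squares PCA computed by SVD stands in for the PPCA fit, a simple iterative SVD imputation stands in for EM-PCA, and residuals are resampled entrywise with replacement. The function names `pca_fit`, `em_pca_reconstruct` and `mi_pca` are hypothetical.

```python
import numpy as np


def pca_fit(Y, q):
    """Rank-q least-squares PCA of a complete matrix.

    Returns the column means, the p x q loadings and the n x q scores.
    """
    mu = Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)
    scores = U[:, :q] * s[:q]
    W = Vt[:q].T
    return mu, W, scores


def em_pca_reconstruct(Y, q, n_iter=500, tol=1e-8):
    """Step 1 (assumed stand-in for EM-PCA): iterative SVD imputation.

    Returns the rank-q reconstruction Y_hat of the data.
    """
    miss = np.isnan(Y)
    # Start from column means of the observed values.
    filled = np.where(miss, np.nanmean(Y, axis=0), Y)
    for _ in range(n_iter):
        mu, W, scores = pca_fit(filled, q)
        Y_hat = mu + scores @ W.T
        # Only the missing cells are updated; observed cells stay fixed.
        updated = np.where(miss, Y_hat, Y)
        converged = np.max(np.abs(updated - filled)) < tol
        filled = updated
        if converged:
            break
    return Y_hat


def mi_pca(Y, q, B, rng):
    """Steps 1 to 4: return a list of B completed data matrices."""
    miss = np.isnan(Y)
    Y_hat = em_pca_reconstruct(Y, q)
    R = Y - Y_hat                 # NaN wherever Y is missing
    pool = R[~miss]               # observed residuals only
    completed = []
    for _ in range(B):
        # Step 2: bootstrap replicate of the residual matrix.
        R_b = rng.choice(pool, size=Y.shape, replace=True)
        # Step 3: refit on the bootstrapped data Y_hat + R_b.
        mu_b, W_b, scores_b = pca_fit(Y_hat + R_b, q)
        # Step 4: predictive draw = fitted value + newly sampled residual.
        r_new = rng.choice(pool, size=Y.shape, replace=True)
        Y_star = mu_b + scores_b @ W_b.T + r_new
        completed.append(np.where(miss, Y_star, Y))
    return completed
```

A call such as `mi_pca(Y, q=2, B=100, rng=np.random.default_rng(1))` returns 100 completed matrices in which the observed cells are unchanged and only the missing cells vary across replicates.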

There are several ways to obtain bootstrapped residuals. One approach is to draw a sample with replacement from the entries of the matrix $\mathbf{R}$. Another approach, recommended by [21], is to sample the residuals from a zero-centered Gaussian with variance estimated from the non-missing entries of $\mathbf{R}$. Improved results might be obtained with corrected residuals (e.g., leave-one-out residuals).
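
The two residual-generating schemes just described can be compared in a small sketch that uses only the Python standard library. It assumes the residual matrix is stored as a list of rows with NaN marking missing entries. The variance is estimated here as the mean of squared observed residuals, with no mean subtraction and no degrees-of-freedom correction, which is an assumption of this sketch and not a choice taken from [21]. All function names are hypothetical.

```python
import math
import random


def observed_residuals(R):
    """Flatten the residual matrix, skipping missing (NaN) entries."""
    return [r for row in R for r in row if not math.isnan(r)]


def residual_sd(R):
    """Standard deviation of a zero-centred residual distribution.

    Assumption: mean of squares, no degrees-of-freedom correction.
    """
    pool = observed_residuals(R)
    return math.sqrt(sum(r * r for r in pool) / len(pool))


def resample_entries(R, n, p, rng):
    """Scheme 1: draw an n x p matrix with replacement from observed entries."""
    pool = observed_residuals(R)
    return [[rng.choice(pool) for _ in range(p)] for _ in range(n)]


def gaussian_residuals(R, n, p, rng):
    """Scheme 2: draw an n x p matrix from N(0, sigma^2), sigma from R."""
    sigma = residual_sd(R)
    return [[rng.gauss(0.0, sigma) for _ in range(p)] for _ in range(n)]
```

The first scheme reuses only residual values that were actually observed, whereas the second can produce values outside the observed range but relies on the Gaussian assumption.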

Once $B$ complete data matrices have been generated, the simplest analytic approach is to carry out a PPCA on each $\mathbf{Y}_{b}$ and then average the $B$ sets of parameter estimates. A multiple imputation PPCA of the Orange data set is given in R code 3.4. Our example is based on $B=100$ replicates, with $q=2$. The individuals and variables maps obtained from the average scores and loadings are plotted in Fig. 2. Compared with the complete-case PCA and the single imputation EM-PCA (Table 1), the multiple imputation PCA produced noticeably different estimates of $\hat{\mathbf{W}}$, especially for the second principal axis (R code 3.4). The uncertainty due to the missing values is also shown in Fig. 2. For example, the scores of individuals 1, 9, and 10 showed more total variability, as given by the area of the ellipses, and more variability along the second axis, as reflected in the eccentricity of the ellipses. Among the variables, the uncertainty was greater for sweetness, color intensity, and bitterness and, as with the individual scores, it was more prominent along the second axis.

