STATS 7062 - 统计代写答疑辅导

标签： STATS 7062

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|MAST90085

Posted on 24 June 2022 by statistics-lab

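Multivariate statistical analysis is regarded as a useful tool for assessing the significance of geochemical anomalies with respect to any individual variable and to the interactions between variables.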


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Transformations

Suppose that $X$ has pdf $f_{X}(x)$. What is the pdf of $Y=3 X$ ? Or if $X=$ $\left(X_{1}, X_{2}, X_{3}\right)^{\top}$, what is the pdf of
$$
Y=\left(\begin{array}{c}
3 X_{1} \\
X_{1}-4 X_{2} \\
X_{3}
\end{array}\right) ?
$$
This is a special case of asking for the pdf of $Y$ when
$$
X=u(Y)
$$
for a one-to-one transformation $u: \mathbb{R}^{p} \rightarrow \mathbb{R}^{p}$. Define the Jacobian of $u$ as
$$
\mathcal{J}=\left(\frac{\partial x_{i}}{\partial y_{j}}\right)=\left(\frac{\partial u_{i}(y)}{\partial y_{j}}\right)
$$
and let $\operatorname{abs}(|\mathcal{J}|)$ be the absolute value of the determinant of this Jacobian. The pdf of $Y$ is given by
$$
f_{Y}(y)=\operatorname{abs}(|\mathcal{J}|) \cdot f_{X}\{u(y)\}
$$
Using this we can answer the introductory questions, namely
$$
\left(x_{1}, \ldots, x_{p}\right)^{\top}=u\left(y_{1}, \ldots, y_{p}\right)=\frac{1}{3}\left(y_{1}, \ldots, y_{p}\right)^{\top}
$$
with
$$
\mathcal{J}=\left(\begin{array}{ccc}
\frac{1}{3} & & 0 \\
& \ddots & \\
0 & & \frac{1}{3}
\end{array}\right)
$$
and hence $\operatorname{abs}(|\mathcal{J}|)=\left(\frac{1}{3}\right)^{p}$. So the pdf of $Y$ is $\frac{1}{3^{p}} f_{X}\left(\frac{y}{3}\right)$.
This introductory example is a special case of
$$
Y=\mathcal{A} X+b \text {, where } \mathcal{A} \text { is nonsingular. }
$$
The inverse transformation is
$$
X=\mathcal{A}^{-1}(Y-b)
$$
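Since the Jacobian of the inverse map $X=\mathcal{A}^{-1}(Y-b)$ is $\mathcal{J}=\mathcal{A}^{-1}$, the general rule above gives $f_{Y}(y)=\operatorname{abs}(|\mathcal{A}|^{-1}) f_{X}\{\mathcal{A}^{-1}(y-b)\}$. The following is a minimal numerical sketch of that formula for the three-dimensional introductory example. It assumes, purely for illustration, that $X$ is standard normal (the text does not fix a distribution for $X$), so that the result can be compared with the known density of $N_{3}(b, \mathcal{A}\mathcal{A}^{\top})$; the function names are hypothetical.

```python
import numpy as np

def std_normal_pdf(x):
    """Density of N_p(0, I_p) at the point x (assumed distribution of X)."""
    p = x.shape[0]
    return (2 * np.pi) ** (-p / 2) * np.exp(-0.5 * x @ x)

def pdf_of_linear_transform(f_x, A, b, y):
    """f_Y(y) = abs(|A^{-1}|) * f_X(A^{-1}(y - b)) for Y = A X + b."""
    A_inv = np.linalg.inv(A)
    return abs(np.linalg.det(A_inv)) * f_x(A_inv @ (y - b))

# Y = (3 X1, X1 - 4 X2, X3)^T written as Y = A X + b with b = 0
A = np.array([[3.0, 0.0, 0.0],
              [1.0, -4.0, 0.0],
              [0.0, 0.0, 1.0]])
b = np.zeros(3)
y = np.array([0.3, -1.2, 0.5])  # arbitrary evaluation point

f_y = pdf_of_linear_transform(std_normal_pdf, A, b, y)

# Cross-check: for standard normal X, Y ~ N(b, A A^T)
cov = A @ A.T
d = y - b
closed_form = (np.linalg.det(2 * np.pi * cov) ** -0.5
               * np.exp(-0.5 * d @ np.linalg.solve(cov, d)))

print(f_y, closed_form)
```

The two printed values should agree up to floating-point error, because $|2\pi \mathcal{A}\mathcal{A}^{\top}|^{-1/2}=(2\pi)^{-p/2}\operatorname{abs}(|\mathcal{A}|)^{-1}$.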

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|The Multinormal Distribution

The multinormal distribution with mean $\mu$ and covariance $\Sigma>0$ has the density
$$
f(x)=|2 \pi \Sigma|^{-1 / 2} \exp \left\{-\frac{1}{2}(x-\mu)^{\top} \Sigma^{-1}(x-\mu)\right\}
$$
We write $X \sim N_{p}(\mu, \Sigma)$.
How is this multinormal distribution with mean $\mu$ and covariance $\Sigma$ related to the multivariate standard normal $N_{p}\left(0, \mathcal{I}_{p}\right)$ ? Through a linear transformation using the results of Sect. $4.3$, as shown in the next theorem.

Theorem 4.5 Let $X \sim N_{p}(\mu, \Sigma)$ and $Y=\Sigma^{-1 / 2}(X-\mu)$ (Mahalanobis transformation). Then
$$
Y \sim N_{p}\left(0, \mathcal{I}_{p}\right),
$$
i.e. the elements $Y_{j} \in \mathbb{R}$ are independent, one-dimensional $N(0,1)$ variables.
Proof Note that $(X-\mu)^{\top} \Sigma^{-1}(X-\mu)=Y^{\top} Y$. Application of (4.45) gives $\mathcal{J}=$ $\Sigma^{1 / 2}$, hence
$$
f_{Y}(y)=(2 \pi)^{-p / 2} \exp \left(-\frac{1}{2} y^{\top} y\right)
$$
which is by $(4.47)$ the pdf of a $N_{p}\left(0, \mathcal{I}_{p}\right)$.
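The Mahalanobis transformation of Theorem 4.5 can be illustrated with a short simulation. This is only a sketch: the values of $\mu$ and $\Sigma$ are made up, and $\Sigma^{-1/2}$ is computed here from the eigendecomposition of $\Sigma$, which is one common choice and not necessarily the one the text has in mind.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (assumed) parameters of N_2(mu, Sigma)
mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# Symmetric inverse square root from Sigma = G diag(l) G^T
eigvals, eigvecs = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

# Draw a sample and apply y_i = Sigma^{-1/2} (x_i - mu) row by row
X = rng.multivariate_normal(mu, Sigma, size=100_000)
Y = (X - mu) @ Sigma_inv_sqrt.T

mean_Y = Y.mean(axis=0)          # should be close to the zero vector
cov_Y = np.cov(Y, rowvar=False)  # should be close to the identity matrix

print(mean_Y)
print(cov_Y)
```

The empirical mean and covariance of the transformed sample should be close to $0$ and $\mathcal{I}_{2}$, in line with the theorem.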

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Sampling Distributions and Limit Theorems

In multivariate statistics, we observe the values of a multivariate random variable $X$ and obtain a sample $\left\{x_{i}\right\}_{i=1}^{n}$, as described in Chap. 3. Under random sampling, these observations are considered to be realisations of a sequence of i.i.d. random variables $X_{1}, \ldots, X_{n}$, where each $X_{i}$ is a $p$-variate random variable which replicates the parent or population random variable $X$. Some notational confusion is hard to avoid: $X_{i}$ is not the $i$th component of $X$, but rather the $i$th replicate of the $p$-variate random variable $X$ which provides the $i$th observation $x_{i}$ of our sample.

For a given random sample $X_{1}, \ldots, X_{n}$, the idea of statistical inference is to analyse the properties of the population variable $X$. This is typically done by analysing some characteristic $\theta$ of its distribution, like the mean, covariance matrix, etc. Statistical inference in a multivariate setup is considered in more detail in Chaps. 6 and 7.

Inference can often be performed using some observable function of the sample $X_{1}, \ldots, X_{n}$, i.e. a statistic. Examples of such statistics were given in Chap. 3: the sample mean $\bar{x}$ and the sample covariance matrix $\mathcal{S}$. To get an idea of the relationship between a statistic and the corresponding population characteristic, one has to derive the sampling distribution of the statistic. The next example gives some insight into the relation of $(\bar{x}, \mathcal{S})$ to $(\mu, \Sigma)$.

Example $4.15$ Consider an iid sample of $n$ random vectors $X_{i} \in \mathbb{R}^{p}$ where $\mathrm{E}\left(X_{i}\right)=\mu$ and $\operatorname{Var}\left(X_{i}\right)=\Sigma$. The sample mean $\bar{x}$ and the covariance matrix $\mathcal{S}$ have already been defined in Sect. 3.3. It is easy to prove the following results:
$$
\begin{aligned}
&\mathrm{E}(\bar{x})=n^{-1} \sum_{i=1}^{n} \mathrm{E}\left(X_{i}\right)=\mu \\
&\operatorname{Var}(\bar{x})=n^{-2} \sum_{i=1}^{n} \operatorname{Var}\left(X_{i}\right)=n^{-1} \Sigma=\mathrm{E}\left(\bar{x} \bar{x}^{\top}\right)-\mu \mu^{\top}
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{E}(\mathcal{S}) &=n^{-1} \mathrm{E}\left\{\sum_{i=1}^{n}\left(X_{i}-\bar{x}\right)\left(X_{i}-\bar{x}\right)^{\top}\right\} \\
&=n^{-1} \mathrm{E}\left\{\sum_{i=1}^{n} X_{i} X_{i}^{\top}-n \bar{x} \bar{x}^{\top}\right\} \\
&=n^{-1}\left\{n\left(\Sigma+\mu \mu^{\top}\right)-n\left(n^{-1} \Sigma+\mu \mu^{\top}\right)\right\} \\
&=\frac{n-1}{n} \Sigma .
\end{aligned}
$$
This shows in particular that $\mathcal{S}$ is a biased estimator of $\Sigma$. By contrast, $\mathcal{S}_{u t}=\frac{n}{n-1} \mathcal{S}$ is an unbiased estimator of $\Sigma$.
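The bias factor $(n-1)/n$ can be made visible with a small Monte Carlo experiment. The sketch below assumes normally distributed data and arbitrary values for $\mu$, $\Sigma$ and $n$; none of these choices come from the text, and the result holds for any distribution with covariance $\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative setup: p = 2, small sample size n = 5
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
n, reps = 5, 20_000

# reps independent samples of size n, shape (reps, n, p)
X = rng.multivariate_normal(mu, Sigma, size=(reps, n))
Xc = X - X.mean(axis=1, keepdims=True)

# S = n^{-1} sum_i (x_i - xbar)(x_i - xbar)^T for every replication
S = np.einsum("rni,rnj->rij", Xc, Xc) / n

S_mean = S.mean(axis=0)              # Monte Carlo estimate of E(S)
S_unbiased_mean = n / (n - 1) * S_mean

print(S_mean)                        # close to (n-1)/n * Sigma
print(S_unbiased_mean)               # close to Sigma
```

With $n=5$ the average of $\mathcal{S}$ should settle near $0.8\,\Sigma$, while rescaling by $n/(n-1)$ recovers $\Sigma$.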

Statistical inference often requires more than just the mean and/or the variance of a statistic. We need the sampling distribution of the statistic to derive confidence intervals or to define rejection regions in hypothesis testing for a given significance level. Theorem 4.9 gives the distribution of the sample mean for a multinormal population.

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|OLET5610


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Distribution and Density Function

Let $X=\left(X_{1}, X_{2}, \ldots, X_{p}\right)^{\top}$ be a random vector. The cumulative distribution function (cdf) of $X$ is defined by
$$
F(x)=\mathrm{P}(X \leq x)=\mathrm{P}\left(X_{1} \leq x_{1}, X_{2} \leq x_{2}, \ldots, X_{p} \leq x_{p}\right)
$$
For continuous $X$, a nonnegative probability density function (pdf) $f$ exists such that
$$
F(x)=\int_{-\infty}^{x} f(u) d u
$$
Note that
$$
\int_{-\infty}^{\infty} f(u) d u=1
$$
Most of the integrals appearing below are multidimensional. For instance, $\int_{-\infty}^{x} f(u) d u$ means $\int_{-\infty}^{x_{p}} \ldots \int_{-\infty}^{x_{1}} f\left(u_{1}, \ldots, u_{p}\right) d u_{1} \ldots d u_{p}$. Note also that the cdf $F$ is differentiable with
$$
f(x)=\frac{\partial^{p} F(x)}{\partial x_{1} \cdots \partial x_{p}}
$$
For discrete $X$, the values of this random variable are concentrated on a countable or finite set of points $\left\{c_{j}\right\}_{j \in J}$; the probability of events of the form $\{X \in D\}$ can then be computed as
$$
\mathrm{P}(X \in D)=\sum_{\left\{j: c_{j} \in D\right\}} \mathrm{P}\left(X=c_{j}\right)
$$
If we partition $X$ as $X=\left(X_{1}, X_{2}\right)^{\top}$ with $X_{1} \in \mathbb{R}^{k}$ and $X_{2} \in \mathbb{R}^{p-k}$, then the function
$$
F_{X_{1}}\left(x_{1}\right)=\mathrm{P}\left(X_{1} \leq x_{1}\right)=F\left(x_{11}, \ldots, x_{1 k}, \infty, \ldots, \infty\right)
$$
is called the marginal cdf. $F=F(x)$ is called the joint cdf. For continuous $X$ the marginal pdf can be computed from the joint density by "integrating out" the variable not of interest:
$$
f_{X_{1}}\left(x_{1}\right)=\int_{-\infty}^{\infty} f\left(x_{1}, x_{2}\right) d x_{2}
$$

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Moments and Characteristic Functions

Moments: Expectation and Covariance Matrix
If $X$ is a random vector with density $f(x)$ then the expectation of $X$ is
$$
\mathrm{E} X=\left(\begin{array}{c}
\mathrm{E} X_{1} \\
\vdots \\
\mathrm{E} X_{p}
\end{array}\right)=\int x f(x) d x=\left(\begin{array}{c}
\int x_{1} f(x) d x \\
\vdots \\
\int x_{p} f(x) d x
\end{array}\right)=\mu
$$

Accordingly, the expectation of a matrix of random elements has to be understood component by component. The operation of forming expectations is linear:
$$
\mathrm{E}(\alpha X+\beta Y)=\alpha \mathrm{E} X+\beta \mathrm{E} Y
$$
If $\mathcal{A}(q \times p)$ is a matrix of real numbers, we have:
$$
\mathrm{E}(\mathcal{A} X)=\mathcal{A} E X
$$
When $X$ and $Y$ are independent,
$$
E\left(X Y^{\top}\right)=E X E Y^{\top}
$$
The matrix
$$
\operatorname{Var}(X)=\Sigma=\mathrm{E}(X-\mu)(X-\mu)^{\top}
$$
is the (theoretical) covariance matrix. We write for a vector $X$ with mean vector $\mu$ and covariance matrix $\Sigma$,
$$
X \sim(\mu, \Sigma)
$$
The $(p \times q)$ matrix
$$
\Sigma_{X Y}=\operatorname{Cov}(X, Y)=\mathrm{E}(X-\mu)(Y-\nu)^{\top}
$$
is the covariance matrix of $X \sim\left(\mu, \Sigma_{X X}\right)$ and $Y \sim\left(\nu, \Sigma_{Y Y}\right)$. Note that $\Sigma_{X Y}=\Sigma_{Y X}^{\top}$ and that $Z=\left(\begin{array}{l}X \\ Y\end{array}\right)$ has covariance $\Sigma_{Z Z}=\left(\begin{array}{ll}\Sigma_{X X} & \Sigma_{X Y} \\ \Sigma_{Y X} & \Sigma_{Y Y}\end{array}\right)$. From
$$
\operatorname{Cov}(X, Y)=\mathrm{E}\left(X Y^{\top}\right)-\mu \nu^{\top}=\mathrm{E}\left(X Y^{\top}\right)-\mathrm{E} X \mathrm{E} Y^{\top}
$$
it follows that $\operatorname{Cov}(X, Y)=0$ in the case where $X$ and $Y$ are independent. We often say that $\mu=\mathrm{E}(X)$ is the first order moment of $X$ and that $\mathrm{E}\left(X X^{\top}\right)$ provides the second order moments of $X$:
$$
\mathrm{E}\left(X X^{\top}\right)=\left\{\mathrm{E}\left(X_{i} X_{j}\right)\right\}, \text { for } i=1, \ldots, p \text { and } j=1, \ldots, p
$$
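The two expressions for the cross-covariance matrix have exact sample analogues, which makes them easy to check numerically. The sketch below uses simulated data with $p=2$ and $q=3$; the data-generating mechanism is an arbitrary assumption chosen only so that $X$ and $Y$ are correlated.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000

# Assumed toy data: X is n x 2, Y is n x 3 and depends linearly on X
X = rng.normal(size=(n, 2))
B = np.array([[1.0, 0.0, 2.0],
              [0.5, -1.0, 0.0]])
Y = X @ B + rng.normal(size=(n, 3))

mean_x, mean_y = X.mean(axis=0), Y.mean(axis=0)

# Sample analogue of E(X - mu)(Y - nu)^T, a (p x q) matrix
cov_centered = (X - mean_x).T @ (Y - mean_y) / n

# Sample analogue of E(X Y^T) - mu nu^T
cov_moments = X.T @ Y / n - np.outer(mean_x, mean_y)

# Sample analogue of Sigma_YX, a (q x p) matrix
cov_yx = (Y - mean_y).T @ (X - mean_x) / n

print(cov_centered.shape)
print(np.max(np.abs(cov_centered - cov_moments)))
```

Both forms agree up to rounding, and transposing the $(q \times p)$ matrix gives back the $(p \times q)$ one, mirroring $\Sigma_{X Y}=\Sigma_{Y X}^{\top}$.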

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Properties of Conditional Expectations

Since $\mathrm{E}\left(X_{2} \mid X_{1}=x_{1}\right)$ is a function of $x_{1}$, say $h\left(x_{1}\right)$, we can define the random variable $h\left(X_{1}\right)=\mathrm{E}\left(X_{2} \mid X_{1}\right)$. The same can be done when defining the random variable $\operatorname{Var}\left(X_{2} \mid X_{1}\right)$. These two random variables share some interesting properties:
$$
\begin{aligned}
\mathrm{E}\left(X_{2}\right) &=\mathrm{E}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\} \\
\operatorname{Var}\left(X_{2}\right) &=\mathrm{E}\left\{\operatorname{Var}\left(X_{2} \mid X_{1}\right)\right\}+\operatorname{Var}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\}
\end{aligned}
$$
Example $4.8$ Consider the following pdf
$$
f\left(x_{1}, x_{2}\right)=2 e^{-\frac{x_{2}}{x_{1}}} ; \quad 0<x_{1}<1, \; x_{2}>0 .
$$
It is easy to show that
$$
\begin{gathered}
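f\left(x_{1}\right)=2 x_{1} \text { for } 0<x_{1}<1 ; \quad f\left(x_{2} \mid x_{1}\right)=\frac{1}{x_{1}} e^{-\frac{x_{2}}{x_{1}}} \text { for } x_{2}>0 ; \quad \mathrm{E}\left(X_{2} \mid X_{1}\right)=X_{1} \text { and } \operatorname{Var}\left(X_{2} \mid X_{1}\right)=X_{1}^{2} .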
\end{gathered}
$$
Without explicitly computing $f\left(x_{2}\right)$, we can obtain:
$$
\begin{aligned}
\mathrm{E}\left(X_{2}\right) &=\mathrm{E}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\}=\mathrm{E}\left(X_{1}\right)=\frac{2}{3} \\
\operatorname{Var}\left(X_{2}\right) &=\mathrm{E}\left\{\operatorname{Var}\left(X_{2} \mid X_{1}\right)\right\}+\operatorname{Var}\left\{\mathrm{E}\left(X_{2} \mid X_{1}\right)\right\} \\
&=\mathrm{E}\left(X_{1}^{2}\right)+\operatorname{Var}\left(X_{1}\right)=\frac{2}{4}+\frac{1}{18}=\frac{10}{18}
\end{aligned}
$$
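These two values can be checked by simulation without ever writing down the marginal density of $X_{2}$. The sketch below relies on two facts implied by the example: the marginal density $2x_{1}$ on $(0,1)$ has cdf $x_{1}^{2}$, so $X_{1}$ can be drawn as the square root of a uniform variable, and a conditional mean $X_{1}$ with conditional variance $X_{1}^{2}$ is what an exponential distribution with scale $X_{1}$ produces. The sample size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# X1 has cdf x^2 on (0, 1): inverse-transform sampling gives sqrt(U)
x1 = np.sqrt(rng.uniform(size=n))

# Given X1 = x1, X2 is exponential with mean x1 (and variance x1^2)
x2 = rng.exponential(scale=x1)

mean_x2 = x2.mean()
var_x2 = x2.var()

print(mean_x2)  # compare with 2/3
print(var_x2)   # compare with 10/18
```

The simulated moments should lie close to $2/3$ and $10/18$, matching the result obtained from the iterated expectation and variance formulas.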
The conditional expectation $\mathrm{E}\left(X_{2} \mid X_{1}\right)$ viewed as a function $h\left(X_{1}\right)$ of $X_{1}$ (known as the regression function of $X_{2}$ on $X_{1}$ ), can be interpreted as a conditional approximation of $X_{2}$ by a function of $X_{1}$. The error term of the approximation is then given by:
$$
U=X_{2}-\mathrm{E}\left(X_{2} \mid X_{1}\right)
$$
Theorem 4.3 Let $X_{1} \in \mathbb{R}^{k}$ and $X_{2} \in \mathbb{R}^{p-k}$ and $U=X_{2}-E\left(X_{2} \mid X_{1}\right)$. Then we have:

(1) $E(U)=0$
(2) $E\left(X_{2} \mid X_{1}\right)$ is the best approximation of $X_{2}$ by a function $h\left(X_{1}\right)$ of $X_{1}$, where $h: \mathbb{R}^{k} \longrightarrow \mathbb{R}^{p-k}$. "Best" is meant in the minimum mean squared error (MSE) sense, where
$$
\operatorname{MSE}(h)=E\left[\left\{X_{2}-h\left(X_{1}\right)\right\}^{\top}\left\{X_{2}-h\left(X_{1}\right)\right\}\right] .
$$


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Multiple Linear Model


The simple linear model and the analysis of variance model can be viewed as a particular case of a more general linear model where the variations of one variable $y$ are explained by $p$ explanatory variables $x$ respectively. Let $y(n \times 1)$ and $\mathcal{X}(n \times p)$ be a vector of observations on the response variable and a data matrix on the $p$ explanatory variables. An important application of the developed theory is the least squares fitting. The idea is to approximate $y$ by a linear combination $\hat{y}$ of columns of $\mathcal{X}$, i.e. $\hat{y} \in C(\mathcal{X})$. The problem is to find $\hat{\beta} \in \mathbb{R}^{p}$ such that $\hat{y}=\mathcal{X} \hat{\beta}$ is the best fit of $y$ in the least-squares sense. The linear model can be written as
$$
y=\mathcal{X} \beta+\varepsilon
$$
where $\varepsilon$ are the errors. The least squares solution is given by $\hat{\beta}$:
$$
\hat{\beta}=\arg \min _{\beta}(y-\mathcal{X} \beta)^{\top}(y-\mathcal{X} \beta)=\arg \min _{\beta} \varepsilon^{\top} \varepsilon
$$

Suppose that $\left(\mathcal{X}^{\top} \mathcal{X}\right)$ is of full rank and thus invertible. Minimising the expression (3.51) with respect to $\beta$ yields:
$$
\hat{\beta}=\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1} \mathcal{X}^{\top} y
$$
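The closed-form estimator can be checked against a library least squares routine. The sketch below uses simulated data with made-up dimensions and coefficients (all of them assumptions for illustration). It solves the normal equations $\mathcal{X}^{\top} \mathcal{X} \hat{\beta}=\mathcal{X}^{\top} y$ rather than forming the inverse explicitly, which is a numerical preference and not something the text prescribes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed toy data: n = 50 observations, p = 3 explanatory variables
n, p = 50, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# beta_hat = (X^T X)^{-1} X^T y, obtained by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from NumPy's least squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Fitted values and residuals
y_hat = X @ beta_hat
e = y - y_hat

print(beta_hat)
print(X.T @ e)  # first-order condition: numerically zero
```

The residual vector is orthogonal to every column of $\mathcal{X}$, which is exactly the first-order condition of the minimisation problem.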
The fitted value $\hat{y}=\mathcal{X} \hat{\beta}=\mathcal{X}\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1} \mathcal{X}^{\top} y=\mathcal{P} y$ is the projection of $y$ onto $C(\mathcal{X})$ as computed in (2.47).
The least squares residuals are
$$
e=y-\hat{y}=y-\mathcal{X} \hat{\beta}=\mathcal{Q} y=\left(\mathcal{I}_{n}-\mathcal{P}\right) y
$$
The vector $e$ is the projection of $y$ onto the orthogonal complement of $C(\mathcal{X})$.

Remark 3.5 A linear model with an intercept $\alpha$ can also be written in this framework. The approximating equation is:
$$
y_{i}=\alpha+\beta_{1} x_{i 1}+\cdots+\beta_{p} x_{i p}+\varepsilon_{i} ; \quad i=1, \ldots, n .
$$
This can be written as:
$$
y=\mathcal{X}^{*} \beta^{*}+\varepsilon
$$
where $\mathcal{X}^{*}=\left(1_{n} \; \mathcal{X}\right)$ (we add a column of ones to the data). We have by (3.52):
$$
\hat{\beta}^{*}=\left(\begin{array}{l}
\hat{\alpha} \\
\hat{\beta}
\end{array}\right)=\left(\mathcal{X}^{* \top} \mathcal{X}^{*}\right)^{-1} \mathcal{X}^{* \top} y
$$
Example 3.15 Let us come back to the "classic blue" pullovers example. In Example 3.11, we considered the regression fit of the sales $X_{1}$ on the price $X_{2}$ and concluded that changing the prices had only a small influence on sales. A linear model incorporating all three variables allows us to approximate sales as a linear function of price $\left(X_{2}\right)$, advertisement $\left(X_{3}\right)$ and presence of sales assistants $\left(X_{4}\right)$ simultaneously. Adding a column of ones to the data (in order to estimate the intercept $\alpha$) leads to
$$
\hat{\alpha}=65.670 \text { and } \widehat{\beta}_{1}=-0.216, \widehat{\beta}_{2}=0.485, \widehat{\beta}_{3}=0.844
$$
The coefficient of determination is computed as before in (3.40) and is:
$$
r^{2}=1-\frac{e^{\top} e}{\sum\left(y_{i}-\bar{y}\right)^{2}}=0.907 .
$$
We conclude that the variation of $X_{1}$ is well approximated by the linear relation.

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|The ANOVA Model in Matrix Notation

The simple ANOVA problem (Sect. 3.5) may also be rewritten in matrix terms. Recall the definition of a vector of ones from $(2.1)$ and define a vector of zeros as $0_{n}$. Then construct the following $(n \times p)$ matrix (here $p=3$ ),
$$
\mathcal{X}=\left(\begin{array}{lll}
1_{m} & 0_{m} & 0_{m} \\
0_{m} & 1_{m} & 0_{m} \\
0_{m} & 0_{m} & 1_{m}
\end{array}\right)
$$
where $m=10$. Equation (3.41) then reads as follows.
The parameter vector is $\beta=\left(\mu_{1}, \mu_{2}, \mu_{3}\right)^{\top}$. The data set from Example 3.14 can therefore be written as a linear model $y=\mathcal{X} \beta+\varepsilon$ where $y \in \mathbb{R}^{n}$ with $n=m \cdot p$ is the stacked vector of the columns of Table 3.1. The projection into the column space $C(\mathcal{X})$ of (3.54) yields the least-squares estimator $\hat{\beta}=\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1} \mathcal{X}^{\top} y$. Note that $\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1}=(1 / 10) \mathcal{I}_{3}$ and that $\mathcal{X}^{\top} y=(106,124,151)^{\top}$ is the sum $\sum_{k=1}^{m} y_{k j}$ for each factor, i.e. the three column sums of Table 3.1. The least squares estimator is therefore the vector $\hat{\beta}_{H_{1}}=\left(\hat{\mu}_{1}, \hat{\mu}_{2}, \hat{\mu}_{3}\right)=(10.6,12.4,15.1)^{\top}$ of sample means for each factor level $j=1,2,3$. Under the null hypothesis of equal mean values $\mu_{1}=\mu_{2}=\mu_{3}=\mu$, we estimate the parameters under the same constraints. This can be put into the form of a linear constraint:
$$
\begin{aligned}
&-\mu_{1}+\mu_{2}=0 \\
&-\mu_{1}+\mu_{3}=0
\end{aligned}
$$
This can be written as $\mathcal{A} \beta=a$, where
$$
a=\left(\begin{array}{l}
0 \\
0
\end{array}\right)
$$
and
$$
\mathcal{A}=\left(\begin{array}{lll}
-1 & 1 & 0 \\
-1 & 0 & 1
\end{array}\right)
$$
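The matrix computations in this section can be reproduced in a few lines. The sketch below builds the design matrix with a Kronecker product (one possible construction, chosen here for brevity) and uses only the column sums $(106,124,151)$ quoted above, since the individual entries of Table 3.1 are not reproduced in this section. The vector `beta_null` is a hypothetical example of a parameter vector with equal entries, set to the overall mean $381/30$.

```python
import numpy as np

m, p = 10, 3

# Block design matrix: column j is 1_m in block j and 0_m elsewhere
X = np.kron(np.eye(p), np.ones((m, 1)))   # shape (30, 3)

XtX_inv = np.linalg.inv(X.T @ X)          # equals (1/10) * I_3

# X^T y is the vector of column sums of Table 3.1 (values quoted in the text)
Xty = np.array([106.0, 124.0, 151.0])
beta_hat = XtX_inv @ Xty                  # group means 10.6, 12.4, 15.1

# Linear constraint A beta = a encoding mu_1 = mu_2 = mu_3
A = np.array([[-1.0, 1.0, 0.0],
              [-1.0, 0.0, 1.0]])
a = np.zeros(2)

# Hypothetical vector with equal entries: the overall mean 381 / 30
beta_null = np.full(p, Xty.sum() / (m * p))

print(beta_hat)
print(A @ beta_hat)    # nonzero: the unrestricted estimate violates the constraint
print(A @ beta_null)   # zero: equal means satisfy the constraint
```

Any vector with identical components satisfies $\mathcal{A} \beta=a$, whereas the unrestricted estimate of the three group means does not.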

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Multivariate Distributions

The preceding chapter showed that by using the first two moments of a multivariate distribution (the mean and the covariance matrix), a lot of information on the relationship between the variables can be made available. Only basic statistical theory was used to derive tests of independence or of linear relationships. In this chapter we give an introduction to the basic probability tools useful in statistical multivariate analysis.

Means and covariances share many interesting and useful properties, but they represent only part of the information on a multivariate distribution. Section 4.1 presents the basic probability tools used to describe a multivariate random variable, including marginal and conditional distributions and the concept of independence. In Sect. 4.2, basic properties of means and covariances (marginal and conditional ones) are derived.

Since many statistical procedures rely on transformations of a multivariate random variable, Sect. 4.3 presents the basic techniques needed to derive the distribution of transformations, with a special emphasis on linear transforms. As an important example of a multivariate random variable, Sect. 4.4 defines the multinormal distribution. It will be analysed in more detail in Chap. 5 along with most of its "companion" distributions that are useful in making multivariate statistical inferences.

The normal distribution plays a central role in statistics because it can be viewed as an approximation and limit of many other distributions. The basic justification relies on the central limit theorem presented in Sect. 4.5. We present this central theorem in the framework of sampling theory. A useful extension of this theorem is also given: it provides an approximate distribution for transformations of asymptotically normal variables. The increasing power of computers today makes it possible to consider alternative approximate sampling distributions. These are based on resampling techniques and are suitable for many general situations. Section 4.8 gives an introduction to the ideas behind bootstrap approximations.

多元统计分析代考

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Multiple Linear Model

简单线性模型和方差分析模型可以看作是一个更一般的线性模型的特例，其中一个变量的变化是被解释为p解释变量X分别。让是(n×1)和X(n×p)是响应变量的观测向量和数据矩阵p解释变量。发展理论的一个重要应用是最小二乘拟合。这个想法是近似是通过线性组合是^的列X， IE是^∈C(X). 问题是找到b^∈Rp这样是^=Xb^是最合适的是在最小二乘意义上。线性模型可以写为

是=Xb+e
在哪里e是错误。最小二乘解由下式给出b^ :

b^=参数⁡分钟b(是−Xb)⊤(是−Xb)=参数⁡分钟be⊤e

假设(X⊤X)是满秩的，因此是可逆的。最小化表达式 (3.51) 关于b产量：

b^=(X⊤X)−1X⊤是
拟合值是^=Xb^=X(X⊤X−1X⊤是=磷是是的投影是到C(X)如 (2.47) 中计算的那样。
最小二乘残差是

和=是−是^=是−Xb^=问是=(我n−磷)是向量和是的投影是上的正交补C(X). 评论3.5具有截距的线性模型一个也可以写在这个框架里。近似方程为：

是一世=一个+b1X一世1+⋯+bpX一世p+e一世;一世=1,…,n.
这可以写成：

是=Xb+e
在哪里X=(1nX)（我们在数据中添加一列）。我们有（3.52）：

b^=(一个^ b^)=(X∗⊤X)−1X⊤是
例子3.15让我们回到“经典蓝色”套头衫的例子。在示例 3.11 中，我们考虑了销售额的回归拟合X1关于价格X2并得出结论，改变价格对销售的影响很小。包含所有三个变量的线性模型允许我们将销售额近似为价格的线性函数(X2)，广告(X3)和销售助理的存在（X4）同时。向数据中添加一列（以估计截距一个）导致

一个^=65.670 和 b^1=−0.216,b^2=0.485,b^3=0.844决定系数的计算方式如前所述 (3.40)并且是：

r2=1−和⊤和∑(是一世−是¯)2=0.907.
我们得出结论，X1由线性关系很好地近似。

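Multivariate Statistical Analysis | The ANOVA Model in Matrix Notation

The simple ANOVA problem (Sect. 3.5) may also be rewritten in matrix terms. Recall the definition of a vector of ones from (2.1) and define a vector of zeros as $0_{n}$. Then construct the following $(n \times p)$ matrix (here $p=3$),
$$
\mathcal{X}=\left(\begin{array}{ccc}
1_{m} & 0_{m} & 0_{m} \\
0_{m} & 1_{m} & 0_{m} \\
0_{m} & 0_{m} & 1_{m}
\end{array}\right) \tag{3.54}
$$
where $m=10$. Equation (3.41) then reads as follows.
The parameter vector is $\beta=\left(\mu_{1}, \mu_{2}, \mu_{3}\right)^{\top}$. The data set from Example 3.14 can therefore be written as a linear model $y=\mathcal{X} \beta+\varepsilon$, where $y \in \mathbb{R}^{n}$ with $n=m \cdot p$ is the stacked vector of the columns of Table 3.1. The projection onto the column space $C(\mathcal{X})$ of (3.54) yields the least squares estimator $\hat{\beta}=\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1} \mathcal{X}^{\top} y$. Note that $\left(\mathcal{X}^{\top} \mathcal{X}\right)^{-1}=(1 / 10) \mathcal{I}_{3}$ and that $\mathcal{X}^{\top} y=(106,124,151)^{\top}$ is the sum $\sum_{k=1}^{m} y_{k j}$ for each factor, i.e. the three column sums of Table 3.1. The least squares estimator is therefore the vector $\hat{\beta}_{H_{1}}=\left(\hat{\mu}_{1}, \hat{\mu}_{2}, \hat{\mu}_{3}\right)^{\top}=(10.6,12.4,15.1)^{\top}$ of sample means for each factor level $j=1,2,3$. Under the null hypothesis of equal mean values $\mu_{1}=\mu_{2}=\mu_{3}=\mu$, we estimate the parameters under the same constraints. This can be put into the form of a linear constraint:
$$
\begin{aligned}
-\mu_{1}+\mu_{2} &=0 \\
-\mu_{1}+\mu_{3} &=0
\end{aligned}
$$
This can be written as $\mathcal{A} \beta=a$, where
$$
a=\left(\begin{array}{l}
0 \\
0
\end{array}\right)
$$
and
$$
\mathcal{A}=\left(\begin{array}{rrr}
-1 & 1 & 0 \\
-1 & 0 & 1
\end{array}\right)
$$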
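The matrix form of the ANOVA model can be checked with a few lines of NumPy. The sketch below only uses quantities stated in the text ($m=10$, $p=3$, the column sums 106, 124, 151); building the design matrix with `np.kron` is an illustrative choice, not something the source prescribes.

```python
import numpy as np

m, p = 10, 3
# Block design matrix: column j equals 1 for the m observations of factor level j.
X = np.kron(np.eye(p), np.ones((m, 1)))             # shape (30, 3)
XtX_inv = np.linalg.inv(X.T @ X)                    # equals (1/m) * I_p

# Column sums X^T y as reported in the text.
Xty = np.array([106.0, 124.0, 151.0])
beta_hat = XtX_inv @ Xty                            # sample mean of each factor level

# Linear constraint A beta = a encoding mu_1 = mu_2 = mu_3.
A = np.array([[-1.0, 1.0, 0.0],
              [-1.0, 0.0, 1.0]])
a = np.zeros(2)
```

Any vector with three equal components, for example `np.full(3, 5.0)`, satisfies `A @ beta == a`, which is exactly the null hypothesis of equal means.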

Multivariate Statistical Analysis | Moving to Higher Dimensions

Posted on 24 June 2022 by statistics-lab

Multivariate Statistical Analysis | Summary Statistics

This section focuses on the representation of basic summary statistics (means, covariances and correlations) in matrix notation, since we often apply linear transformations to data. The matrix notation allows us to derive instantaneously the corresponding characteristics of the transformed variables. The Mahalanobis transformation is a prominent example of such linear transformations.

Assume that we have observed $n$ realisations of a $p$-dimensional random variable; we have a data matrix $\mathcal{X}(n \times p)$:
$$
\mathcal{X}=\left(\begin{array}{ccc}
x_{11} & \cdots & x_{1 p} \\
\vdots & & \vdots \\
\vdots & & \vdots \\
x_{n 1} & \cdots & x_{n p}
\end{array}\right)
$$
The rows $x_{i}=\left(x_{i 1}, \ldots, x_{i p}\right) \in \mathbb{R}^{p}$ denote the $i$-th observation of a $p$-dimensional random variable $X \in \mathbb{R}^{p}$.

Multivariate Statistical Analysis | Linear Model for Two Variables

We have looked several times now at downward and upward-sloping scatterplots. What does the eye define here as a slope? Suppose that we can construct a line corresponding to the general direction of the cloud. The sign of the slope of this line would correspond to the upward and downward directions. Call the variable on the vertical axis $Y$ and the one on the horizontal axis $X$. A slope line is a linear relationship between $X$ and $Y$ :
$$
y_{i}=\alpha+\beta x_{i}+\varepsilon_{i}, \quad i=1, \ldots, n \tag{3.27}
$$

Here, $\alpha$ is the intercept and $\beta$ is the slope of the line. The errors (or deviations from the line) are denoted by $\varepsilon_{i}$ and are assumed to have zero mean and finite variance $\sigma^{2}$. The task of finding $(\alpha, \beta)$ in (3.27) is referred to as a linear adjustment.

In Sect. 3.6 we shall derive estimators for $\alpha$ and $\beta$ more formally, and describe precisely what a “good” estimator is. For now, one may try to find a “good” estimator $(\hat{\alpha}, \hat{\beta})$ via graphical techniques. A very common numerical and statistical technique is to use the $\hat{\alpha}$ and $\hat{\beta}$ that minimise the sum of squared deviations:
$$
(\hat{\alpha}, \hat{\beta})=\arg \min _{(\alpha, \beta)} \sum_{i=1}^{n}\left(y_{i}-\alpha-\beta x_{i}\right)^{2} .
$$
The solutions to this task are the estimators:
$$
\begin{aligned}
&\hat{\beta}=\frac{s_{X Y}}{s_{X X}} \\
&\hat{\alpha}=\bar{y}-\hat{\beta} \bar{x}
\end{aligned}
$$
The variance of $\hat{\beta}$ is:
$$
\operatorname{Var}(\hat{\beta})=\frac{\sigma^{2}}{n \cdot s_{X X}} \tag{3.31}
$$
The standard error (SE) of the estimator is the square root of (3.31),
$$
\operatorname{SE}(\hat{\beta})=\{\operatorname{Var}(\hat{\beta})\}^{1 / 2}=\frac{\sigma}{\left(n \cdot s_{X X}\right)^{1 / 2}}
$$
We can use this formula to test the hypothesis that $\beta=0$. In an application the variance $\sigma^{2}$ has to be replaced by an estimator $\hat{\sigma}^{2}$ that will be given below. Under a normality assumption on the errors, the $t$-test for the hypothesis $\beta=0$ works as follows.
One computes the statistic
$$
t=\frac{\hat{\beta}}{\operatorname{SE}(\hat{\beta})}
$$
and rejects the hypothesis at the 5% significance level if $|t| \geq t_{0.975 ; n-2}$, where the 97.5% quantile of Student’s $t_{n-2}$ distribution is the 95% critical value for the two-sided test. For $n \geq 30$, this can be replaced by 1.96, the 97.5% quantile of the normal distribution.
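The estimators and the $t$-statistic above can be sketched in NumPy. Two assumptions are made here that the excerpt does not state: $s_{XX}$ and $s_{XY}$ are taken as the empirical variance and covariance with divisor $n$ (which is what makes $n \cdot s_{XX}$ equal to $\sum (x_i - \bar{x})^2$), and $\sigma^2$ is estimated by the residual sum of squares divided by $n-2$. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def simple_regression(x, y):
    """Least squares fit of y = alpha + beta * x and the t-statistic for beta = 0."""
    n = len(x)
    s_xx = np.mean((x - x.mean()) ** 2)                 # empirical variance, divisor n
    s_xy = np.mean((x - x.mean()) * (y - y.mean()))     # empirical covariance, divisor n
    beta_hat = s_xy / s_xx
    alpha_hat = y.mean() - beta_hat * x.mean()
    resid = y - alpha_hat - beta_hat * x
    sigma2_hat = resid @ resid / (n - 2)                # assumed estimator of sigma^2
    se_beta = np.sqrt(sigma2_hat / (n * s_xx))          # plug-in version of SE(beta_hat)
    return alpha_hat, beta_hat, se_beta, beta_hat / se_beta

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=40)
y = 1.5 + 0.8 * x + rng.normal(scale=1.0, size=40)
alpha_hat, beta_hat, se_beta, t_stat = simple_regression(x, y)
reject = abs(t_stat) >= 1.96     # normal approximation, used because n >= 30
```

For smaller samples the threshold 1.96 would be replaced by the $t_{0.975; n-2}$ quantile, for instance from a statistics library.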

Multivariate Statistical Analysis | Simple Analysis of Variance

In a simple (i.e. one-factorial) analysis of variance (ANOVA), it is assumed that the average values of the response variable $y$ are induced by one simple factor. Suppose that this factor takes on $p$ values and that for each factor level we have $m=n / p$ observations. The sample is of the form given in Table 3.1, where all of the observations are independent.
The goal of a simple ANOVA is to analyse the observation structure
$$
y_{k l}=\mu_{l}+\varepsilon_{k l} \quad \text { for } k=1, \ldots, m, \text { and } l=1, \ldots, p \tag{3.41}
$$
Each factor has a mean value $\mu_{l}$. Each observation $y_{k l}$ is assumed to be the sum of the corresponding factor mean value $\mu_{l}$ and a zero mean random error $\varepsilon_{k l}$. The linear regression model falls into this scheme with $m=1$, $p=n$ and $\mu_{i}=\alpha+\beta x_{i}$, where $x_{i}$ is the $i$-th level value of the factor.

Example 3.14 The “classic blue” pullover company analyses the effect of three marketing strategies:

1. advertisement in local newspaper,
2. presence of sales assistant,
3. luxury presentation in shop windows.

All of these strategies are tried in ten different shops. The resulting sale observations are given in Table 3.2.

There are $p=3$ factors and $n=m p=30$ observations in the data. The “classic blue” pullover company wants to know whether all three marketing strategies have the same mean effect or whether there are differences. Having the same effect means that all $\mu_{l}$ in (3.41) equal one value, $\mu$. The hypothesis to be tested is therefore
$$
H_{0}: \mu_{l}=\mu \quad \text { for } l=1, \ldots, p
$$

Multivariate Statistical Analysis | A Short Excursion into Matrix Algebra

Posted on 24 June 2022 by statistics-lab

Multivariate Statistical Analysis | Derivatives

For later sections of this book, it will be useful to introduce matrix notation for derivatives of a scalar function of a vector $x$, i.e. $f(x)$, with respect to $x$. Consider $f: \mathbb{R}^{p} \rightarrow \mathbb{R}$ and a $(p \times 1)$ vector $x$. Then $\frac{\partial f(x)}{\partial x}$ is the column vector of partial derivatives $\left\{\frac{\partial f(x)}{\partial x_{j}}\right\}, j=1, \ldots, p$, and $\frac{\partial f(x)}{\partial x^{\top}}$ is the row vector of the same derivatives ($\frac{\partial f(x)}{\partial x}$ is called the gradient of $f$).

We can also introduce second order derivatives: $\frac{\partial^{2} f(x)}{\partial x \partial x^{\top}}$ is the $(p \times p)$ matrix of elements $\frac{\partial^{2} f(x)}{\partial x_{i} \partial x_{j}}$, $i=1, \ldots, p$ and $j=1, \ldots, p$ ($\frac{\partial^{2} f(x)}{\partial x \partial x^{\top}}$ is called the Hessian of $f$). Suppose that $a$ is a $(p \times 1)$ vector and that $\mathcal{A}=\mathcal{A}^{\top}$ is a $(p \times p)$ matrix. Then
$$
\frac{\partial a^{\top} x}{\partial x}=\frac{\partial x^{\top} a}{\partial x}=a
$$
and the gradient of the quadratic form $Q(x)=x^{\top} \mathcal{A} x$ is
$$
\frac{\partial x^{\top} \mathcal{A} x}{\partial x}=2 \mathcal{A} x .
$$
The Hessian of the quadratic form $Q(x)=x^{\top} \mathcal{A} x$ is:
$$
\frac{\partial^{2} x^{\top} \mathcal{A} x}{\partial x \partial x^{\top}}=2 \mathcal{A}
$$
Example 2.8 Consider the matrix
$$
\mathcal{A}=\left(\begin{array}{ll}
1 & 2 \\
2 & 3
\end{array}\right)
$$

From formulas (2.24) and (2.25) it immediately follows that the gradient of $Q(x)=x^{\top} \mathcal{A} x$ is
$$
\frac{\partial x^{\top} \mathcal{A} x}{\partial x}=2 \mathcal{A} x=2\left(\begin{array}{ll}
1 & 2 \\
2 & 3
\end{array}\right) x=\left(\begin{array}{c}
2 x_{1}+4 x_{2} \\
4 x_{1}+6 x_{2}
\end{array}\right)
$$
and the Hessian is
$$
\frac{\partial^{2} x^{\top} \mathcal{A} x}{\partial x \partial x^{\top}}=2 \mathcal{A}=2\left(\begin{array}{ll}
1 & 2 \\
2 & 3
\end{array}\right)=\left(\begin{array}{ll}
2 & 4 \\
4 & 6
\end{array}\right)
$$
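The gradient and Hessian of the quadratic form in Example 2.8 can be checked numerically. The sketch below compares the analytic gradient $2 \mathcal{A} x$ with a central finite difference approximation; the helper `numerical_gradient`, the step size and the evaluation point `x0` are illustrative choices and not part of the source.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 3.0]])

def Q(x):
    """Quadratic form Q(x) = x^T A x."""
    return x @ A @ x

def numerical_gradient(f, x, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = np.zeros_like(x)
    for j in range(len(x)):
        step = np.zeros_like(x)
        step[j] = h
        grad[j] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

x0 = np.array([0.5, -1.0])
analytic = 2 * A @ x0                  # (2*x1 + 4*x2, 4*x1 + 6*x2)
numeric = numerical_gradient(Q, x0)
hessian = 2 * A                        # constant in x for a quadratic form
```

At `x0 = (0.5, -1)` the analytic gradient is `(-3, -4)`, and the finite difference estimate should agree with it up to rounding error.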

Multivariate Statistical Analysis | Partitioned Matrices

Very often we will have to consider certain groups of rows and columns of a matrix $\mathcal{A}(n \times p)$. In the case of two groups, we have
$$
\mathcal{A}=\left(\begin{array}{ll}
\mathcal{A}_{11} & \mathcal{A}_{12} \\
\mathcal{A}_{21} & \mathcal{A}_{22}
\end{array}\right)
$$
where $\mathcal{A}_{i j}\left(n_{i} \times p_{j}\right)$, $i, j=1,2$, $n_{1}+n_{2}=n$ and $p_{1}+p_{2}=p$.
If $\mathcal{B}(n \times p)$ is partitioned accordingly, we have:
$$
\begin{aligned}
\mathcal{A}+\mathcal{B} &=\left(\begin{array}{ll}
\mathcal{A}_{11}+\mathcal{B}_{11} & \mathcal{A}_{12}+\mathcal{B}_{12} \\
\mathcal{A}_{21}+\mathcal{B}_{21} & \mathcal{A}_{22}+\mathcal{B}_{22}
\end{array}\right) \\
\mathcal{B}^{\top} &=\left(\begin{array}{ll}
\mathcal{B}_{11}^{\top} & \mathcal{B}_{21}^{\top} \\
\mathcal{B}_{12}^{\top} & \mathcal{B}_{22}^{\top}
\end{array}\right) \\
\mathcal{A} \mathcal{B}^{\top} &=\left(\begin{array}{ll}
\mathcal{A}_{11} \mathcal{B}_{11}^{\top}+\mathcal{A}_{12} \mathcal{B}_{12}^{\top} & \mathcal{A}_{11} \mathcal{B}_{21}^{\top}+\mathcal{A}_{12} \mathcal{B}_{22}^{\top} \\
\mathcal{A}_{21} \mathcal{B}_{11}^{\top}+\mathcal{A}_{22} \mathcal{B}_{12}^{\top} & \mathcal{A}_{21} \mathcal{B}_{21}^{\top}+\mathcal{A}_{22} \mathcal{B}_{22}^{\top}
\end{array}\right)
\end{aligned}
$$
An important particular case is the square matrix $\mathcal{A}(p \times p)$, partitioned in such a way that $\mathcal{A}_{11}$ and $\mathcal{A}_{22}$ are both square matrices (i.e. $n_{j}=p_{j}$, $j=1,2$). It can be verified that when $\mathcal{A}$ is non-singular ($\mathcal{A} \mathcal{A}^{-1}=\mathcal{I}_{p}$):
$$
\mathcal{A}^{-1}=\left(\begin{array}{ll}
\mathcal{A}^{11} & \mathcal{A}^{12} \\
\mathcal{A}^{21} & \mathcal{A}^{22}
\end{array}\right)
$$
where
$$
\left\{\begin{array}{l}
\mathcal{A}^{11}=\left(\mathcal{A}_{11}-\mathcal{A}_{12} \mathcal{A}_{22}^{-1} \mathcal{A}_{21}\right)^{-1} \stackrel{\text { def }}{=}\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \\
\mathcal{A}^{12}=-\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \mathcal{A}_{12} \mathcal{A}_{22}^{-1} \\
\mathcal{A}^{21}=-\mathcal{A}_{22}^{-1} \mathcal{A}_{21}\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \\
\mathcal{A}^{22}=\mathcal{A}_{22}^{-1}+\mathcal{A}_{22}^{-1} \mathcal{A}_{21}\left(\mathcal{A}_{11 \cdot 2}\right)^{-1} \mathcal{A}_{12} \mathcal{A}_{22}^{-1}
\end{array}\right.
$$
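The block formulas for the inverse can be verified numerically. The sketch below builds a random symmetric positive definite matrix (an assumption made so that $\mathcal{A}_{22}$ and $\mathcal{A}_{11 \cdot 2}$ are guaranteed to be invertible), assembles the inverse from its four blocks, and leaves the comparison with a direct inverse to the reader; the block size `p1 = 2` is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)            # symmetric positive definite test matrix
p1 = 2                                 # size of the leading block A11
A11, A12 = A[:p1, :p1], A[:p1, p1:]
A21, A22 = A[p1:, :p1], A[p1:, p1:]

A22_inv = np.linalg.inv(A22)
A11_2_inv = np.linalg.inv(A11 - A12 @ A22_inv @ A21)   # (A_{11.2})^{-1}

upper = np.hstack([A11_2_inv, -A11_2_inv @ A12 @ A22_inv])
lower = np.hstack([-A22_inv @ A21 @ A11_2_inv,
                   A22_inv + A22_inv @ A21 @ A11_2_inv @ A12 @ A22_inv])
A_inv_blocks = np.vstack([upper, lower])
```

`np.allclose(A_inv_blocks, np.linalg.inv(A))` should hold up to floating point error.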

Multivariate Statistical Analysis | Column Space and Null Space of a Matrix

Define for $\mathcal{X}(n \times p)$
$$
\operatorname{Im}(\mathcal{X}) \stackrel{\text { def }}{=} C(\mathcal{X})=\left\{x \in \mathbb{R}^{n} \mid \exists a \in \mathbb{R}^{p} \text { so that } \mathcal{X} a=x\right\},
$$
the space generated by the columns of $\mathcal{X}$ or the column space of $\mathcal{X}$. Note that $C(\mathcal{X}) \subseteq \mathbb{R}^{n}$ and $\operatorname{dim}\{C(\mathcal{X})\}=\operatorname{rank}(\mathcal{X})=r \leq \min (n, p)$. Furthermore,
$$
\operatorname{Ker}(\mathcal{X}) \stackrel{\text { def }}{=} N(\mathcal{X})=\left\{y \in \mathbb{R}^{p} \mid \mathcal{X} y=0\right\}
$$
is the null space of $\mathcal{X}$. Note that $N(\mathcal{X}) \subseteq \mathbb{R}^{p}$ and that $\operatorname{dim}\{N(\mathcal{X})\}=p-r$.
Remark 2.2 $N\left(\mathcal{X}^{\top}\right)$ is the orthogonal complement of $C(\mathcal{X})$ in $\mathbb{R}^{n}$, i.e. given a vector $b \in \mathbb{R}^{n}$, it holds that $x^{\top} b=0$ for all $x \in C(\mathcal{X})$ if and only if $b \in N\left(\mathcal{X}^{\top}\right)$.
Example 2.12 Let $\mathcal{X}=\left(\begin{array}{lll}2 & 3 & 5 \\ 4 & 6 & 7 \\ 6 & 8 & 6 \\ 8 & 2 & 4\end{array}\right)$. Since $\mathcal{X}$ is a $(4 \times 3)$ matrix, its rank is at most 3; the determinant of the submatrix formed by its first three rows is $-6 \neq 0$, so $\operatorname{rank}(\mathcal{X})=3$. Hence, the column space $C(\mathcal{X})$ is a three-dimensional subspace of $\mathbb{R}^{4}$. The null space of $\mathcal{X}$ contains only the zero vector $(0,0,0)^{\top}$ and its dimension is equal to $3-\operatorname{rank}(\mathcal{X})=0$.
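Example 2.12 can be reproduced numerically. The sketch below computes the rank with `np.linalg.matrix_rank` and uses the singular value decomposition to obtain a basis of $N(\mathcal{X}^{\top})$, the orthogonal complement of the column space from Remark 2.2. Using the SVD for this purpose is an illustrative choice, not something the source prescribes.

```python
import numpy as np

X = np.array([[2, 3, 5],
              [4, 6, 7],
              [6, 8, 6],
              [8, 2, 4]], dtype=float)

rank = np.linalg.matrix_rank(X)
null_dim = X.shape[1] - rank                # dim N(X) = p - r

# Determinant of the square submatrix built from the first three rows.
det_top = np.linalg.det(X[:3, :])

# Basis of N(X^T): left singular vectors beyond the first `rank` columns.
U, s, Vt = np.linalg.svd(X)                 # U is (4, 4) by default
left_null = U[:, rank:]                     # shape (4, 1)
```

Every column of `left_null` is orthogonal to every column of `X`, so `X.T @ left_null` is numerically zero.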

Multivariate Statistical Analysis | Parallel Coordinates Plots

Posted on 24 June 2022 by statistics-lab

Multivariate Statistical Analysis | Parallel Coordinates Plots

Parallel coordinates plots (PCP) are a method for representing high-dimensional data, see Inselberg (1985). Instead of plotting observations in an orthogonal coordinate system, PCP draws coordinates on parallel axes and connects them with straight lines. This method helps in representing data with more than four dimensions.

One first scales each variable so that its maximum is 1 and its minimum is 0. The coordinate index $j$ is drawn onto the horizontal axis, and the scaled value of variable $x_{i j}$ is mapped onto the vertical axis. This representation is very useful for high-dimensional data. It is, however, sensitive to the order of the variables, since certain trends in the data can be shown more clearly in one ordering than in another.
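The scaling step and the mapping onto parallel axes can be sketched as follows. The data values are made up for illustration and are not the Swiss bank note observations; the function name `minmax_scale` is hypothetical, and the sketch assumes that no column is constant.

```python
import numpy as np

def minmax_scale(data):
    """Scale every column (variable) to the interval [0, 1]."""
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    return (data - lo) / (hi - lo)          # assumes hi > lo for every column

data = np.array([[214.8, 131.0, 9.0],
                 [214.6, 129.7, 8.1],
                 [215.3, 130.4, 9.7],
                 [214.9, 129.9, 10.4]])     # made-up values, rows are observations
scaled = minmax_scale(data)

# Observation i becomes the polyline through the points (j, scaled[i, j]),
# with the coordinate index j = 1, ..., p on the horizontal axis.
polylines = [list(enumerate(row, start=1)) for row in scaled]
```

Each entry of `polylines` can then be passed to any line plotting routine; permuting the columns of `data` before scaling changes the order of the axes and hence the visual impression.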

Example 1.5 Take, once again, observations 96-105 of the Swiss bank notes. These observations are six-dimensional, so we can’t show them in a six-dimensional Cartesian coordinate system. Using the PCP technique, however, they can be plotted on parallel axes. This is shown in Fig. 1.22.

PCP can also be used for detecting linear dependencies between variables: if all the lines between two axes ($p=2$) are almost parallel, there is a positive linear dependence between the two variables. In Fig. 1.23 we display the two variables weight and displacement for the car data set in Sect. 22.3. The correlation coefficient $\rho$ introduced in Sect. 3.2 is 0.9. If all lines intersect visibly in the middle, there is evidence of a negative linear dependence between the two variables, see Fig. 1.24. In fact, the correlation between the two variables mileage and weight is $\rho=-0.82$: the more the weight, the less the mileage.

Multivariate Statistical Analysis | Hexagon Plots

This section closely follows the presentation of Lewin-Koh (2006). In geometry, a hexagon is a polygon with six edges and six vertices. Hexagon binning is a type of bivariate histogram with hexagon borders. It is useful for visualising the structure of data sets entailing a large number of observations $n$. The concept of hexagon binning is as follows:

1. The $x y$ plane over the set $(\operatorname{range}(x), \operatorname{range}(y))$ is tessellated by a regular grid of hexagons.
2. The number of points falling in each hexagon is counted.
3. The hexagons with count $>0$ are plotted by using a colour ramp or varying the radius of the hexagon in proportion to the counts.

This algorithm is extremely fast and effective for displaying the structure of data sets even for $n \geq 10^{6}$. If the size of the grid and the cuts in the colour ramp are chosen in a clever fashion, then the structure inherent in the data should emerge in the binned plot. The same caveats apply to hexagon binning as to histograms. Variance and bias vary in opposite directions with bin width, so we have to settle for finding the value of the bin width that yields the optimal compromise between variance and bias reduction. Clearly, if we increase the size of the grid, the hexagon plot appears to be smoother, but without some reasonable criterion on hand it remains difficult to say which bin width provides the “optimal” degree of smoothness. The default number of bins suggested by standard software is 30.
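The counting step of hexagon binning can be sketched without a plotting library. The construction below is one common way to do it and is an assumption, not necessarily the scheme used by Lewin-Koh (2006): hexagon centres are taken from two staggered rectangular lattices, and each point is assigned to the nearer centre under a distance that weights the vertical direction by a factor of 3. The function name `hexbin_counts` and the parameter `nx` (number of cells across the $x$ range) are hypothetical.

```python
import numpy as np
from collections import Counter

def hexbin_counts(x, y, nx=30):
    """Count points per hexagon; keys are hexagon centres in scaled grid units."""
    sx = (x - x.min()) / (x.max() - x.min()) * nx
    ny = max(int(nx / np.sqrt(3)), 1)
    sy = (y - y.min()) / (y.max() - y.min()) * ny
    # Nearest centre in each of the two staggered lattices.
    i1, j1 = np.round(sx), np.round(sy)
    i2, j2 = np.floor(sx) + 0.5, np.floor(sy) + 0.5
    # Squared distances in the metric that makes the cells hexagonal.
    d1 = (sx - i1) ** 2 + 3.0 * (sy - j1) ** 2
    d2 = (sx - i2) ** 2 + 3.0 * (sy - j2) ** 2
    use_first = d1 < d2
    cx = np.where(use_first, i1, i2)
    cy = np.where(use_first, j1, j2)
    return Counter(zip(cx.tolist(), cy.tolist()))

# Synthetic data for illustration only.
rng = np.random.default_rng(4)
xs, ys = rng.normal(size=5000), rng.normal(size=5000)
counts = hexbin_counts(xs, ys, nx=30)
```

Only hexagons with a positive count appear as keys in `counts`, which matches step 3 above; the counts can then be mapped to a colour ramp or to hexagon radii.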

Applications to some data sets are shown as follows. The data is taken from ALLBUS (2006) [ZA No.3762]. The number of respondents is 2,946. The following nine variables have been selected to analyse the relation between each pair of variables.

Multivariate Statistical Analysis | Quadratic Forms

A quadratic form $Q(x)$ is built from a symmetric matrix $\mathcal{A}(p \times p)$ and a vector $x \in \mathbb{R}^{p}$:
$$
Q(x)=x^{\top} \mathcal{A} x=\sum_{i=1}^{p} \sum_{j=1}^{p} a_{i j} x_{i} x_{j}
$$
Definiteness of Quadratic Forms and Matrices
$$
\begin{array}{ll}
Q(x)>0 \text { for all } x \neq 0 & \text { positive definite } \\
Q(x) \geq 0 \text { for all } x \neq 0 & \text { positive semidefinite }
\end{array}
$$
A matrix $\mathcal{A}$ is called positive definite (semidefinite) if the corresponding quadratic form $Q(.)$ is positive definite (semidefinite). We write $\mathcal{A}>0$ $(\geq 0)$.
Quadratic forms can always be diagonalised, as the following result shows.
Theorem 2.3 If $\mathcal{A}$ is symmetric and $Q(x)=x^{\top} \mathcal{A} x$ is the corresponding quadratic form, then there exists a transformation $x \mapsto \Gamma^{\top} x=y$ such that
$$
x^{\top} \mathcal{A} x=\sum_{i=1}^{p} \lambda_{i} y_{i}^{2}
$$
where $\lambda_{i}$ are the eigenvalues of $\mathcal{A}$.
Proof By Theorem 2.1, $\mathcal{A}=\Gamma \Lambda \Gamma^{\top}$. With $y=\Gamma^{\top} x$ we have that $x^{\top} \mathcal{A} x=x^{\top} \Gamma \Lambda \Gamma^{\top} x=y^{\top} \Lambda y=\sum_{i=1}^{p} \lambda_{i} y_{i}^{2}$.

Positive definiteness of quadratic forms can be deduced from positive eigenvalues.
Theorem 2.4 $\mathcal{A}>0$ if and only if all $\lambda_{i}>0$, $i=1, \ldots, p$.
Proof $0<\lambda_{1} y_{1}^{2}+\cdots+\lambda_{p} y_{p}^{2}=x^{\top} \mathcal{A} x$ for all $x \neq 0$ by Theorem 2.3.
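Theorems 2.3 and 2.4 can be illustrated numerically with an eigendecomposition. The matrix `A` and the vector `x` below are hypothetical examples chosen for this sketch; `np.linalg.eigh` returns the eigenvalues and an orthogonal matrix of eigenvectors of a symmetric matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                 # symmetric example matrix
lam, Gamma = np.linalg.eigh(A)             # A = Gamma diag(lam) Gamma^T

x = np.array([1.5, -2.0])
y = Gamma.T @ x                            # transformation y = Gamma^T x
q_direct = x @ A @ x                       # x^T A x
q_diag = np.sum(lam * y ** 2)              # sum_i lambda_i * y_i^2

# Theorem 2.4: A > 0 if and only if all eigenvalues are positive.
is_positive_definite = bool(np.all(lam > 0))
```

Both `q_direct` and `q_diag` evaluate the same quadratic form, so they agree up to rounding error, and the eigenvalue check classifies this particular `A` as positive definite.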

Multivariate Statistical Analysis | Comparison of Batches

Posted on 24 June 2022 by statistics-lab

Multivariate Statistical Analysis | Kernel Densities

The major difficulties of histogram estimation may be summarised in four critiques:

1. determination of the binwidth $h$, which controls the shape of the histogram,
2. choice of the bin origin $x_{0}$, which also influences to some extent the shape,
3. loss of information since observations are replaced by the central point of the interval in which they fall,
4. the underlying density function is often assumed to be smooth, but the histogram is not smooth.

Rosenblatt (1956), Whittle (1958) and Parzen (1962) developed an approach which avoids the last three difficulties. First, a smooth kernel function rather than a box is used as the basic building block. Second, the smooth function is centred directly over each observation. Let us study this refinement by supposing that $x$ is the centre value of a bin. The histogram can in fact be rewritten as
$$
\hat{f}_{h}(x)=n^{-1} h^{-1} \sum_{i=1}^{n} \boldsymbol{I}\left(\left|x-x_{i}\right| \leq \frac{h}{2}\right) \tag{1.8}
$$
If we define $K(u)=\boldsymbol{I}\left(|u| \leq \frac{1}{2}\right)$, then (1.8) changes to
$$
\hat{f}_{h}(x)=n^{-1} h^{-1} \sum_{i=1}^{n} K\left(\frac{x-x_{i}}{h}\right) .
$$
This is the general form of the kernel estimator. Allowing smoother kernel functions like the quartic kernel,
$$
K(u)=\frac{15}{16}\left(1-u^{2}\right)^{2} \boldsymbol{I}(|u| \leq 1)
$$
and evaluating $\hat{f}_{h}(x)$ at every $x$, not only at bin centres, gives us the kernel density estimator. Kernel estimators can also be derived via weighted averaging of rounded points (WARPing) or by averaging histograms with different origins, see Scott (1985). Table 1.5 introduces some commonly used kernels.
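The kernel density estimator with the quartic kernel can be sketched directly from the formulas above. The sample, the bandwidth `h = 0.5` and the evaluation grid are arbitrary illustrative choices; the function names are hypothetical.

```python
import numpy as np

def quartic_kernel(u):
    """K(u) = 15/16 * (1 - u^2)^2 for |u| <= 1 and zero elsewhere."""
    return np.where(np.abs(u) <= 1, 15.0 / 16.0 * (1.0 - u ** 2) ** 2, 0.0)

def kde(grid, sample, h, kernel=quartic_kernel):
    """Evaluate f_hat_h(x) = (n h)^{-1} * sum_i K((x - x_i) / h) on a grid."""
    u = (grid[:, None] - sample[None, :]) / h       # one row per grid point
    return kernel(u).sum(axis=1) / (len(sample) * h)

# Synthetic sample for illustration only.
rng = np.random.default_rng(3)
sample = rng.normal(size=200)
grid = np.linspace(-6.0, 6.0, 1201)
density = kde(grid, sample, h=0.5)
area = np.sum(density) * (grid[1] - grid[0])        # Riemann sum of the estimate
```

Because the quartic kernel integrates to one, `area` should be close to 1 as long as the grid covers the sample range extended by `h` on both sides. Passing the box kernel `lambda u: (np.abs(u) <= 0.5).astype(float)` instead reproduces the rewritten histogram form.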

Multivariate Statistical Analysis | Chernoff-Flury Faces

If we are given data in numerical form, we tend to also display it numerically. This was done in the preceding sections: an observation $x_{1}=(1,2)$ was plotted as the point $(1,2)$ in a two-dimensional coordinate system. In multivariate analysis we want to understand data in low dimensions (e.g. on a 2D computer screen) although the structures are hidden in high dimensions. The numerical display of data structures using coordinates therefore ends at dimensions greater than three.

If we are interested in condensing a structure into 2D elements, we have to consider alternative graphical techniques. The Chernoff-Flury faces, for example, provide such a condensation of high-dimensional information into a simple “face”. In fact faces are a simple way of graphically displaying high-dimensional data. The size of the face elements like pupils, eyes, upper and lower hair line, etc. are assigned to certain variables. The idea of using faces goes back to Chernoff (1973) and has been further developed by Bernhard Flury. We follow the design described in Flury and Riedwyl (1988) which uses the following characteristics.

1. right eye size
2. right pupil size
3. position of right pupil
4. right eye slant
5. horizontal position of right eye
6. vertical position of right eye
7. curvature of right eyebrow
8. density of right eyebrow
9. horizontal position of right eyebrow
10. vertical position of right eyebrow
11. right upper hair line
12. right lower hair line
13. right face line
14. darkness of right hair
15. right hair slant
16. right nose line
17. right size of mouth
18. right curvature of mouth
19-36. like 1-18, only for the left side.

Multivariate Statistical Analysis | Andrews’ Curves

The basic problem of graphical displays of multivariate data is the dimensionality. Scatterplots work well up to three dimensions (if we use interactive displays). More than three dimensions have to be coded into displayable 2D or 3D structures (e.g. faces). The idea of coding and representing multivariate data by curves was suggested by Andrews (1972). Each multivariate observation $X_{i}=\left(X_{i, 1}, \ldots, X_{i, p}\right)$ is transformed into a curve in which the observation represents the coefficients of a so-called Fourier series ($t \in[-\pi, \pi]$).
Suppose that we have three-dimensional observations: $X_{1}=(0,0,1)$, $X_{2}=(1,0,0)$ and $X_{3}=(0,1,0)$. Here $p=3$ and the following representations correspond to the Andrews’ curves:
$$
\begin{aligned}
&f_{1}(t)=\cos (t) \\
&f_{2}(t)=\frac{1}{\sqrt{2}} \quad \text { and } \\
&f_{3}(t)=\sin (t)
\end{aligned}
$$
These curves are indeed quite distinct, since the observations $X_{1}, X_{2}$, and $X_{3}$ are the 3D unit vectors: each observation has mass only in one of the three dimensions. The order of the variables plays an important role.
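The three curves above can be reproduced with a short sketch. The excerpt does not print the defining series, so the code assumes the commonly used form $f_i(t)=X_{i,1}/\sqrt{2}+X_{i,2}\sin(t)+X_{i,3}\cos(t)+X_{i,4}\sin(2t)+X_{i,5}\cos(2t)+\cdots$, which is consistent with the three example curves; the function name `andrews_curve` is hypothetical.

```python
import numpy as np

def andrews_curve(obs, t):
    """Evaluate x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ..."""
    obs = np.asarray(obs, dtype=float)
    t = np.asarray(t, dtype=float)
    values = np.full_like(t, obs[0] / np.sqrt(2.0))
    for j, coef in enumerate(obs[1:], start=1):
        freq = (j + 1) // 2                          # frequencies 1, 1, 2, 2, 3, 3, ...
        basis = np.sin(freq * t) if j % 2 == 1 else np.cos(freq * t)
        values = values + coef * basis
    return values

t = np.linspace(-np.pi, np.pi, 201)
f1 = andrews_curve([0, 0, 1], t)     # cos(t)
f2 = andrews_curve([1, 0, 0], t)     # constant 1/sqrt(2)
f3 = andrews_curve([0, 1, 0], t)     # sin(t)
```

Permuting the components of an observation changes which frequency each variable controls, which is why the order of the variables matters for the resulting picture.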

多元统计分析代考

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Kernel Densities

直方图估计的主要困难可以概括为四个批评：

binwidth的确定H，它控制直方图的形状，
bin 原点的选择X0，这也在一定程度上影响了形状，
信息丢失，因为观测值被它们所在区间的中心点所取代，
通常假设底层密度函数是平滑的，但直方图并不平滑。

Rosenblatt (1956)、Whittle (1958) 和 Parzen (1962) 开发了一种方法来避免最后三个困难。首先，使用平滑核函数而不是盒子作为基本构建块。其次，平滑函数直接以每个观察为中心。让我们通过假设X是 bin 的中心值。直方图实际上可以重写为

F^H(X)=n−1H−1∑一世=1n我(|X−X一世|≤H2)
如果我们定义ķ(在)=我(|在|≤12), 然后 (1.8) 变为

F^H(X)=n−1H−1∑一世=1nķ(X−X一世H).
这是核估计器的一般形式。允许更平滑的核函数，如四次核，

ķ(在)=1516(1−在2)2我(|在|≤1)
和计算X不仅在 bin 中心为我们提供了核密度估计器。内核估计量也可以通过圆角点的加权平均（WARPing）或通过对不同来源的直方图进行平均得出，参见 Scott (1985)。桌子1.5介绍一些常用的内核。

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Chernoff-Flury Faces

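If we are given data in numerical form, we tend to display it numerically as well. This was done in the preceding sections: an observation $x_{1}=(1,2)$ was plotted as the point $(1,2)$ in a two-dimensional coordinate system. In multivariate analysis we want to understand data in low dimensions (e.g. on a 2D computer screen) although the structures are hidden in high dimensions. The numerical display of data structures using coordinates therefore ends at dimensions greater than three.

If we are interested in condensing a structure into 2D elements, we have to consider alternative graphical techniques. The Chernoff-Flury faces, for example, condense high-dimensional information into a simple "face". In fact, faces are a simple way to display high-dimensional data graphically. The sizes of face elements such as pupils, eyes, and upper and lower hair lines are assigned to certain variables. The idea of using faces goes back to Chernoff (1973) and was further developed by Bernhard Flury. We follow the design described in Flury and Riedwyl (1988), which uses the following characteristics.

1. right eye size
2. right pupil size
3. position of right pupil
4. right eye slant
5. horizontal position of right eye
6. vertical position of right eye
7. curvature of right eyebrow
8. density of right eyebrow
9. horizontal position of right eyebrow
10. vertical position of right eyebrow
11. right upper hair line
12. right lower hair line
13. right face line
14. darkness of right hair
15. right hair slant
16. right nose line
17. right size of mouth
18. right curvature of mouth
19-36. like 1-18, only for the left side.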



统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Inference for a Multivariate Normal

Posted on June 6, 2022 by statistics-lab


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Population

In previous chapters, we have mostly focused on multivariate exploratory analyses, without statistical inference. That is, our focus is mostly on finding important descriptive features of data, such as data dimensions and data classifications, without extending the results or conclusions to the whole population. Distributional assumptions for the data are often not required for such exploratory analysis. Exploratory data analysis is an important component of statistical analysis. However, exploratory analysis is not sufficient. Often we wish to extend the conclusions from exploratory analysis to the whole population. That is, we also wish to make statistical inference, such as hypothesis testing or confidence intervals.

In this chapter, we consider statistical inference for multivariate continuous data, i.e., we try to extend the results from the data to the whole population. In order to make inference, the following assumptions are often required: (i) the sample is a random and representative subset of the population, e.g., an i.i.d. (independent and identically distributed) sample from the population, and (ii) the population is assumed to follow a parametric distribution, such as a multivariate normal distribution. In this chapter, we focus on the most important distribution for multivariate continuous data, i.e., the multivariate normal distribution. A multivariate normal distribution, denoted by $N(\mu, \Sigma)$, is completely determined by its mean vector $\mu=\left(\mu_{1}, \mu_{2}, \cdots, \mu_{p}\right)^{\mathrm{T}}$ and its covariance matrix $\Sigma=\left(\sigma_{i j}\right)_{p \times p}$.

Inference for a multivariate normal distribution is the main focus of many classic multivariate analysis books. Many nice features of the multivariate normal distribution allow us to make theoretical arguments and derive many elegant results. In practice, a multivariate normal distributional assumption may be reasonable for continuous data in many cases (perhaps after some transformations of the original data), especially when the sample size is large so that central limit theorems can be used to justify the normality assumption for the sample means. In other cases, such as small sample sizes, skewed data, or discrete data, the multivariate normal assumption may be too strong or unreasonable. In real data analysis, when we use a method or model that requires a normality assumption, we should check whether this assumption is indeed reasonable.

In the following sections, we consider inference for both the mean vector and the covariance matrix of a multivariate normal distribution. Since theoretical derivations of these results are available in many classic books and our focus is to use these methods in practice, we skip the theoretical derivations and focus on explaining the methods and their applicability in data analysis. Readers interested in mathematical derivations are referred to many classic multivariate analysis books (e.g., Johnson and Wichern 2007).

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Inference for Multivariate Means

We first review the well-known $t$-test for inference of a univariate mean. In univariate analysis, the $t$-test is widely used to test the mean parameter assumed for a continuous variable. Suppose that $\left\{x_{1}, x_{2}, \cdots, x_{n}\right\}$ is a random sample from a univariate normal population $N\left(\mu, \sigma^{2}\right)$. Consider a two-sided test for the mean parameter $\mu$:
$$
H_{0}: \mu=\mu_{0} \quad \text { versus } \quad H_{1}: \mu \neq \mu_{0}
$$
where $\mu_{0}$ is known. Let the sample mean and the sample standard deviation be $\bar{x}$ and $s$ respectively. The $t$ test statistic is given by
$$
t=\frac{\sqrt{n}\left(\bar{x}-\mu_{0}\right)}{s}
$$
which is in fact a standardized version of the sample mean $\bar{x}$ under $H_{0}$. Under the null hypothesis $H_{0}$, the test statistic $t$ follows a $t$-distribution with $n-1$ degrees of freedom. An alternative and equivalent test statistic is given by
$$
t^{2}=\frac{n\left(\bar{x}-\mu_{0}\right)^{2}}{s^{2}},
$$
which follows an $F(1, n-1)$-distribution under $H_{0}$. These tests are relatively robust against small to moderate departures from the assumed normality of the population, especially when the sample size is large. In fact, when the sample size is large, the sample mean $\bar{x}$ approximately follows a normal distribution for any population distribution with finite variance, by the central limit theorem. In other words, $t$-tests can be used, at least approximately, as long as the sample size is large, even if the population is not normally distributed.
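As a small worked illustration, the two equivalent statistics can be computed directly from their definitions. The sample values and the hypothesised mean `mu0` below are invented for this sketch, and the helper name `one_sample_t` is an assumption; no p-value is computed, since that would require the $t$ or $F$ distribution function from a statistics library.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the t statistic and its degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample standard deviation, divisor n - 1
    t = math.sqrt(n) * (xbar - mu0) / s
    return t, n - 1

# Hypothetical measurements, used only to exercise the formula.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 4.7, 5.5]
t, df = one_sample_t(sample, mu0=5.0)

# Squaring t gives the equivalent statistic referred to F(1, n - 1).
t_squared = t ** 2
```

The sign of `t` shows the direction of the departure from `mu0`, while `t_squared` discards it, which is why the squared form only suits the two-sided test.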

The above univariate $t$-test can be extended to the multivariate case. For multivariate data, a key consideration is to incorporate the correlation or covariance between the variables. The most well-known extension is the so-called Hotelling’s $T^{2}$ test, which is described as follows. Let $\mathbf{x}_{1}, \cdots, \mathbf{x}_{n}$ be a random sample from the multivariate normal population $N_{p}(\boldsymbol{\mu}, \Sigma)$, where $\mathbf{x}_{i}=\left(x_{i 1}, \cdots, x_{i p}\right)^{\mathrm{T}}$ and both $\boldsymbol{\mu}$ and $\Sigma$ are unknown. Let
$$
\overline{\mathrm{x}}=\frac{\sum_{i=1}^{n} \mathbf{x}_{i}}{n}, \quad \boldsymbol{S}=\frac{1}{n-1} \sum_{i=1}^{n}\left(\mathbf{x}_{i}-\overline{\mathrm{x}}\right)\left(\mathbf{x}_{i}-\overline{\mathrm{x}}\right)^{\mathrm{T}}
$$
be the sample mean vector and sample covariance matrix respectively. Suppose that we wish to test the following two-sided multivariate hypotheses
$$
H_{0}: \boldsymbol{\mu}=\boldsymbol{\mu}_{0} \quad \text { versus } \quad H_{1}: \boldsymbol{\mu} \neq \boldsymbol{\mu}_{0},
$$
where $\boldsymbol{\mu}_{0}$ is a known vector. Hotelling’s $T^{2}$ test statistic is given by
$$
T^{2}=n\left(\overline{\mathrm{x}}-\boldsymbol{\mu}_{0}\right)^{\mathrm{T}} \boldsymbol{S}^{-1}\left(\overline{\mathrm{x}}-\boldsymbol{\mu}_{0}\right) .
$$
Under $H_{0}$, we have
$$
T^{* 2}=\frac{n-p}{p(n-1)} T^{2} \sim F(p, n-p) .
$$
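The computation of $T^{2}$ and its $F$-scaled version can be sketched in plain Python as follows. This is an illustrative sketch only: the data and the hypothesised mean are made up, the helper names are assumptions, and the linear system $S z=\overline{\mathrm{x}}-\boldsymbol{\mu}_{0}$ is solved by Gaussian elimination rather than by forming $S^{-1}$ explicitly. The resulting value would still need to be compared with an $F(p, n-p)$ quantile from statistical software.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sample_covariance(rows):
    """Sample covariance matrix S with divisor n - 1."""
    n, p = len(rows), len(rows[0])
    m = mean_vector(rows)
    S = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - m[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                S[a][b] += d[a] * d[b] / (n - 1)
    return S

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))) / M[i][i]
    return z

def hotelling_t2(rows, mu0):
    """Return T^2, the F-scaled statistic, and its degrees of freedom (p, n - p)."""
    n, p = len(rows), len(mu0)
    d = [m - m0 for m, m0 in zip(mean_vector(rows), mu0)]
    z = solve(sample_covariance(rows), d)          # z = S^{-1} (xbar - mu0)
    t2 = n * sum(di * zi for di, zi in zip(d, z))
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, (p, n - p)

# Hypothetical bivariate sample (n = 6, p = 2) and hypothesised mean.
rows = [(3.7, 48.5), (5.7, 65.1), (3.8, 47.2), (3.2, 53.2), (3.1, 55.5), (4.6, 36.1)]
mu0 = [4.0, 50.0]
t2, f_stat, df = hotelling_t2(rows, mu0)
```

For $p=1$ the statistic reduces to the squared univariate $t$ statistic, which is a convenient sanity check for any implementation.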

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Inference for Covariance Matrices

For a multivariate normal population $N_{p}(\boldsymbol{\mu}, \Sigma)$, inference for the covariance matrix $\Sigma$ can also be performed. However, the computation associated with the test can be tedious since closed-form null distributions of test statistics are often unavailable, so computer software is needed for computation.

In practice, it is common to test the equality of two covariance matrices. For example, when testing two multivariate mean vectors, it is assumed that the two unknown covariance matrices are equal (as noted in the previous section). This assumption can be tested. Specifically, we can test the equality of two population covariance matrices
$$
H_{0}: \Sigma_{1}=\Sigma_{2} \quad \text { versus } \quad H_{1}: \Sigma_{1} \neq \Sigma_{2} \text {. }
$$
Let $\left\{\mathbf{x}_{1}, \cdots, \mathbf{x}_{n_{1}}\right\}$ be a random sample from population $N_{p}\left(\boldsymbol{\mu}_{1}, \Sigma_{1}\right)$, and let $\left\{\mathbf{y}_{1}, \cdots, \mathbf{y}_{n_{2}}\right\}$ be an independent random sample from population $N_{p}\left(\boldsymbol{\mu}_{2}, \Sigma_{2}\right)$. The test statistic is given by
$$
\Lambda=\frac{\left|\hat{\Sigma}_{1}\right|^{\left(n_{1}-1\right) / 2}\left|\hat{\Sigma}_{2}\right|^{\left(n_{2}-1\right) / 2}}{|\hat{\Sigma}|^{(n-2) / 2}},
$$
where $\hat{\Sigma}_{1}$ and $\hat{\Sigma}_{2}$ are the sample covariance matrices estimating $\Sigma_{1}$ and $\Sigma_{2}$ respectively,
$$
\hat{\Sigma}=\left(\left(n_{1}-1\right) \hat{\Sigma}_{1}+\left(n_{2}-1\right) \hat{\Sigma}_{2}\right) /(n-2)
$$
is the pooled estimate of the covariance matrix, and $n=n_{1}+n_{2}$ is the total sample size. The null distribution of $\Lambda$ is complicated, but computer software can be used to obtain p-values of the test.
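To make the statistic concrete, the sketch below computes $\log \Lambda$ for two small samples, working on the log scale because the determinants are raised to large powers. The samples and helper names are invented for illustration, and the code computes only the statistic, not a p-value. The quantity $-2 \log \Lambda$ is the one usually reported as Box's M statistic.

```python
import math

def sample_covariance(rows):
    """Sample covariance matrix with divisor n - 1."""
    n, p = len(rows), len(rows[0])
    m = [sum(col) / n for col in zip(*rows)]
    S = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - m[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                S[a][b] += d[a] * d[b] / (n - 1)
    return S

def determinant(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in A]
    n = len(M)
    det = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            return 0.0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            det = -det                    # a row swap flips the sign
        det *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return det

def log_lambda(sample1, sample2):
    """log of the statistic Lambda built from the two sample covariance matrices."""
    n1, n2 = len(sample1), len(sample2)
    n = n1 + n2
    S1, S2 = sample_covariance(sample1), sample_covariance(sample2)
    p = len(S1)
    pooled = [[((n1 - 1) * S1[a][b] + (n2 - 1) * S2[a][b]) / (n - 2)
               for b in range(p)] for a in range(p)]
    return ((n1 - 1) / 2 * math.log(determinant(S1))
            + (n2 - 1) / 2 * math.log(determinant(S2))
            - (n - 2) / 2 * math.log(determinant(pooled)))

# Hypothetical samples from two populations (p = 2).
sample1 = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.5), (4.0, 3.0), (5.0, 5.5)]
sample2 = [(1.0, 5.0), (2.0, 1.0), (3.0, 6.0), (4.0, 0.5), (5.0, 4.0), (6.0, 2.0)]
log_lam = log_lambda(sample1, sample2)
box_m = -2.0 * log_lam
```

Values of `log_lam` close to zero indicate similar sample covariance matrices, and increasingly negative values indicate stronger evidence against $H_{0}$.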



统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Discriminant Analysis and Cluster Analysis

Posted on June 6, 2022 by statistics-lab


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Discriminant Analysis

In discriminant analysis, we wish to decide which population an observation in a sample comes from, if the possible populations are known in advance. For example, suppose that we know in advance that some customers are “good” and some customers are “bad” (so the two possible populations, good customers and bad customers, are known in advance). Given a randomly selected customer, we wish to determine if he/she is a good customer or not based on his/her records (data) such as credit history $\left(x_{1}\right)$, education $\left(x_{2}\right)$, and income $\left(x_{3}\right)$. Such a discriminant analysis may be very useful for (say) credit card applicants so that good applicants will receive credit cards while bad applicants do not receive credit cards. For the convenience of statistical analysis, sometimes we may assume that the sample $\left(x_{1}, x_{2}, x_{3}\right)$ (or a transformation of it, e.g., $\left(\log \left(x_{1}\right), \log \left(x_{2}\right), \log \left(x_{3}\right)\right)$) follows a multivariate normal distribution, but some discriminant analysis methods do not require this assumption.
As another example, suppose that a company reviews job applicants based on their academic records $\left(x_{1}\right)$, education $\left(x_{2}\right)$, working experience $\left(x_{3}\right)$, self confidence $\left(x_{4}\right)$, and motivation $\left(x_{5}\right)$. All job applicants can be classified as either “suitable” or “not suitable” based on the given information. So the company may perform a discriminant analysis to separate suitable applicants from unsuitable applicants based on the data all applicants provide.

More generally, suppose that there are two multivariate normally distributed populations, denoted by
population $\pi_{1}: N_{p}\left(\boldsymbol{\mu}_{1}, \Sigma_{1}\right), \quad$ population $\pi_{2}: N_{p}\left(\boldsymbol{\mu}_{2}, \Sigma_{2}\right)$.
Given an observation $\mathbf{x}=\left(x_{1}, x_{2}, \cdots, x_{p}\right)^{\mathrm{T}}$ from a sample, we want to find out whether $\mathbf{x}$ is from population $\pi_{1}$ or population $\pi_{2}$. This is a discriminant analysis. Note that, if $p=1$ (i.e., if there is only one variable of interest), then it is very easy to separate the observations. All we need to do is to choose a threshold value, say $K$, and separate the observations according to whether $x \leqslant K$ or $x>K$. For example, if we just wish to separate students based on their grades, it is easy to see which students are good and which are not, such as the ones with grades over $80 \%$ and the ones with grades of $80 \%$ or less (so $K=80$). However, when $p \geqslant 2$, it is less straightforward to separate the observations. For example, if we wish to separate students based on their grades and their music skills, then it may be hard to do the separation since a student may have very good grades but poor music skills. In this case, we need more advanced statistical methods to do the separation.

There are many methods available for discriminant analysis. For example, the following two methods are simple and useful ones:

Likelihood method: we choose population $\pi_{1}$ if the likelihood for $\pi_{1}$ is larger than the likelihood for $\pi_{2}$, and population $\pi_{2}$ otherwise. This method requires a distributional assumption, such as multivariate normality.

Mahalanobis distance method: we consider the Mahalanobis distance between an observation $\mathbf{x}$ and each population mean $\boldsymbol{\mu}_{i}$:
$$
d_{i}=\sqrt{\left(\mathbf{x}-\boldsymbol{\mu}_{i}\right)^{\mathrm{T}} \Sigma^{-1}\left(\mathbf{x}-\boldsymbol{\mu}_{i}\right)}, \quad i=1,2,
$$
assuming $\Sigma_{1}=\Sigma_{2}=\Sigma$. We then choose population $\pi_{1}$ if $d_{1}<d_{2}$, and population $\pi_{2}$ otherwise. This method does not require a distributional assumption.
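The Mahalanobis distance rule can be illustrated with a short Python sketch for $p=2$, where the inverse of the common covariance matrix can be written explicitly. The means, the covariance matrix, and the observation below are hypothetical values chosen so that the distances are easy to verify by hand; in practice they would be estimated from training data.

```python
import math

def inverse_2x2(S):
    """Explicit inverse of a 2 x 2 matrix (assumes a non-zero determinant)."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mahalanobis(x, mu, S_inv):
    """Mahalanobis distance between x and mu for a given inverse covariance."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    p = len(diff)
    quad = sum(diff[i] * S_inv[i][j] * diff[j] for i in range(p) for j in range(p))
    return math.sqrt(quad)

def classify(x, mu1, mu2, S):
    """Assign x to population 1 if d1 < d2, otherwise to population 2."""
    S_inv = inverse_2x2(S)
    d1 = mahalanobis(x, mu1, S_inv)
    d2 = mahalanobis(x, mu2, S_inv)
    return (1 if d1 < d2 else 2), d1, d2

# Hypothetical population means and common covariance matrix.
mu1, mu2 = [0.0, 0.0], [3.0, 3.0]
Sigma = [[2.0, 1.0], [1.0, 2.0]]

label, d1, d2 = classify([1.0, 1.0], mu1, mu2, Sigma)
```

Here $d_{1}^{2}=2/3$ and $d_{2}^{2}=8/3$, so the observation $(1,1)$ is assigned to population $\pi_{1}$.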

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Discriminant analysis for categorical data

The discriminant analysis methods discussed so far assume that all the variables are continuous. When some variables are categorical or discrete, the above methods cannot be used directly, since means and covariance matrices are no longer meaningful for categorical variables. In that case, a simple approach is to use logistic regression models for discriminant analysis, as illustrated below. Note that a logistic regression model is a generalized linear model, which will be described in detail in Chapter 9.

As an example, consider the case of two populations. Let $y=1$ if an observation $\mathbf{x}=\left(x_{1}, x_{2}, \cdots, x_{p}\right)^{\mathrm{T}}$ is from population 1 and $y=0$ if the observation is from population 2. Then, we can consider the following logistic regression model
$$
\log \frac{P(y=1)}{1-P(y=1)}=\beta_{0}+\beta_{1} x_{1}+\cdots+\beta_{p} x_{p},
$$
where some of the $x_{j}$’s may be categorical and some may be continuous. The above logistic regression model can be fitted to the data, and the probability $P(y=1)$ can then be estimated from the fitted model. For observation $\mathbf{x}_{i}$, if the estimated probability $\widehat{P}\left(y_{i}=1\right)>0.5$, observation $\mathbf{x}_{i}$ is more likely to be from population 1; otherwise it is more likely to be from population 2. This method can be extended to more than two populations.
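The classification rule based on a fitted logistic regression can be sketched as follows. This is a minimal illustration, not a production fitting routine: the coefficients are estimated by plain gradient ascent on the log-likelihood (statistical software would normally use iteratively reweighted least squares), and the data, with one continuous predictor and one 0/1 categorical predictor, are made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, steps=5000):
    """Estimate (beta_0, beta_1, ..., beta_p) by gradient ascent on the log-likelihood."""
    p = len(X[0])
    beta = [0.0] * (p + 1)                      # beta[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            prob = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            resid = yi - prob
            grad[0] += resid
            for j, v in enumerate(xi, start=1):
                grad[j] += resid * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

def predict_prob(beta, x):
    """Estimated P(y = 1) for a new observation x."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

# Hypothetical data: x1 continuous, x2 a 0/1 indicator; y = 1 means population 1.
X = [(0.5, 0), (1.0, 0), (1.5, 1), (2.0, 0), (2.5, 1),
     (3.0, 0), (3.5, 1), (4.0, 1), (4.5, 0), (5.0, 1)]
y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

beta = fit_logistic(X, y)
# Assign an observation to population 1 when the estimated probability exceeds 0.5.
assigned = [1 if predict_prob(beta, xi) > 0.5 else 2 for xi in X]
```

The 0.5 cut-off treats both kinds of misclassification as equally costly; a different threshold could be used when the costs differ.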

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Cluster Analysis

The goal of a cluster analysis is to identify homogeneous groups in the data by grouping observations based on their similarities or dissimilarities, i.e., similar observations are assigned to the same group. In other words, the idea is to partition all observations into subgroups or clusters (or populations), so that observations in the same cluster have similar characteristics.

For example, a marketing professional may want to partition all consumers into subgroups or clusters so that consumers in the same subgroup or cluster have similar buying habits. The partition can be based on age $\left(x_{1}\right)$, education $\left(x_{2}\right)$, income $\left(x_{3}\right)$, and monthly payments $\left(x_{4}\right)$. This is an example of cluster analysis based on four variables. The marketing professional may then design special advertisement strategies for different groups/clusters of consumers. Note that here the number of subgroups or clusters is not known before the cluster analysis.

Cluster analysis is similar to discriminant analysis in the sense that both methods try to separate observations into different groups. However, in discriminant analysis, the number of clusters (or subgroups or populations) is known in advance, and the objective is to determine which cluster (or population) an observation is likely to come from. In cluster analysis, on the other hand, the number of clusters (or subgroups or populations) is not known in advance, and the objective is to find distinct clusters and determine which cluster an observation is likely to come from. In other words, in cluster analysis one needs to find out how many clusters there may be and determine the cluster membership of each observation. Therefore, statistical methods for discriminant analysis may not be directly usable in cluster analysis.

In cluster analysis, our objective is to devise a classification scheme, i.e., we need to find a rule to measure the similarity or dissimilarity between any two observations so that similar observations are grouped together to form clusters. Specifically, let $\mathbf{x}=\left(x_{1}, x_{2}, \cdots, x_{p}\right)^{\mathrm{T}}$ and $\mathbf{x}^{*}=\left(x_{1}^{*}, x_{2}^{*}, \cdots, x_{p}^{*}\right)^{\mathrm{T}}$ be two observations. The similarity between them can be measured by the “distance” between them, so that the two observations are similar if the distance between them is small. For example, the Euclidean distance between $\mathbf{x}$ and $\mathbf{x}^{*}$ is defined as
$$
d_{0}\left(\mathbf{x}, \mathbf{x}^{*}\right)=\sqrt{\left(\mathbf{x}-\mathbf{x}^{*}\right)^{\mathrm{T}}\left(\mathbf{x}-\mathbf{x}^{*}\right)}=\sqrt{\sum_{j=1}^{p}\left(x_{j}-x_{j}^{*}\right)^{2}} .
$$
However, the Euclidean distance does not take into account the variations and the correlations of the component variables. For multivariate data, each individual component variable has its own variance and the component variables may be correlated. Therefore, a better measure of the “distance” between two multivariate observations $\mathbf{x}$ and $\mathbf{x}^{*}$ is the Mahalanobis distance:
$$
d\left(\mathbf{x}, \mathbf{x}^{*}\right)=\sqrt{\left(\mathbf{x}-\mathbf{x}^{*}\right)^{\mathrm{T}} \Sigma^{-1}\left(\mathbf{x}-\mathbf{x}^{*}\right)}
$$
where $\Sigma=\operatorname{Cov}(\mathbf{x})=\operatorname{Cov}\left(\mathbf{x}^{*}\right)$ is the covariance matrix, assuming the two observations have the same covariance matrix. Thus, if the distance $d\left(\mathbf{x}, \mathbf{x}^{*}\right)$ is small, we can consider that $\mathbf{x}$ and $\mathbf{x}^{*}$ are “close” and put them in the same cluster/group. Otherwise, we can put them in different clusters/groups. For a sample of $n$ observations, $\left\{\mathbf{x}_{1}, \cdots, \mathbf{x}_{n}\right\}$, each observation can be assigned to one of the clusters, based on a clustering method.
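The difference between the two distances is easiest to see on a small example with strongly correlated variables. The following Python sketch (for $p=2$, with a hypothetical covariance matrix and hypothetical points) compares a point displaced along the direction of correlation with one displaced against it.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mahalanobis(x, y, S):
    """Mahalanobis distance for p = 2, using the explicit 2 x 2 inverse of S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    u, v = x[0] - y[0], x[1] - y[1]
    quad = (d * u * u - (b + c) * u * v + a * v * v) / det
    return math.sqrt(quad)

# Hypothetical covariance matrix: unit variances, correlation 0.9.
Sigma = [[1.0, 0.9], [0.9, 1.0]]
origin = (0.0, 0.0)
along = (1.0, 1.0)       # displaced in the direction of the correlation
against = (1.0, -1.0)    # displaced against the correlation

e_along, e_against = euclidean(origin, along), euclidean(origin, against)
m_along, m_against = mahalanobis(origin, along, Sigma), mahalanobis(origin, against, Sigma)
```

Both points are at Euclidean distance $\sqrt{2}$ from the origin, but the Mahalanobis distance is about 1.03 for the first and $\sqrt{20} \approx 4.47$ for the second, reflecting that a displacement against the correlation is far less typical under this covariance structure.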

There are many cluster analysis methods available. In the following, we briefly discuss a few commonly used methods: nearest neighbour method, $k$-means algorithm, and two hierarchical cluster methods. Each method has its advantages and disadvantages, and different methods may lead to different results. In practice, it is always desirable to try at least two methods to analyze a dataset to see if the results agree or how much the results differ (which may provide some insights about the data).



统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Factor Analysis

Posted on June 6, 2022 by statistics-lab


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Methods for Estimation

In this section, we describe several methods to estimate the parameters in a FA model. In FA model (3.2), the loading matrix $\Lambda$, the factors $\boldsymbol{f}$ and the error $\eta$ are all unobservable. Since $\boldsymbol{f}$ and $\eta$ are assumed to be independent, based on FA model (3.2) and its assumptions, we can obtain the following factor analysis (FA) equation
$$
\Sigma=\Lambda \Lambda^{\mathrm{T}}+\Psi .
$$
The above FA equation leads to the following partition of variance for variable $x_{j}$ :
$$
\sigma_{j j}=\sum_{k=1}^{m} \lambda_{j k}^{2}+\psi_{j}, \quad j=1, \cdots, p
$$

i.e., the total variance (variation) of the original variable $x_{j}$ can be partitioned into the contributions from the common factors and the random variation. The communality of variable $x_{j}$, defined as
$$
c_{j}=\sum_{k=1}^{m} \lambda_{j k}^{2} / \sigma_{j j}, \quad j=1, \cdots, p
$$
is the proportion of the variance of $x_{j}$ that is explained by the common factors $\left(f_{1}, \cdots, f_{m}\right)$. Therefore, the communality $c_{j}$ indicates the importance of the common factors to variable $x_{j}, j=1,2, \cdots, p$. In other words, the communality $c_{j}$ indicates how much variation in the original variable $x_{j}$ can be explained by the common factors $\left(f_{1}, \cdots, f_{m}\right)$. The variance $\psi_{j}$ of the random error is called the uniqueness or specificity.
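As a small numerical illustration of the variance partition, the sketch below computes communalities and uniquenesses from a loading matrix. The loadings are hypothetical ($p=3$ variables, $m=2$ factors) and the variables are assumed to be standardized so that $\sigma_{j j}=1$; the function names are likewise assumptions made for this example.

```python
def communalities(loadings, variances):
    """c_j = (sum over k of lambda_jk^2) / sigma_jj for each variable j."""
    return [sum(l * l for l in row) / var for row, var in zip(loadings, variances)]

def uniquenesses(loadings, variances):
    """psi_j = sigma_jj - sum over k of lambda_jk^2 for each variable j."""
    return [var - sum(l * l for l in row) for row, var in zip(loadings, variances)]

# Hypothetical loading matrix: one row per variable, one column per factor.
loadings = [[0.8, 0.1],
            [0.7, 0.3],
            [0.2, 0.9]]
variances = [1.0, 1.0, 1.0]     # standardized variables (assumption)

c = communalities(loadings, variances)   # approximately [0.65, 0.58, 0.85]
psi = uniquenesses(loadings, variances)  # approximately [0.35, 0.42, 0.15]
```

In this example the third variable is the one best explained by the two common factors, since 85% of its variance is attributed to them.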

The FA equations can be solved using different methods, such as the maximum likelihood method and the principal factor method. For the maximum likelihood method, we assume that the error $\eta$ follows the multivariate normal distribution $N(0, \Psi)$, and then we maximize the likelihood to obtain the maximum likelihood estimates of $\Lambda$ and $\Psi$. The principal factor method uses a different approach and does not require the normality assumption for $\eta$. We omit the technical details here; interested readers can find them in Johnson and Wichern (2007). Each method has its own advantages and limitations. Thus, in data analysis, a good strategy is to use different methods to analyze the same dataset and then compare the results. If all the results are similar, the conclusions may be reliable. If different methods lead to different conclusions, we should investigate further and try to understand why the results differ.

Note that the factor loading matrix $\Lambda$ is not unique, i.e., different loading matrices may satisfy the same FA equations. In fact, for any orthogonal matrix $Q$, the new matrix $\Lambda^{*}$ obtained by the orthogonal transformation
$$
\Lambda^{*}=\Lambda Q
$$
is also a loading matrix, since $\Lambda^{*} \Lambda^{* \mathrm{T}}=\Lambda Q Q^{\mathrm{T}} \Lambda^{\mathrm{T}}=\Lambda \Lambda^{\mathrm{T}}$. The matrix $\Lambda^{*}$ is called a rotation of the loading matrix $\Lambda$, since $\Lambda^{*}$ is an orthogonal transformation of $\Lambda$.
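The non-uniqueness of the loadings can be checked numerically. The sketch below rotates a hypothetical loading matrix by an arbitrary angle and confirms that the product of the loading matrix with its transpose is unchanged; the loadings and the 30-degree angle are illustrative choices only.

```python
import math

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

# An orthogonal 2 x 2 rotation matrix Q (rotation by 30 degrees).
theta = math.pi / 6
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

# Hypothetical loading matrix for p = 3 variables and m = 2 factors.
loadings = [[0.8, 0.1], [0.7, 0.3], [0.2, 0.9]]
rotated = matmul(loadings, Q)

# Both loading matrices give the same common-factor part of the covariance matrix.
common = matmul(loadings, transpose(loadings))
common_rotated = matmul(rotated, transpose(rotated))
max_gap = max(abs(a - b)
              for row_a, row_b in zip(common, common_rotated)
              for a, b in zip(row_a, row_b))
```

Because the fit to $\Sigma$ is unaffected, the rotation can be chosen purely to make the loadings easier to interpret.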

统计代写|多元统计分析代写Multivariate Statistical Analysis代考|

In practice, we often wish to separate individuals into different groups based on observed multivariate data. For example, a bank may wish to separate good customers from bad customers based on their credit history, education, and income, or a teacher may wish to separate good students from bad students based on their grades, motivation, attitude, and other performances. In other words, given observed multivariate data, we wish to decide which group (or population) an individual in the sample belongs to. This type of analysis is called discriminant analysis or cluster analysis. Such analysis would be easy if the separation were based on only one variable, such as students’ grades or customers’ income. However, when we have multivariate data, the analysis may not be easy. For example, a student may have an average grade but an excellent attitude, or a customer may have a high income but a poor credit history and little education. In such cases it is not clear which group the student or customer belongs to.

Discriminant analysis and cluster analysis are important in many areas. The difference between discriminant analysis and cluster analysis depends on whether the groups (or populations) are known in advance or not. If the groups are known in advance, e.g., we already know that there are good and bad students in a class or there are good customers and bad customers in a bank and we just want to separate them, then the analysis is called discriminant analysis. If the groups are not known in advance, e.g., we don’t know whether there are good students or bad students in a class and we wish to determine if all students in a class can be separated into two groups (sometimes maybe all students in a class are good students), then the analysis is called cluster analysis.

In the following section, we first introduce the basic idea of discriminant analysis using simple examples, and then we describe cluster analysis and discuss the differences and similarities between these two analyses. We first consider continuous data, and then we discuss how to treat discrete or categorical data.

In this section, we present several examples to show how to use R to do factor analysis.

Example 1. The weekly rates of return of the following five stocks were recorded (Johnson and Wichern, 2007): Applied Chemical $\left(x_{1}\right)$, du Pont $\left(x_{2}\right)$, Union Carbide $\left(x_{3}\right)$, Exxon $\left(x_{4}\right)$, and Texaco $\left(x_{5}\right)$. Observations were obtained over 100 successive weeks. Based on a PCA, the first two PCs explain about $73 \%$ of the variation. The first two PCs are given by

$$
\begin{aligned}
&y_{1}=0.46 x_{1}+0.46 x_{2}+0.47 x_{3}+0.42 x_{4}+0.42 x_{5} \\
&y_{2}=0.24 x_{1}+0.51 x_{2}+0.26 x_{3}-0.53 x_{4}-0.58 x_{5}
\end{aligned}
$$
We see that the first PC $y_{1}$ is a roughly equally weighted sum of the individual stocks, so it may be interpreted as a market component. The second PC $y_{2}$ is a contrast between the chemical stocks (the first three stocks $x_{1}, x_{2}, x_{3}$, which have positive coefficients) and the oil stocks (the last two stocks $x_{4}, x_{5}$, which have negative coefficients), so it may be interpreted as an industry component. Thus, most of the variation in these five stock returns is due to market activity and uncorrelated industry activity.
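As a quick numerical check of the two PCs above, the following sketch (plain Python, used here only for illustration) verifies that the reported coefficient vectors are approximately unit length and approximately orthogonal, as PC coefficient vectors should be up to rounding. It then computes the two PC scores for one week of returns; that return vector is hypothetical and is not taken from the Johnson and Wichern data.

```python
import math

# PC coefficient vectors as reported in the text (order: x1, ..., x5).
w1 = [0.46, 0.46, 0.47, 0.42, 0.42]
w2 = [0.24, 0.51, 0.26, -0.53, -0.58]

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Coefficient vectors of PCs are orthonormal; rounding to two decimals
# means the checks below hold only approximately.
norm1 = math.sqrt(dot(w1, w1))
norm2 = math.sqrt(dot(w2, w2))
inner = dot(w1, w2)

# Hypothetical centered weekly returns for a single week (assumption).
x = [0.010, 0.020, 0.015, -0.005, -0.010]
y1 = dot(w1, x)  # market component score
y2 = dot(w2, x)  # industry (chemical versus oil) contrast score

print(round(norm1, 3), round(norm2, 3), round(inner, 3))
print(round(y1, 5), round(y2, 5))
```

For this hypothetical week the chemical stocks rise while the oil stocks fall, so the contrast score $y_{2}$ is larger than the market score $y_{1}$.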

We can also do a factor analysis, which may allow us to obtain better interpretations. Based on the above PCA, we consider two factors. Two sets of FA results are obtained: one without rotation of the loadings and one with rotation.

We see that the rotated factors $f_{1}^{*}$ and $f_{2}^{*}$ are easier to interpret: the chemical companies Applied Chemical, du Pont, and Union Carbide contribute most to the first factor $f_{1}^{*}$ (the corresponding loadings are high), while the oil companies Exxon and Texaco contribute most to the second factor $f_{2}^{*}$ (the corresponding loadings are high). Thus, we can interpret the two factors as follows: factor $f_{1}^{*}$ represents chemical stocks and factor $f_{2}^{*}$ represents oil stocks. Such a meaningful and practical interpretation is unavailable in PCA, and it is an advantage of factor analysis.

Example 2. We return to the job applicant data described in Chapter 2. Here we do a factor analysis on this dataset for comparison. In the PCA, it was shown that the first two PCs explain most of the variation in the data. Thus, in the following we consider a factor analysis using two factors $(m=2)$. We first repeat the PCA, and then do a factor analysis.


统计代写|多元统计分析代写Multivariate Statistical Analysis代考|Methods for Estimation

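In this section, we describe several methods for estimating the parameters in the FA model. In the FA model (3.2), the loading matrix $\Lambda$, the factors $F$, and the errors $\epsilon$ are all unobservable. Since $F$ and $\epsilon$ are assumed to be independent, the FA model (3.2) and its assumptions give the following factor analysis (FA) equation
$$
\Sigma=\Lambda \Lambda^{\top}+\Psi .
$$
The FA equation leads to the following partition of the variance of variable $X_{j}$:
$$
\sigma_{j j}=\sum_{k=1}^{m} \lambda_{j k}^{2}+\psi_{j}, \quad j=1, \cdots, p,
$$
i.e., the total variance of the original variable $X_{j}$ can be split into the contribution of the common factors and the contribution of random variation. The communality of variable $X_{j}$, defined as
$$
C_{j}=\sum_{k=1}^{m} \lambda_{j k}^{2} / \sigma_{j j}, \quad j=1, \cdots, p,
$$
is the proportion of the variance of $X_{j}$ that is explained by the common factors $\left(F_{1}, \cdots, F_{m}\right)$. Thus, the communality $C_{j}$ indicates the importance of the common factors to variable $X_{j}$, $j=1,2, \cdots, p$. In other words, the communality $C_{j}$ indicates how much of the variation in the original variable $X_{j}$ can be explained by the common factors $\left(F_{1}, \cdots, F_{m}\right)$. The variance of the random error, $\psi_{j}$, is called the uniqueness or specificity.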
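The variance partition and the communality can be computed directly from a loading matrix and the error variances. The sketch below uses plain Python; the loadings $\lambda_{jk}$ and uniquenesses $\psi_{j}$ are hypothetical values chosen so that each $\sigma_{jj}$ equals 1, and are not estimates from any dataset in the text.

```python
# Hypothetical loading matrix Lambda: p = 3 variables (rows), m = 2 factors.
loadings = [
    [0.8, 0.3],
    [0.7, 0.4],
    [0.2, 0.9],
]
# Hypothetical uniquenesses psi_j, one per variable.
uniqueness = [0.27, 0.35, 0.15]

variances = []      # sigma_jj = sum_k lambda_jk^2 + psi_j
communalities = []  # C_j = sum_k lambda_jk^2 / sigma_jj

for row, psi in zip(loadings, uniqueness):
    common = sum(lam ** 2 for lam in row)  # part explained by common factors
    sigma_jj = common + psi                # total variance of X_j
    variances.append(sigma_jj)
    communalities.append(common / sigma_jj)

for j, (s, c) in enumerate(zip(variances, communalities), start=1):
    print(f"X{j}: variance = {s:.2f}, communality = {c:.2f}")
```

A communality close to 1 means the variable is almost fully explained by the common factors, while a value close to 0 means most of its variation is specific to that variable.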

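Different methods can be used to solve the FA equation, such as the maximum likelihood method and the principal factor method. For the maximum likelihood method, we assume that the errors $\epsilon$ follow a multivariate normal distribution $N(0, \Psi)$ and then maximize the likelihood to obtain the maximum likelihood estimates of $\Lambda$ and $\Psi$. The principal factor method takes a different approach and does not require the normality assumption on $\epsilon$. We omit the technical details here; interested readers can find them in Johnson and Wichern (2007). Although several methods are available for solving the FA equation, each has its own advantages and limitations. Therefore, in data analysis, a good strategy is to analyze the same dataset with different methods and then compare the results. If all the results are similar, the conclusions are likely to be reliable. If different methods lead to different conclusions, we should investigate further and try to gain some insight into why the results differ.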

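Note that the factor loading matrix $\Lambda$ is not unique, i.e., different loading matrices may satisfy the same FA equation. In fact, for any orthogonal matrix $Q$, the new matrix $\Lambda^{*}$ obtained by the orthogonal transformation
$$
\Lambda^{*}=\Lambda Q
$$
is also a loading matrix, since $\Lambda^{*} \Lambda^{* \top}=\Lambda Q Q^{\top} \Lambda^{\top}=\Lambda \Lambda^{\top}$. The matrix $\Lambda^{*}$ is called a rotation of the loading matrix $\Lambda$, since $\Lambda^{*}$ is an orthogonal transformation of $\Lambda$.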
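The non-uniqueness of the loading matrix can be verified numerically. The sketch below, in plain Python, rotates a hypothetical $3 \times 2$ loading matrix by an arbitrarily chosen angle of 30 degrees and checks that $\Lambda \Lambda^{\top}$ is unchanged; both the loadings and the angle are assumptions made for illustration.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical loading matrix Lambda (p = 3 variables, m = 2 factors).
loadings = [[0.8, 0.3], [0.7, 0.4], [0.2, 0.9]]

# An orthogonal matrix Q: a plane rotation by an arbitrary angle.
theta = math.radians(30)
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

rotated = matmul(loadings, Q)  # Lambda* = Lambda Q

common = matmul(loadings, transpose(loadings))    # Lambda Lambda^T
common_rot = matmul(rotated, transpose(rotated))  # Lambda* Lambda*^T

# Largest entry-wise difference between the two products.
max_diff = max(abs(common[i][j] - common_rot[i][j])
               for i in range(3) for j in range(3))
print(max_diff < 1e-12)
```

The individual loadings change under the rotation while the implied covariance structure does not, which is why a rotation can be chosen purely to make the factors easier to interpret.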

