## Accuracy Bounds

Finally, (1.19) can now be used to construct the accuracy bounds for the $h$th disjunct region. The variance of the residuals can be calculated from the Frobenius norm of the residual matrix $\mathbf{E}_{h}$. The PCA decomposition of the data matrix $\mathbf{Z}_{h}$, which stores the $\widetilde{K}$ observations of the $h$th disjunct region, splits it into the product of the associated score and loading matrices, $\mathbf{T}_{h} \mathbf{P}_{h}^{T}$, and the residual matrix $\mathbf{E}_{h}=\mathbf{T}_{h}^{*} \mathbf{P}_{h}^{*T}$:
$$\mathbf{Z}_{h}=\mathbf{T}_{h} \mathbf{P}_{h}^{T}+\mathbf{E}_{h}=\mathbf{T}_{h} \mathbf{P}_{h}^{T}+\mathbf{T}_{h}^{*} \mathbf{P}_{h}^{*T}. \tag{1.20}$$
With $\rho_{i_{h}}$ denoting the residual variance of the $i$th original variable, the sum of the residual variances, $\rho_{h}=\sum_{i=1}^{N} \rho_{i_{h}}$, can be determined as follows:
$$\rho_{h}=\frac{1}{\widetilde{K}-1} \sum_{i=1}^{\widetilde{K}} \sum_{j=1}^{N} e_{i j_{h}}^{2}=\frac{1}{\widetilde{K}-1}\left\|\mathbf{E}_{h}\right\|_{2}^{2}, \tag{1.21}$$
where $\|\cdot\|_{2}^{2}$ denotes the squared Frobenius norm, that is, the sum of the squared matrix elements. This can be simplified to:
$$\rho_{h}=\frac{1}{\widetilde{K}-1}\left\|\mathbf{T}_{h}^{*} \mathbf{P}_{h}^{*T}\right\|_{2}^{2}=\frac{1}{\widetilde{K}-1}\left\|\mathbf{U}_{h}^{*} \boldsymbol{\Lambda}_{h}^{*1/2} \sqrt{\widetilde{K}-1}\, \mathbf{P}_{h}^{*T}\right\|_{2}^{2}. \tag{1.22}$$
Because the columns of $\mathbf{U}_{h}^{*}$ and $\mathbf{P}_{h}^{*}$ are orthonormal, they do not change the norm, so that (1.22) is equal to:
$$\rho_{h}=\frac{\widetilde{K}-1}{\widetilde{K}-1}\left\|\boldsymbol{\Lambda}_{h}^{*1/2}\right\|_{2}^{2}=\sum_{i=n+1}^{N} \lambda_{i}. \tag{1.23}$$
Equations (1.20) and (1.22) use a singular value decomposition of $\mathbf{Z}_{h}$ to reconstruct the discarded components, that is,
$$\mathbf{E}_{h}=\mathbf{U}_{h}^{*}\left[\boldsymbol{\Lambda}_{h}^{*1/2} \sqrt{\widetilde{K}-1}\right] \mathbf{P}_{h}^{*T}=\mathbf{T}_{h}^{*} \mathbf{P}_{h}^{*T}.$$
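
The identity between the scaled residual norm and the sum of the discarded eigenvalues can be checked numerically. The following is a minimal NumPy sketch and not part of the original text: the simulated data, the dimensions `K`, `N` and `n`, and all variable names are illustrative assumptions. It standardizes a data matrix, rebuilds the residual matrix from the discarded singular triplets only, and compares the result with the discarded eigenvalues of the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, n = 250, 4, 2  # assumed: observations, variables, retained components

# Illustrative data: n latent sources mixed into N variables plus small noise.
scores = rng.normal(size=(K, n))
Z = scores @ rng.normal(size=(n, N)) + 0.1 * rng.normal(size=(K, N))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)  # zero mean, unit variance

# SVD of Z: the eigenvalues of the correlation matrix are s**2 / (K - 1).
U, s, Pt = np.linalg.svd(Z, full_matrices=False)

# Residual matrix E = T* P*^T, built from the discarded components only.
E = (U[:, n:] * s[n:]) @ Pt[n:, :]

rho_from_residuals = np.sum(E**2) / (K - 1)          # scaled squared Frobenius norm
rho_from_eigenvalues = np.sum(s[n:] ** 2) / (K - 1)  # sum of discarded eigenvalues

# Cross-check against the correlation matrix (eigvalsh sorts ascending).
eigvals = np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False))
rho_from_correlation = eigvals[: N - n].sum()

print(rho_from_residuals, rho_from_eigenvalues, rho_from_correlation)
```

All three printed values should agree up to floating-point rounding.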
Since
$$\mathbf{R}_{Z Z}^{(h)}=\left[\begin{array}{cc}\mathbf{P}_{h} & \mathbf{P}_{h}^{*}\end{array}\right]\left[\begin{array}{cc}\boldsymbol{\Lambda}_{h} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Lambda}_{h}^{*}\end{array}\right]\left[\begin{array}{c}\mathbf{P}_{h}^{T} \\ \mathbf{P}_{h}^{*T}\end{array}\right],$$
the discarded eigenvalues $\lambda_{n+1}, \lambda_{n+2}, \ldots, \lambda_{N}$ depend on the elements of the correlation matrix $\mathbf{R}_{Z Z}$. According to (1.18) and (1.19), however, these elements are only determined to within confidence limits obtained for a significance level $\alpha$. This, in turn, gives rise to the following optimization problem:
$$\begin{aligned} &\rho_{h_{\max }}=\arg \max _{\Delta \mathbf{R}_{Z Z_{\max }}} \rho_{h}\left(\mathbf{R}_{Z Z}+\Delta \mathbf{R}_{Z Z_{\max }}\right) \\ &\rho_{h_{\min }}=\arg \min _{\Delta \mathbf{R}_{Z Z_{\min }}} \rho_{h}\left(\mathbf{R}_{Z Z}+\Delta \mathbf{R}_{Z Z_{\min }}\right) \end{aligned} \tag{1.24}$$
which is subject to the following constraints:
$$\begin{aligned} &\mathbf{R}_{Z Z_{L}} \leq \mathbf{R}_{Z Z}+\Delta \mathbf{R}_{Z Z_{\max }} \leq \mathbf{R}_{Z Z_{U}} \\ &\mathbf{R}_{Z Z_{L}} \leq \mathbf{R}_{Z Z}+\Delta \mathbf{R}_{Z Z_{\min }} \leq \mathbf{R}_{Z Z_{U}} \end{aligned} \tag{1.25}$$
where $\Delta \mathbf{R}_{Z Z_{\max }}$ and $\Delta \mathbf{R}_{Z Z_{\min }}$ are perturbations of the nondiagonal elements in $\mathbf{R}_{Z Z}$ that result in the determination of a maximum value, $\rho_{h_{\max }}$, and a minimum value, $\rho_{h_{\min }}$, of $\rho_{h}$, respectively.

The maximum and minimum values, $\rho_{h_{\max }}$ and $\rho_{h_{\min }}$, are defined as the accuracy bounds for the $h$th disjunct region. The interpretation of the accuracy bounds is as follows.

Definition 1. Any set of observations taken from the same disjunct operating region cannot produce a residual variance larger than $\rho_{h_{\max }}$ or smaller than $\rho_{h_{\min }}$, determined with a significance of $\alpha$, if the interrelationship between the original variables is linear.

The solution of Equations (1.24) and (1.25) can be computed using a genetic algorithm [63] or the more recently proposed particle swarm optimization [50].
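
To make the constrained search concrete, here is a minimal sketch that swaps the genetic algorithm or particle swarm optimizer for a plain random search, chosen purely for brevity. The function name `accuracy_bounds`, the sample count, and the example limit matrices are assumptions for illustration, not values from the text. Each candidate correlation matrix is drawn between the lower and upper confidence limits, and the smallest and largest sums of discarded eigenvalues are recorded.

```python
import numpy as np

def accuracy_bounds(R_lower, R_upper, n_retained, n_samples=5000, seed=0):
    """Random-search stand-in for the optimizers named in the text.

    R_lower, R_upper: symmetric matrices holding the confidence limits of
    the nondiagonal correlation elements (diagonal entries are ignored).
    Returns (rho_min, rho_max), the extreme sums of discarded eigenvalues.
    """
    rng = np.random.default_rng(seed)
    N = R_lower.shape[0]
    iu = np.triu_indices(N, k=1)  # indices of the upper off-diagonal elements
    rho_min, rho_max = np.inf, -np.inf
    for _ in range(n_samples):
        R = np.eye(N)
        R[iu] = rng.uniform(R_lower[iu], R_upper[iu])
        R = R + R.T - np.eye(N)          # symmetric, unit diagonal
        eigvals = np.linalg.eigvalsh(R)  # ascending order
        if eigvals[0] < 0:               # skip candidates that are not valid
            continue                     # correlation matrices
        rho = eigvals[: N - n_retained].sum()
        rho_min, rho_max = min(rho_min, rho), max(rho_max, rho)
    return rho_min, rho_max

# Hypothetical two-variable case: the correlation lies between 0.90 and 0.98.
R_lower = np.array([[1.0, 0.90], [0.90, 1.0]])
R_upper = np.array([[1.0, 0.98], [0.98, 1.0]])
rho_min, rho_max = accuracy_bounds(R_lower, R_upper, n_retained=1)
print(rho_min, rho_max)
```

For two variables the discarded eigenvalue is one minus the absolute correlation, so the bounds in this example approach 0.02 and 0.10. With more variables the search space grows quickly, which is where a population-based optimizer becomes the more sensible choice.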

## Summary of the Nonlinearity Test

After determining the accuracy bounds for the $h$th disjunct region, as detailed in the previous subsection, a PCA model is obtained for each of the remaining $m-1$ regions. The sum of the $N-n$ discarded eigenvalues of each model is then benchmarked against these limits to examine whether all of them fall inside or at least one residual variance lies outside. The test is complete once accuracy bounds have been computed for each of the disjunct regions and the residual variances of the respective remaining $m-1$ regions have been benchmarked against them. If the residual variance is within the accuracy bounds for every one of these combinations, the process is said to be linear. In contrast, if at least one of the residual variances is outside one of the accuracy bounds, it must be concluded that the variable interrelationships are nonlinear. In the latter case, the uncertainty in the PCA model accuracy is smaller than the variation of the residual variances, implying that a nonlinear PCA model must be employed.

The application of the nonlinearity test involves the following steps.

1. Obtain a sufficiently large set of process data;
2. Determine whether this set can be divided into disjunct regions based on a priori knowledge; if yes, go to step 5, else go to step 3;
3. Carry out a PCA of the recorded data and construct scatter diagrams for the first few principal components to determine whether distinctive operating regions can be identified; if so, go to step 5, else go to step 4;
4. Divide the data into two disjunct regions, carry out steps 6 to 11 by setting $h=1$, and investigate whether nonlinearity within the data can be proven; if not, increase the number of disjunct regions incrementally until either the sum of discarded eigenvalues violates the accuracy bounds or the number of observations in each region is insufficient to continue the analysis;
5. Set $h=1$;
6. Calculate the confidence limits for the nondiagonal elements of the correlation matrix for the $h$th disjunct region (Equations (1.17) and (1.18));
7. Solve Equations (1.24) and (1.25) to compute the accuracy bounds $\rho_{h_{\max }}$ and $\rho_{h_{\min }}$;
8. Obtain correlation/covariance matrices for each disjunct region (scaled with respect to the variance of the observations within the $h$th disjunct region);
9. Carry out a singular value decomposition to determine the sum of discarded eigenvalues for each matrix;
10. Benchmark the sums of eigenvalues against the $h$th set of accuracy bounds to test the hypothesis that the interrelationships between the recorded process variables are linear against the alternative hypothesis that the variable interrelationships are nonlinear;
11. If $h=m$, terminate the nonlinearity test; else go to step 6 by setting $h=h+1$.
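
The loop over the disjunct regions in the final steps of this procedure can be sketched as follows. This is an illustration under several loudly stated assumptions rather than the authors' implementation: Equations (1.17) and (1.18) are not reproduced in this section, so the common Fisher z-transform interval is used as a stand-in for the confidence limits; a random search replaces the genetic algorithm or particle swarm optimizer; each region is centered with its own mean before scaling; and the names `fisher_limits`, `bounds` and `nonlinearity_test` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def fisher_limits(R, K, alpha=0.05):
    """ASSUMED stand-in for Eqs. (1.17)-(1.18): Fisher z confidence limits."""
    c = NormalDist().inv_cdf(1 - alpha / 2) / np.sqrt(K - 3)
    off = ~np.eye(R.shape[0], dtype=bool)  # mask of nondiagonal elements
    z = np.arctanh(np.clip(R[off], -0.999999, 0.999999))
    lower, upper = R.copy(), R.copy()
    lower[off], upper[off] = np.tanh(z - c), np.tanh(z + c)
    return lower, upper

def discarded_sum(S, n):
    """Sum of the N - n smallest eigenvalues of a symmetric matrix."""
    return np.linalg.eigvalsh(S)[: S.shape[0] - n].sum()

def bounds(lower, upper, n, n_samples=2000, seed=0):
    """Random-search approximation of the accuracy bounds."""
    rng = np.random.default_rng(seed)
    N = lower.shape[0]
    iu = np.triu_indices(N, k=1)
    values = []
    for _ in range(n_samples):
        R = np.eye(N)
        R[iu] = rng.uniform(lower[iu], upper[iu])
        R = R + R.T - np.eye(N)
        if np.linalg.eigvalsh(R)[0] >= 0:  # keep valid correlation matrices
            values.append(discarded_sum(R, n))
    return min(values), max(values)

def nonlinearity_test(regions, n, alpha=0.05):
    """regions: list of (K_h x N) arrays, one per disjunct region.

    Returns True if any residual variance leaves any set of accuracy
    bounds (nonlinear), and False otherwise (linear).
    """
    for h, Zh in enumerate(regions):                      # loop over h
        scale = Zh.std(axis=0, ddof=1)
        Rh = np.corrcoef(Zh, rowvar=False)
        lower, upper = fisher_limits(Rh, Zh.shape[0], alpha)
        rho_min, rho_max = bounds(lower, upper, n)
        for k, Zk in enumerate(regions):                  # remaining m - 1 regions
            if k == h:
                continue
            S = np.cov((Zk - Zk.mean(axis=0)) / scale, rowvar=False)
            rho = discarded_sum(S, n)
            if not rho_min <= rho <= rho_max:
                return True
    return False
```

Calling `nonlinearity_test(regions, n=1)` with a list of per-region data arrays returns a single verdict. A fuller version would also report which pair of regions triggered the violation.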

Examples of how to employ the nonlinearity test are given in the next subsection.

## Example Studies

These examples have two variables, $z_{1}$ and $z_{2}$. They describe (a) a linear interrelationship and (b) a nonlinear interrelationship between $z_{1}$ and $z_{2}$. The examples involve the simulation of 1000 observations of a single score variable $t$ that stem from a uniform distribution such that the division of this set into 4 disjunct regions produces 250 observations per region. The mean value of $t$ is equal to zero and the observations of $t$ range from $-4$ to $+4$.

In the linear example, $z_{1}$ and $z_{2}$ are defined by superimposing two independently and identically distributed sequences, $e_{1}$ and $e_{2}$, which follow a normal distribution with zero mean and a variance of $0.005$, onto $t$:
$$z_{1}=t+e_{1},\; e_{1} \sim \mathcal{N}\{0,0.005\} \quad z_{2}=t+e_{2},\; e_{2} \sim \mathcal{N}\{0,0.005\}$$
For the nonlinear example, $z_{1}$ and $z_{2}$ are defined as follows:
$$z_{1}=t+e_{1} \quad z_{2}=t^{3}+e_{2}$$
with $e_{1}$ and $e_{2}$ as described above. Figure 1.2 shows the resulting scatter plots for the linear example (right plot) and the nonlinear example (left plot), including the division into 4 disjunct regions each.
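
The two data sets can be simulated along the following lines. This is a sketch under assumptions that the text does not specify: the random seed, the choice to sort the uniform draws so that equal-sized slices form the disjunct regions, the use of the first region (`h=0`) as the scaling reference, and the helper name `residual_variances` are all illustrative. With one retained component ($n=1$), the code prints the discarded eigenvalue of each region for both examples.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only
K, m = 1000, 4                   # observations and disjunct regions

# Sorting the uniform draws lets equal-sized slices act as disjunct regions.
t = np.sort(rng.uniform(-4.0, 4.0, size=K))
e1 = rng.normal(0.0, np.sqrt(0.005), size=K)  # variance 0.005
e2 = rng.normal(0.0, np.sqrt(0.005), size=K)

Z_linear = np.column_stack([t + e1, t + e2])
Z_nonlinear = np.column_stack([t + e1, t**3 + e2])

def residual_variances(Z, m, n=1, h=0):
    """Sum of discarded eigenvalues per region, scaled by region h's std."""
    regions = np.split(Z, m)  # m equal-sized slices of 250 rows each
    scale = regions[h].std(axis=0, ddof=1)
    out = []
    for Zk in regions:
        # Assumption: each region is centered with its own mean.
        S = np.cov((Zk - Zk.mean(axis=0)) / scale, rowvar=False)
        out.append(np.linalg.eigvalsh(S)[: Z.shape[1] - n].sum())
    return np.array(out)

rho_linear = residual_variances(Z_linear, m)
rho_nonlinear = residual_variances(Z_nonlinear, m)
print("linear:   ", rho_linear)
print("nonlinear:", rho_nonlinear)
```

In the linear case the four values should be of similar size, whereas in the nonlinear case the inner and outer regions should differ markedly, which is the effect the nonlinearity test is designed to detect.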
