### WHICH VARIABLES SHOULD BE INCLUDED IN THE MODEL


## INTRODUCTION

When a model can be formed by including some, or all, of the predictor variables, there is a problem in deciding how many variables to include. The decision we arrive at will depend to some extent on the purpose we have in mind. If we merely wish to explain the variation of the dependent variable in the sample, then it would seem obvious that as many predictor variables as possible should be included. This can be seen with the lactation curve of Example 2.11. If enough powers of $w$ were added to the model, the curve would pass through every observed value, but it would be so jagged and complicated that it would be difficult to understand what was happening. On the other hand, a small model has the advantage that the relationships between the variables are easy to understand. Furthermore, a small model will usually yield estimators which are less influenced by peculiarities of the sample and so are more stable.

Another important decision which must be made is whether to use the original predictor variables or to transform them in some way, often by taking a linear combination. For example, the cost of a particular kind of fencing for a rectangular field may largely depend on the length and breadth of the field. If all the fields in the sample are in the same proportions, then only one variable (length or breadth) would be needed. Even if they are not in the same proportions, one variable may be sufficient, namely the sum of the length and the breadth or, indeed, the perimeter. This is our ideal solution, reducing the number of predictor variables from two to one and at the same time obtaining a predictor variable which has physical meaning. With a particular data set, the predicted value of the cost may be $y = 1.1 l + 0.9 b$, so that the best single variable would be the right-hand side with $l$ = length and $b$ = breadth, but this particular linear combination would have no physical meaning. We need to keep both aspects in mind, balancing statistical optimum against physical meaning.

In the first section we shall limit our discussion to orthogonal predictor variables. Although this may seem an unnecessarily strong restriction to place on the model, orthogonal variables often exist in experimental design situations. Indeed, the values of the variables in the sample are often deliberately chosen to be orthogonal. We explain the advantages of this in section 3.2, while in section 3.4 we show that it is possible to transform variables, for any data set, so that they are orthogonal.
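The remark about adding powers of $w$ until the curve passes through every observed value can be checked numerically. The following is a minimal sketch on made-up data (the values of `w` and `y` are synthetic and are not the lactation data of Example 2.11); it fits polynomials of increasing degree and records $R^2$ for each.

```python
import numpy as np

# Synthetic stand-in data: 8 hypothetical observation times and responses.
rng = np.random.default_rng(0)
n = 8
w = np.arange(1, n + 1, dtype=float)
y = 20 + 3 * w - 0.4 * w**2 + rng.normal(0, 1.0, n)

def r_squared(degree):
    """R^2 of a least-squares polynomial of the given degree in w."""
    # Polynomial.fit rescales w internally, which keeps high degrees stable.
    p = np.polynomial.Polynomial.fit(w, y, degree)
    sse = np.sum((y - p(w)) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - sse / sst

# Degrees 1 .. n-1; degree n-1 has as many coefficients as observations.
r2 = [r_squared(d) for d in range(1, n)]
for d, value in zip(range(1, n), r2):
    print(f"degree {d}: R^2 = {value:.6f}")
```

With `n` observations, the polynomial of degree `n - 1` reproduces the sample exactly, so its $R^2$ equals 1 up to rounding, yet nothing in that figure says the curve is interpretable or stable.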

## ORTHOGONAL PREDICTOR VARIABLES

If the variables in a model are expressed as deviations from their means and if there are $k$ predictor variables, the sum of squares for regression is given by
$$\begin{aligned} \mathrm{SSR} &= b_{1} s_{y 1}+b_{2} s_{y 2}+\cdots+b_{k} s_{y k} \\ &= s_{y 1}^{2} / s_{11}+s_{y 2}^{2} / s_{22}+\cdots+s_{y k}^{2} / s_{k k} \end{aligned} \tag{3.2.1}$$
The total sum of squares is
$$\mathrm{SST} = s_{y y} = \sum y_{i}^{2}$$
By subtraction, we find the sum of squares for error (residual) is

$$\mathrm{SSE} = \mathrm{SST} - \mathrm{SSR} \tag{3.2.2}$$
In this section, we assume that the predictor variables are orthogonal and explore the implications of the number of variables included in the model.

We consider now the effect of adding another variable, $x_{k+1}$, to the model and assume that this variable is also orthogonal to the other predictor variables. The SST will not be affected by adding $x_{k+1}$ to the model. We introduce the notation that $\operatorname{SSR}(k)$ is the sum of squares for regression when the variables $x_{1}, x_{2}, \ldots, x_{k}$ are in the model. It is clear that
(i) $\operatorname{SSR}(k+1) \geq \operatorname{SSR}(k)$
This follows from (3.2.1), as each term in the sum cannot be negative, so that adding a further variable cannot decrease the sum of squares for regression.
(ii) $\operatorname{SSE}(k+1) \leq \operatorname{SSE}(k)$
This is the other side of the coin and follows from (3.2.2).
(iii)
$$R(k+1)^{2}=\operatorname{SSR}(k+1) / \mathrm{SST} \geq R(k)^{2}=\operatorname{SSR}(k) / \mathrm{SST}$$
$\operatorname{SSR}(k+1)$ can be thought of as the amount of variation in $y$ explained by the $k+1$ predictor variables, and $R(k+1)^{2}$ is the proportion of the variation in $y$ explained by these variables. These monotone properties are illustrated by the diagrams in Figure 3.2.1.
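The three monotone properties can be verified on a small numerical example. The sketch below is an illustration under stated assumptions: the data are synthetic, and the orthogonal predictors are manufactured by applying a QR decomposition to centred random columns (a convenient device for the example, not a method taken from the text). Each predictor then contributes its own term $s_{yj}^{2}/s_{jj}$ to the sum of squares for regression.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k_max = 30, 4

# Build k_max mutually orthogonal predictors expressed as deviations from
# their means: centre random columns, then orthonormalise them with QR.
raw = rng.normal(size=(n, k_max))
raw -= raw.mean(axis=0)
X, _ = np.linalg.qr(raw)

# Synthetic response, also expressed as deviations from its mean.
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)
y -= y.mean()

sst = np.sum(y**2)               # total sum of squares, s_yy
s_yj = X.T @ y                   # cross products s_y1, ..., s_yk
s_jj = np.sum(X**2, axis=0)      # s_11, ..., s_kk
terms = s_yj**2 / s_jj           # one non-negative term per predictor

ssr = np.cumsum(terms)           # SSR(1), ..., SSR(k_max)
sse = sst - ssr                  # SSE(1), ..., SSE(k_max)
r2 = ssr / sst                   # R(1)^2, ..., R(k_max)^2

for k in range(k_max):
    print(f"k={k + 1}: SSR={ssr[k]:.3f}  SSE={sse[k]:.3f}  R^2={r2[k]:.3f}")
```

Because every entry of `terms` is non-negative, `ssr` cannot decrease as variables are added, `sse` cannot increase, and `r2` stays between 0 and 1.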

## ADDING NONORTHOGONAL VARIABLES SEQUENTIALLY

Although orthogonal predictor variables are the ideal, they will rarely occur in practice with observational data. If some of the predictor variables are highly correlated, the matrix $X^{\mathrm{T}} X$ will be nearly singular. This could raise statistical and numerical problems, particularly if there is interest in estimating the coefficients of the model. We have more to say on this in the next section and in a later section on Ridge Estimators.

Moderate correlations between predictor variables will cause few problems. While it is not essential to convert predictor variables to others which are orthogonal, it is instructive to do so, as it gives insight into the meaning of the coefficients and the tests of significance based on them.
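How close $X^{\mathrm{T}} X$ comes to being singular can be gauged from its condition number. The following sketch uses two synthetic predictors whose population correlation `rho` is set by hand; the function name `xtx_condition` and all data are assumptions made for the illustration.

```python
import numpy as np

def xtx_condition(rho, n=200, seed=0):
    """Condition number of X^T X for two centred predictors whose
    population correlation is rho (synthetic data)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    noise = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * noise
    X = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
    return np.linalg.cond(X.T @ X)

conds = {rho: xtx_condition(rho) for rho in (0.0, 0.6, 0.99, 0.9999)}
for rho, c in conds.items():
    print(f"rho = {rho}: cond(X^T X) = {c:.1f}")
```

A moderate correlation such as 0.6 leaves the condition number small, whereas correlations very close to 1 inflate it by orders of magnitude, which is the near-singular situation described above.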

In Problem 1.5, we considered predicting the outcome of a student in the mathematics paper 303 (which we denoted by $y$) by marks received in the papers 201 and 203 (denoted by $x_{1}$ and $x_{2}$, respectively). The actual numbers of these papers are not relevant but, for interest's sake, paper 201 was a calculus paper and 203 an algebra paper, both at second-year university level, and 303 was a third-year paper in algebra. The sums of squares for regression when $y$ is regressed singly and together on the $x$ variables (and the $R^{2}$ values) are:
$$\begin{array}{lll}\text{SSR on 201 alone:} & 1433.6 & (.405) \\ \text{SSR on 203 alone:} & 2129.2 & (.602) \\ \text{SSR on 201 and 203:} & 2265.6 & (.641)\end{array}$$
Clearly, the two $x$ variables are not orthogonal (in fact, the correlation coefficient between them is 0.622), as the individual sums of squares for regression do not add to that given by the model with both variables included. Once we have regressed the 303 marks on the 201 marks, the additional sum of squares due to 203 is $(2265.6 - 1433.6) = 832.0$. In this section we show how to adjust one variable for another so that they are orthogonal and, as a consequence, their sums of squares for regression add to that given by the model with both variables included.
$$\begin{array}{lll}\text{SSR for 201} & = 1433.6 & = \text{SSR for } x_{1} \\ \text{SSR for 203 adjusted for 201} & = 832.0 & = \text{SSR for } z_{2} \\ \text{SSR for 201 and 203} & = 2265.6 & \end{array}$$
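The adjustment can be sketched in a few lines. The example below is an illustration only: the marks are synthetic (the data of Problem 1.5 are not reproduced here), and the adjusted variable is taken to be the residual of $x_{2}$ after regressing it on $x_{1}$, which is one standard way of making the second variable orthogonal to the first. The helper `ssr` is a hypothetical name introduced for the example.

```python
import numpy as np

# Synthetic stand-ins for the marks in papers 201 (x1), 203 (x2), 303 (y).
rng = np.random.default_rng(2)
n = 40
x1 = rng.normal(60, 12, n)
x2 = 0.6 * x1 + rng.normal(25, 9, n)          # correlated with x1
y = 0.2 * x1 + 0.7 * x2 + rng.normal(0, 8, n)

# Express every variable as deviations from its mean.
x1, x2, y = (v - v.mean() for v in (x1, x2, y))

def ssr(response, *cols):
    """Sum of squares for regression of response on the given columns."""
    X = np.column_stack(cols)
    b, *_ = np.linalg.lstsq(X, response, rcond=None)
    fitted = X @ b
    return float(fitted @ fitted)

# Adjust x2 for x1: remove from x2 the part that x1 predicts linearly.
z2 = x2 - (x1 @ x2) / (x1 @ x1) * x1

ssr_x1 = ssr(y, x1)
ssr_x2 = ssr(y, x2)
ssr_z2 = ssr(y, z2)
ssr_both = ssr(y, x1, x2)

print(f"x1 . z2                = {x1 @ z2:.2e}")
print(f"SSR(x1) + SSR(x2)      = {ssr_x1 + ssr_x2:.2f}")
print(f"SSR(x1) + SSR(z2)      = {ssr_x1 + ssr_z2:.2f}")
print(f"SSR(x1 and x2 jointly) = {ssr_both:.2f}")
```

The sums of squares for $x_{1}$ and $z_{2}$ add to the joint figure, mirroring $1433.6 + 832.0 = 2265.6$ in the table, while those for the unadjusted $x_{1}$ and $x_{2}$ do not.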
