## Statistics | Linear Regression | SOC605

• Statistical Inference
• Statistical Computing
• (Generalized) Linear Models
• Statistical Machine Learning
• Longitudinal Data Analysis
• Foundations of Data Science

## Fixed Effects One Way Anova

The one way Anova model is used to compare $p$ treatments. Usually there is replication and Ho: $\mu_1=\mu_2=\cdots=\mu_p$ is a hypothesis of interest. Investigators may also want to rank the population means from smallest to largest.

Definition 5.6. Let $f_Z(z)$ be the pdf of $Z$. Then the family of pdfs $f_Y(y)=f_Z(y-\mu)$ indexed by the location parameter $\mu$, $-\infty<\mu<\infty$, is the location family for the random variable $Y=\mu+Z$ with standard pdf $f_Z(z)$.

Definition 5.7. A one way fixed effects Anova model has a single qualitative predictor variable $W$ with $p$ categories $a_1, \ldots, a_p$. There are $p$ different distributions for $Y$, one for each category $a_i$. The distribution of
$$Y \mid\left(W=a_i\right) \sim f_Z\left(y-\mu_i\right)$$
where the location family has second moments. Hence all $p$ distributions come from the same location family with different location parameter $\mu_i$ and the same variance $\sigma^2$.

Definition 5.8. The one way fixed effects normal Anova model is the special case where

$$Y \mid\left(W=a_i\right) \sim N\left(\mu_i, \sigma^2\right)$$

Example 5.3. The pooled 2 sample $t$-test is a special case of a one way Anova model with $p=2$. For example, one population could be ACT scores for men and the second population ACT scores for women. Then $W=$ gender and $Y=$ score.

Notation. It is convenient to relabel the response variable $Y_1, \ldots, Y_n$ as the vector $\boldsymbol{Y}=\left(Y_{11}, \ldots, Y_{1, n_1}, Y_{21}, \ldots, Y_{2, n_2}, \ldots, Y_{p 1}, \ldots, Y_{p, n_p}\right)^T$ where the $Y_{i j}$ are independent and $Y_{i 1}, \ldots, Y_{i, n_i}$ are iid. Here $j=1, \ldots, n_i$ where $n_i$ is the number of cases from the $i$th level where $i=1, \ldots, p$. Thus $n_1+\cdots+n_p=n$. Similarly use double subscripts on the errors. Then there will be many equivalent parameterizations of the one way fixed effects Anova model.
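To make the link between Example 5.3 and the one way Anova model concrete, the following is a minimal sketch that computes the usual one way Anova $F$ statistic from the groups $Y_{i1}, \ldots, Y_{i,n_i}$ and checks that, for $p=2$, it equals the square of the pooled two sample $t$ statistic. The function names and the score values are hypothetical and chosen only for illustration.

```python
import numpy as np

def one_way_anova_f(groups):
    """One way Anova F statistic for a list of 1-D samples (one per level)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    n = sum(len(g) for g in groups)      # n = n_1 + ... + n_p
    p = len(groups)                      # number of levels
    grand_mean = np.concatenate(groups).mean()
    # Between-level (treatment) and within-level (error) sums of squares.
    sstr = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (sstr / (p - 1)) / (sse / (n - p))

def pooled_t(y1, y2):
    """Pooled two sample t statistic (equal variance assumption)."""
    y1, y2 = np.asarray(y1, dtype=float), np.asarray(y2, dtype=float)
    n1, n2 = len(y1), len(y2)
    ss = ((y1 - y1.mean()) ** 2).sum() + ((y2 - y2.mean()) ** 2).sum()
    sp2 = ss / (n1 + n2 - 2)             # pooled variance estimate
    return (y1.mean() - y2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

# Hypothetical scores for the two levels of W (illustrative values only).
men = [21.0, 24.0, 19.0, 27.0, 22.0]
women = [23.0, 26.0, 20.0, 25.0, 28.0, 24.0]

F = one_way_anova_f([men, women])
t = pooled_t(men, women)
print(F, t ** 2)  # the two numbers agree up to rounding error
```

With more than two levels the same `one_way_anova_f` call applies unchanged, while the pooled $t$ comparison is specific to $p=2$.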

## Random Effects One Way Anova

Definition 5.16. For the random effects one way Anova, the levels of the factor are a random sample of levels from some population of levels $\Lambda_F$. The cell means model for the random effects one way Anova is $Y_{i j}=\mu_i+e_{i j}$ for $i=1, \ldots, p$ and $j=1, \ldots, n_i$. The $\mu_i$ are randomly selected from some population $\Lambda$ with mean $\mu$ and variance $\sigma_\mu^2$, where $i \in \Lambda_F$ is equivalent to $\mu_i \in \Lambda$. The $e_{i j}$ and $\mu_i$ are independent, and the $e_{i j}$ are iid from a location family with pdf $f$, mean 0, and variance $\sigma^2$. The $Y_{i j} \mid \mu_i \sim f\left(y-\mu_i\right)$, the location family with location parameter $\mu_i$ and variance $\sigma^2$. Unconditionally, $E\left(Y_{i j}\right)=\mu$ and $V\left(Y_{i j}\right)=\sigma_\mu^2+\sigma^2$.

For the random effects model, the $\mu_i$ are independent random variables with $E\left(\mu_i\right)=\mu$ and $V\left(\mu_i\right)=\sigma_\mu^2$. The cell means model for fixed effects one way Anova is very similar to that for the random effects model, but the $\mu_i$ are fixed constants rather than random variables.

Definition 5.17. For the normal random effects one way Anova model, $\Lambda \sim N\left(\mu, \sigma_\mu^2\right)$. Thus the $\mu_i$ are independent $N\left(\mu, \sigma_\mu^2\right)$ random variables. The $e_{i j}$ are iid $N\left(0, \sigma^2\right)$ and the $e_{i j}$ and $\mu_i$ are independent. For this model, $Y_{i j} \mid \mu_i \sim N\left(\mu_i, \sigma^2\right)$ for $i=1, \ldots, p$. Note that the conditional variance $\sigma^2$ is the same for each $\mu_i \in \Lambda$. Unconditionally, $Y_{i j} \sim N\left(\mu, \sigma_\mu^2+\sigma^2\right)$.

The fixed effects one way Anova tested Ho: $\mu_1=\cdots=\mu_p$. For the random effects one way Anova, interest is in whether $\mu_i \equiv \mu$ for every $\mu_i$ in $\Lambda$ where the population $\Lambda$ is not necessarily finite. Note that if $\sigma_\mu^2=0$, then $\mu_i \equiv \mu$ for all $\mu_i \in \Lambda$. In the sample of $p$ levels, the $\mu_i$ will differ if $\sigma_\mu^2>0$.
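The unconditional moments of the normal random effects model can be checked by simulation. The sketch below draws the $\mu_i$ from $N(\mu, \sigma_\mu^2)$, adds iid $N(0, \sigma^2)$ errors, and compares sample moments with $E(Y_{ij})=\mu$ and $V(Y_{ij})=\sigma_\mu^2+\sigma^2$. The parameter values, the seed, and the large number of levels are assumptions made only so that the sample moments are stable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values for illustration.
mu, sigma_mu, sigma = 10.0, 2.0, 1.0
p, n_i = 20000, 5          # many levels, balanced design

mu_i = rng.normal(mu, sigma_mu, size=p)        # random level means
e = rng.normal(0.0, sigma, size=(p, n_i))      # iid errors e_ij
Y = mu_i[:, None] + e                          # Y_ij = mu_i + e_ij

# One observation per level gives independent draws from the
# unconditional distribution of Y_ij.
first_obs = Y[:, 0]
mean_hat = first_obs.mean()                    # estimates mu
var_hat = first_obs.var(ddof=1)                # estimates sigma_mu^2 + sigma^2

# Averaging the within-level sample variances estimates the
# conditional variance sigma^2.
within_var = Y.var(axis=1, ddof=1).mean()

print(mean_hat, var_hat, within_var)
```

Setting `sigma_mu = 0.0` makes every `mu_i` equal to `mu`, which is the case $\sigma_\mu^2=0$ discussed above.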

## GLS, WLS, and FGLS

Definition 4.3. Suppose that the response variable and at least one of the predictor variables is quantitative. Then the generalized least squares (GLS) model is
$$\boldsymbol{Y}=\boldsymbol{X} \boldsymbol{\beta}+\boldsymbol{e},$$
where $\boldsymbol{Y}$ is an $n \times 1$ vector of dependent variables, $\boldsymbol{X}$ is an $n \times p$ matrix of predictors, $\boldsymbol{\beta}$ is a $p \times 1$ vector of unknown coefficients, and $\boldsymbol{e}$ is an $n \times 1$ vector of unknown errors. Also $E(\boldsymbol{e})=\mathbf{0}$ and $\operatorname{Cov}(\boldsymbol{e})=\sigma^2 \boldsymbol{V}$ where $\boldsymbol{V}$ is a known $n \times n$ positive definite matrix.

Definition 4.4. The GLS estimator
$$\hat{\boldsymbol{\beta}}_{GLS}=\left(\boldsymbol{X}^T \boldsymbol{V}^{-1} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^T \boldsymbol{V}^{-1} \boldsymbol{Y}.$$
The fitted values are $\hat{\boldsymbol{Y}}_{GLS}=\boldsymbol{X} \hat{\boldsymbol{\beta}}_{GLS}$.

Definition 4.5. Suppose that the response variable and at least one of the predictor variables is quantitative. Then the weighted least squares (WLS) model with weights $w_1, \ldots, w_n$ is the special case of the GLS model where $\boldsymbol{V}$ is diagonal: $\boldsymbol{V}=\operatorname{diag}\left(v_1, \ldots, v_n\right)$ and $w_i=1 / v_i$. Hence
$$\boldsymbol{Y}=\boldsymbol{X} \boldsymbol{\beta}+\boldsymbol{e},$$
$E(\boldsymbol{e})=\mathbf{0}$, and $\operatorname{Cov}(\boldsymbol{e})=\sigma^2 \operatorname{diag}\left(v_1, \ldots, v_n\right)=\sigma^2 \operatorname{diag}\left(1 / w_1, \ldots, 1 / w_n\right)$.

Definition 4.6. The WLS estimator
$$\hat{\boldsymbol{\beta}}_{WLS}=\left(\boldsymbol{X}^T \boldsymbol{V}^{-1} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^T \boldsymbol{V}^{-1} \boldsymbol{Y}.$$
The fitted values are $\hat{\boldsymbol{Y}}_{WLS}=\boldsymbol{X} \hat{\boldsymbol{\beta}}_{WLS}$.

Definition 4.7. The feasible generalized least squares (FGLS) model is the same as the GLS model except that $\boldsymbol{V}=\boldsymbol{V}(\boldsymbol{\theta})$ is a function of an unknown $q \times 1$ vector of parameters $\boldsymbol{\theta}$. Let the estimator of $\boldsymbol{V}$ be $\hat{\boldsymbol{V}}=\boldsymbol{V}(\hat{\boldsymbol{\theta}})$. Then the FGLS estimator
$$\hat{\boldsymbol{\beta}}_{FGLS}=\left(\boldsymbol{X}^T \hat{\boldsymbol{V}}^{-1} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^T \hat{\boldsymbol{V}}^{-1} \boldsymbol{Y}.$$
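The GLS and WLS estimators above translate almost directly into matrix code. The following is a minimal NumPy sketch, not a reference implementation: the function names `gls` and `wls`, the simulated data, and the choice $v_i = x_i^2$ are all assumptions made for illustration. Linear solves are used in place of explicit inverses, which is a numerical choice rather than part of the definitions.

```python
import numpy as np

def gls(X, Y, V):
    """GLS estimator (X^T V^{-1} X)^{-1} X^T V^{-1} Y for known V."""
    Vinv_X = np.linalg.solve(V, X)     # V^{-1} X without forming V^{-1}
    Vinv_Y = np.linalg.solve(V, Y)     # V^{-1} Y
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_Y)

def wls(X, Y, w):
    """WLS estimator with weights w_i = 1 / v_i (V diagonal)."""
    Xw = X * w[:, None]                # row i of X scaled by w_i
    return np.linalg.solve(X.T @ Xw, Xw.T @ Y)

# Simulated data with hypothetical known variances v_i = x_i^2.
rng = np.random.default_rng(1)
n = 50
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])   # constant plus one predictor
v = x ** 2
Y = X @ np.array([2.0, 0.5]) + rng.normal(0.0, np.sqrt(v))

beta_gls = gls(X, Y, np.diag(v))       # GLS with diagonal V
beta_wls = wls(X, Y, 1.0 / v)          # same estimator via weights
fitted = X @ beta_wls                  # fitted values X beta_hat
print(beta_gls, beta_wls)
```

An FGLS fit would call the same `gls` function with an estimate $\hat{\boldsymbol{V}}=\boldsymbol{V}(\hat{\boldsymbol{\theta}})$ in place of `V`; how $\hat{\boldsymbol{\theta}}$ is obtained depends on the model for $\boldsymbol{V}$ and is not shown here.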

## Inference for GLS

Inference for the GLS model $\boldsymbol{Y}=\boldsymbol{X} \boldsymbol{\beta}+\boldsymbol{e}$ can be performed by using the partial $F$ test for the equivalent no intercept OLS model $\boldsymbol{Z}=\boldsymbol{U} \boldsymbol{\beta}+\boldsymbol{\epsilon}$. Following Section 2.10, create $\boldsymbol{Z}$ and $\boldsymbol{U}$, then fit the full and reduced models using the “no intercept” or “intercept = F” option. Let pval be the estimated pvalue.

The 4 step partial $F$ test of hypotheses:

i) State the hypotheses Ho: the reduced model is good, Ha: use the full model.

ii) Find the test statistic
$$F_R=\left[\frac{SSE(R)-SSE(F)}{d f_R-d f_F}\right] / MSE(F).$$

iii) Find the pval $=P\left(F_{d f_R-d f_F, d f_F}>F_R\right)$. (On exams often an $F$ table is used. Here $d f_R-d f_F=p-q=$ number of parameters set to 0, and $d f_F=n-p$.)

iv) State whether you reject Ho or fail to reject Ho. Reject Ho if pval $\leq \delta$ and conclude that the full model should be used. Otherwise, fail to reject Ho and conclude that the reduced model is good.

Assume that the GLS model contains a constant $\beta_1$. The GLS ANOVA $F$ test of Ho: $\beta_2=\cdots=\beta_p=0$ versus Ha: not Ho uses the reduced model that contains the first column of $\boldsymbol{U}$. The GLS ANOVA $F$ test of Ho: $\beta_i=0$ versus Ha: $\beta_i \neq 0$ uses the reduced model with the $i$th column of $\boldsymbol{U}$ deleted. For the special case of WLS, the software will often have a weights option that will also give correct output for inference.
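The partial $F$ test for GLS can be sketched in code as follows. The construction of $\boldsymbol{Z}$ and $\boldsymbol{U}$ comes from Section 2.10, which is not reproduced here, so this sketch assumes a standard whitening transformation: with the Cholesky factorization $\boldsymbol{V}=\boldsymbol{L}\boldsymbol{L}^T$, take $\boldsymbol{Z}=\boldsymbol{L}^{-1}\boldsymbol{Y}$ and $\boldsymbol{U}=\boldsymbol{L}^{-1}\boldsymbol{X}$, so that the transformed errors have covariance $\sigma^2 \boldsymbol{I}$. The referenced section may use a different square root of $\boldsymbol{V}$. The function names and simulated data are hypothetical.

```python
import numpy as np

def sse_no_intercept(U, Z):
    """Error sum of squares from the no intercept OLS fit of Z on U."""
    beta, *_ = np.linalg.lstsq(U, Z, rcond=None)
    resid = Z - U @ beta
    return float(resid @ resid)

def gls_partial_f(X, Y, V, keep):
    """Partial F statistic; the reduced model uses columns `keep` of U."""
    L = np.linalg.cholesky(V)          # assumed whitening: V = L L^T
    Z = np.linalg.solve(L, Y)          # Z = L^{-1} Y
    U = np.linalg.solve(L, X)          # U = L^{-1} X
    n, p = U.shape
    q = len(keep)                      # parameters kept in reduced model
    sse_full = sse_no_intercept(U, Z)
    sse_red = sse_no_intercept(U[:, keep], Z)
    df_full, df_red = n - p, n - q
    mse_full = sse_full / df_full
    F = ((sse_red - sse_full) / (df_red - df_full)) / mse_full
    return F, df_red - df_full, df_full

# Simulated example: constant plus two predictors, known diagonal V.
rng = np.random.default_rng(2)
n = 40
x1 = rng.uniform(1.0, 5.0, size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
v = x1 ** 2                            # hypothetical known variances
Y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(0.0, np.sqrt(v))

# ANOVA F test: the reduced model keeps only the first column of U.
F, df1, df2 = gls_partial_f(X, Y, np.diag(v), keep=[0])
print(F, df1, df2)
```

If SciPy is available, the pval in step iii) could be obtained as `scipy.stats.f.sf(F, df1, df2)`. Passing `keep=[0, 1]` instead drops the third column of $\boldsymbol{U}$ and tests Ho: $\beta_3=0$.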

Example 4.3. Suppose that the data from Example 4.2 has valid weights, so that WLS can be used instead of FWLS. The R commands below perform WLS.

## Variable Selection and Multicollinearity

The literature on numerical methods for variable selection in the OLS multiple linear regression model is enormous. Three important papers are Jones (1946), Mallows (1973), and Furnival and Wilson (1974). Chatterjee and Hadi (1988, pp. 43-47) give a nice account of the effects of overfitting on the least squares estimates. Ferrari and Yang (2015) give a method for testing whether a model is underfitting. Section 3.4.1 followed Olive (2016a) closely. See Olive (2016b) for more on prediction regions. Also see Claeskens and Hjort (2003), Hjort and Claeskens (2003), and Efron et al. (2004). Texts include Burnham and Anderson (2002), Claeskens and Hjort (2008), and Linhart and Zucchini (1986).

Cook and Weisberg (1999a, pp. 264-265) give a good discussion of the effect of deleting predictors on linearity and the constant variance assumption. Walls and Weeks (1969) note that adding predictors increases the variance of a predicted response. Also $R^2$ gets large. See Freedman (1983).

Discussions of biases introduced by variable selection and data snooping include Hurvich and Tsai (1990), Leeb and Pötscher (2006), Selvin and Stuart (1966), and Hjort and Claeskens (2003). This theory assumes that the full model is known before collecting the data, but in practice the full model is often built after collecting the data. Freedman (2005, pp. 192-195) gives an interesting discussion on model building and variable selection.

The predictor variables can be transformed if the response is not used, and then inference can be done for the linear model. Suppose the $p$ predictor variables are fixed so $\boldsymbol{Y}=t(\boldsymbol{Z})=\boldsymbol{X} \boldsymbol{\beta}+\boldsymbol{e}$, and the computer program outputs $\hat{\boldsymbol{\beta}}$, after doing an automated response transformation and automated variable selection. Then the nonlinear estimator $\hat{\boldsymbol{\beta}}$ can be bootstrapped. See Olive (2016a). If data snooping, such as using graphs, is used to select the response transformation and the submodel from variable selection, then strong, likely unreasonable assumptions are needed for valid inference for the final nonlinear model.

## Random Vectors

The concepts of a random vector, the expected value of a random vector, and the covariance of a random vector are needed before covering generalized least squares. Recall that for random variables $Y_i$ and $Y_j$, the covariance of $Y_i$ and $Y_j$ is $\operatorname{Cov}\left(Y_i, Y_j\right) \equiv \sigma_{i, j}=E\left[\left(Y_i-E\left(Y_i\right)\right)\left(Y_j-E\left(Y_j\right)\right)\right]=E\left(Y_i Y_j\right)-E\left(Y_i\right) E\left(Y_j\right)$ provided the second moments of $Y_i$ and $Y_j$ exist.

Definition 4.1. $\boldsymbol{Y}=\left(Y_1, \ldots, Y_n\right)^T$ is an $n \times 1$ random vector if $Y_i$ is a random variable for $i=1, \ldots, n$. $\boldsymbol{Y}$ is a discrete random vector if each $Y_i$ is discrete, and $\boldsymbol{Y}$ is a continuous random vector if each $Y_i$ is continuous. A random variable $Y_1$ is the special case of a random vector with $n=1$.

Definition 4.2. The population mean of a random $n \times 1$ vector $\boldsymbol{Y}=\left(Y_1, \ldots, Y_n\right)^T$ is
$$E(\boldsymbol{Y})=\left(E\left(Y_1\right), \ldots, E\left(Y_n\right)\right)^T$$
provided that $E\left(Y_i\right)$ exists for $i=1, \ldots, n$. Otherwise the expected value does not exist. The $n \times n$ population covariance matrix
$$\operatorname{Cov}(\boldsymbol{Y})=E\left[(\boldsymbol{Y}-E(\boldsymbol{Y}))(\boldsymbol{Y}-E(\boldsymbol{Y}))^T\right]=\left(\sigma_{i, j}\right)$$
where the $i j$ entry of $\operatorname{Cov}(\boldsymbol{Y})$ is $\operatorname{Cov}\left(Y_i, Y_j\right)=\sigma_{i, j}$ provided that each $\sigma_{i, j}$ exists. Otherwise $\operatorname{Cov}(\boldsymbol{Y})$ does not exist.

The covariance matrix is also called the variance-covariance matrix and variance matrix. Sometimes the notation $\operatorname{Var}(\boldsymbol{Y})$ is used. Note that $\operatorname{Cov}(\boldsymbol{Y})$ is a symmetric positive semidefinite matrix. If $\boldsymbol{Z}$ and $\boldsymbol{Y}$ are $n \times 1$ random vectors, $\boldsymbol{a}$ a conformable constant vector, and $\boldsymbol{A}$ and $\boldsymbol{B}$ are conformable constant matrices, then
$$E(\boldsymbol{a}+\boldsymbol{Y})=\boldsymbol{a}+E(\boldsymbol{Y}) \text { and } E(\boldsymbol{Y}+\boldsymbol{Z})=E(\boldsymbol{Y})+E(\boldsymbol{Z})$$ and
$$E(\boldsymbol{A} \boldsymbol{Y})=\boldsymbol{A} E(\boldsymbol{Y}) \text { and } E(\boldsymbol{A} \boldsymbol{Y} \boldsymbol{B})=\boldsymbol{A} E(\boldsymbol{Y}) \boldsymbol{B}$$
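The sample analogues of these facts can be checked numerically. In the sketch below, the rows of `draws` are independent realizations of a hypothetical $3 \times 1$ random vector, and the mixing matrix, $\boldsymbol{A}$, and $\boldsymbol{a}$ are arbitrary illustrative choices. Because the sample mean is a linear operation, the identities for $E(\boldsymbol{a}+\boldsymbol{Y})$ and $E(\boldsymbol{A}\boldsymbol{Y})$ hold for sample means up to rounding error, and the sample covariance matrix is symmetric positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(3)

# Rows are independent draws of a hypothetical 3 x 1 random vector Y,
# built by mixing independent normals so the components are correlated.
mix = np.array([[1.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 1.0]])
draws = rng.normal(size=(1000, 3)) @ mix

mean_Y = draws.mean(axis=0)                 # sample version of E(Y)
cov_Y = np.cov(draws, rowvar=False)         # 3 x 3 sample covariance matrix

A = np.array([[1.0, -1.0, 0.0],
              [2.0, 0.0, 1.0]])             # conformable constant matrix
a = np.array([5.0, -2.0, 0.5])              # conformable constant vector

mean_AY = (draws @ A.T).mean(axis=0)        # sample mean of A Y
mean_aY = (draws + a).mean(axis=0)          # sample mean of a + Y

# Eigenvalues of a symmetric matrix; all should be nonnegative.
eigvals = np.linalg.eigvalsh(cov_Y)

print(np.allclose(mean_AY, A @ mean_Y), np.allclose(mean_aY, a + mean_Y))
print(eigvals)
```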


