
Statistics Homework Help | STAT311 Linear regression

Statistics-lab™ can provide homework, exam, and tutoring services for the metrostate.edu STAT311 Linear regression course!

STAT311 Linear regression Course Description

This course covers various statistical models such as simple linear regression, multiple regression, and analysis of variance. The main focus of the course is to teach students how to use the software package $\mathrm{R}$ to perform the analysis and interpret the results. Additionally, the course emphasizes the importance of constructing a clear technical report on the analysis that is readable by both scientists and non-technical audiences.

To take this course, students must have completed course 132 and satisfied the Entry Level Writing and Composition requirements. This course satisfies the General Education Code W requirement.

PREREQUISITES 

Covers simple linear regression, multiple regression, and analysis of variance models. Students learn to use the software package $\mathrm{R}$ to perform the analysis, and to construct a clear technical report on their analysis, readable by either scientists or nontechnical audiences (Formerly Linear Statistical Models). Prerequisite(s): course 132 and satisfaction of the Entry Level Writing and Composition requirements. Gen. Ed. Code(s): W

STAT311 Linear regression HELP (EXAM HELP, ONLINE TUTOR)

Problem 1.

Proposition 3.1. Suppose that a numerical variable selection method suggests several submodels with $k$ predictors, including a constant, where $2 \leq k \leq p$.
a) The model $I$ that minimizes $C_p(I)$ maximizes $\operatorname{corr}\left(r, r_I\right)$.
b) $C_p(I) \leq 2 k$ implies that $\operatorname{corr}\left(r, r_I\right) \geq \sqrt{1-\frac{p}{n}}$.
c) As $\operatorname{corr}\left(r, r_I\right) \rightarrow 1$,
$$
\operatorname{corr}\left(\boldsymbol{x}^{T} \hat{\boldsymbol{\beta}}, \boldsymbol{x}_{I}^{T} \hat{\boldsymbol{\beta}}_{I}\right)=\operatorname{corr}(\mathrm{ESP}, \mathrm{ESP}(I))=\operatorname{corr}\left(\hat{Y}, \hat{Y}_{I}\right) \rightarrow 1 .
$$

Remark 3.1. Consider the model $I_i$ that deletes the predictor $x_i$. Then the model has $k=p-1$ predictors including the constant, and the test statistic is $t_i$ where
$$
t_i^2=F_{I_i}
$$
Using Definition 3.8 and $C_p\left(I_{\text {full }}\right)=p$, it can be shown that
$$
C_p\left(I_i\right)=C_p\left(I_{\text {full }}\right)+\left(t_i^2-2\right) .
$$
Using the screen $C_p(I) \leq \min (2 k, p)$ suggests that the predictor $x_i$ should not be deleted if
$$
\left|t_i\right|>\sqrt{2} \approx 1.414
$$
If $\left|t_i\right|<\sqrt{2}$, then the predictor can probably be deleted since $C_p$ decreases. The literature suggests using the $C_p(I) \leq k$ screen, but this screen eliminates too many potentially useful submodels.
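The identity $C_p\left(I_i\right)=C_p\left(I_{\text {full }}\right)+\left(t_i^2-2\right)$ is easy to verify numerically. The following sketch (simulated data and my own variable names, nothing from the text) fits the full model and a delete-one submodel with numpy and compares $C_p(I_i)$ with $p+t_i^2-2$:

```python
# Numerical check of C_p(I_i) = C_p(I_full) + t_i^2 - 2, where I_i is the
# submodel that deletes predictor x_i. Data are simulated.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 4                      # n cases, p predictors including the constant
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, 0.0, -1.0])
y = X @ beta + rng.normal(size=n)

def sse(Xm, y):
    """Residual sum of squares from an OLS fit."""
    b, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    r = y - Xm @ b
    return r @ r

sse_full = sse(X, y)
mse_full = sse_full / (n - p)      # estimates sigma^2 under the full model

# t statistic for x_i in the full model
bhat, *_ = np.linalg.lstsq(X, y, rcond=None)
cov = mse_full * np.linalg.inv(X.T @ X)
i = 2                              # delete the predictor in column index 2
t_i = bhat[i] / np.sqrt(cov[i, i])

# C_p for the submodel I_i, which has k = p - 1 predictors
k = p - 1
X_I = np.delete(X, i, axis=1)
cp_Ii = sse(X_I, y) / mse_full - n + 2 * k

print(cp_Ii, p + t_i**2 - 2)       # the two values agree
```

Since $C_p\left(I_{\text {full }}\right)=p$, the printed values match up to floating-point error.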

Problem 2.

Proposition 3.2. Suppose that every submodel contains a constant and that $\boldsymbol{X}$ is a full rank matrix.
Response Plot: i) If $w=\hat{Y}_I$ and $z=Y$, then the OLS line is the identity line. ii) If $w=Y$ and $z=\hat{Y}_I$, then the OLS line has slope $b=\left[\operatorname{corr}\left(Y, \hat{Y}_I\right)\right]^2=R^2(I)$ and intercept $a=\bar{Y}\left(1-R^2(I)\right)$, where $\bar{Y}=\sum_{i=1}^n Y_i / n$ and $R^2(I)$ is the coefficient of multiple determination from the candidate model.
FF or EE Plot: iii) If $w=\hat{Y}_I$ and $z=\hat{Y}$, then the OLS line is the identity line. Note that $E S P(I)=\hat{Y}_I$ and $E S P=\hat{Y}$.
iv) If $w=\hat{Y}$ and $z=\hat{Y}_I$, then the OLS line has slope $b=\left[\operatorname{corr}\left(\hat{Y}, \hat{Y}_I\right)\right]^2=$ $S S R(I) / S S R$ and intercept $a=\bar{Y}[1-(S S R(I) / S S R)]$ where SSR is the regression sum of squares.
RR Plot: v) If $w=r$ and $z=r_I$, then the OLS line is the identity line.
vi) If $w=r_I$ and $z=r$, then $a=0$ and the OLS slope $b=\left[\operatorname{corr}\left(r, r_I\right)\right]^2$ and
$$
\operatorname{corr}\left(r, r_I\right)=\sqrt{\frac{S S E}{S S E(I)}}=\sqrt{\frac{n-p}{C_p(I)+n-2 k}}=\sqrt{\frac{n-p}{(p-k) F_I+n-p}} .
$$

Proof: Recall that $\boldsymbol{H}$ and $\boldsymbol{H}_I$ are symmetric idempotent matrices and that $\boldsymbol{H} \boldsymbol{H}_I=\boldsymbol{H}_I$. The mean of the OLS fitted values equals $\bar{Y}$, and the mean of the OLS residuals equals 0. If the OLS line from regressing $z$ on $w$ is $\hat{z}=a+b w$, then $a=\bar{z}-b \bar{w}$ and
$$
b=\frac{\sum\left(w_i-\bar{w}\right)\left(z_i-\bar{z}\right)}{\sum\left(w_i-\bar{w}\right)^2}=\frac{S D(z)}{S D(w)} \operatorname{corr}(z, w) .
$$
Also recall that the OLS line passes through the point of means $(\bar{w}, \bar{z})$.
$(*)$ Notice that the OLS slope from regressing $z$ on $w$ is equal to one if and only if the OLS slope from regressing $w$ on $z$ is equal to $[\operatorname{corr}(z, w)]^2$.
i) The slope $b=1$ if $\sum \hat{Y}_{I, i} Y_i=\sum \hat{Y}_{I, i}^2$. This equality holds since $\hat{\boldsymbol{Y}}_I^T \boldsymbol{Y}=\boldsymbol{Y}^T \boldsymbol{H}_I \boldsymbol{Y}=\boldsymbol{Y}^T \boldsymbol{H}_I \boldsymbol{H}_I \boldsymbol{Y}=\hat{\boldsymbol{Y}}_I^T \hat{\boldsymbol{Y}}_I$. Since $b=1$, $a=\bar{Y}-\bar{Y}=0$. ii) By $(*)$, the slope
$$
b=\left[\operatorname{corr}\left(Y, \hat{Y}_I\right)\right]^2=R^2(I)=\frac{\sum\left(\hat{Y}_{I, i}-\bar{Y}\right)^2}{\sum\left(Y_i-\bar{Y}\right)^2}=\operatorname{SSR}(I) / \operatorname{SSTO} .
$$
The result follows since $a=\bar{Y}-b \bar{Y}$.
iii) The slope $b=1$ if $\sum \hat{Y}_{I, i} \hat{Y}_i=\sum \hat{Y}_{I, i}^2$. This equality holds since $\hat{\boldsymbol{Y}}^T \hat{\boldsymbol{Y}}_I=\boldsymbol{Y}^T \boldsymbol{H} \boldsymbol{H}_I \boldsymbol{Y}=\boldsymbol{Y}^T \boldsymbol{H}_I \boldsymbol{Y}=\hat{\boldsymbol{Y}}_I^T \hat{\boldsymbol{Y}}_I$. Since $b=1$, $a=\bar{Y}-\bar{Y}=0$.
iv) From iii),
$$
1=\frac{S D(\hat{Y})}{S D\left(\hat{Y}_I\right)}\left[\operatorname{corr}\left(\hat{Y}, \hat{Y}_I\right)\right],
$$
so $\operatorname{corr}\left(\hat{Y}, \hat{Y}_I\right)=S D\left(\hat{Y}_I\right) / S D(\hat{Y})$. Hence the OLS slope from regressing $\hat{Y}_I$ on $\hat{Y}$ is
$$
b=\frac{S D\left(\hat{Y}_I\right)}{S D(\hat{Y})} \operatorname{corr}\left(\hat{Y}, \hat{Y}_I\right)=\left[\operatorname{corr}\left(\hat{Y}, \hat{Y}_I\right)\right]^2=\frac{\operatorname{SSR}(I)}{\operatorname{SSR}},
$$
since both sets of fitted values have mean $\bar{Y}$. The intercept follows from $a=\bar{Y}-b \bar{Y}$.
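Parts i) and vi) of Proposition 3.2 can be spot-checked by simulation. This sketch uses synthetic data and numpy only; it is a sanity check, not part of the proof:

```python
# Spot-check of Proposition 3.2: regressing Y on the submodel fitted values
# gives the identity line, and corr(r, r_I) = sqrt(SSE / SSE(I)).
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 5
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ rng.normal(size=p) + rng.normal(size=n)

def ols_fit(Xm, y):
    """Return OLS fitted values."""
    b, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    return Xm @ b

yhat = ols_fit(X, y)            # full-model fitted values
X_I = X[:, :3]                  # candidate submodel with k = 3 predictors
yhat_I = ols_fit(X_I, y)

r, r_I = y - yhat, y - yhat_I
sse, sse_I = r @ r, r_I @ r_I

# i) OLS of z = Y on w = Yhat_I has slope 1 (the identity line)
slope = np.polyfit(yhat_I, y, 1)[0]

# vi) corr(r, r_I) = sqrt(SSE / SSE(I))
corr = np.corrcoef(r, r_I)[0, 1]
print(slope, corr, np.sqrt(sse / sse_I))
```

The printed slope is 1 and the last two printed values agree, as the proposition predicts.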

Textbooks

• An Introduction to Stochastic Modeling, Fourth Edition by Pinsky and Karlin (freely available through the university library here)
• Essentials of Stochastic Processes, Third Edition by Durrett (freely available through the university library here)

To reiterate, the textbooks are freely available through the university library. Note that you must be connected to the university Wi-Fi or VPN to access the ebooks from the library links. Furthermore, the library links take some time to populate, so do not be alarmed if the webpage looks bare for a few seconds.



Problem 1.

2.10. In the above table, $x_i$ is the length of the femur and $y_i$ is the length of the humerus taken from five dinosaur fossils (Archaeopteryx) that preserved both bones. See Moore (2000, p. 99).
a) Complete the table and find the least squares estimators $\hat{\beta}_1$ and $\hat{\beta}_2$.
b) Predict the humerus length if the femur length is 60 .
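The data table did not survive extraction, so only the general formulas can be given here. For the simple linear regression model $Y_i=\beta_1+\beta_2 x_i+e_i$, the least squares estimators are the standard ones:

```latex
\hat{\beta}_2=\frac{\sum_{i=1}^n\left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right)}{\sum_{i=1}^n\left(x_i-\bar{x}\right)^2},
\qquad
\hat{\beta}_1=\bar{y}-\hat{\beta}_2 \bar{x} .
```

The predicted humerus length at femur length 60 is then $\hat{Y}=\hat{\beta}_1+\hat{\beta}_2 \cdot 60$.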

Problem 2.

2.11. Suppose that the regression model is $Y_i=7+\beta X_i+e_i$ for $i=$ $1, \ldots, n$ where the $e_i$ are iid $N\left(0, \sigma^2\right)$ random variables. The least squares criterion is $Q(\eta)=\sum_{i=1}^n\left(Y_i-7-\eta X_i\right)^2$.
a) What is $E\left(Y_i\right)$ ?
b) Find the least squares estimator $\hat{\beta}$ of $\beta$ by setting the first derivative $\frac{d}{d \eta} Q(\eta)$ equal to zero.
c) Show that your $\hat{\beta}$ is the global minimizer of the least squares criterion $Q$ by showing that the second derivative $\frac{d^2}{d \eta^2} Q(\eta)>0$ for all values of $\eta$.
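One way the calculation can go (a sketch, under the setup stated in the problem):

```latex
\frac{d}{d \eta} Q(\eta)=-2 \sum_{i=1}^n X_i\left(Y_i-7-\eta X_i\right)=0
\;\Longrightarrow\;
\hat{\beta}=\frac{\sum_{i=1}^n X_i\left(Y_i-7\right)}{\sum_{i=1}^n X_i^2},
\qquad
\frac{d^2}{d \eta^2} Q(\eta)=2 \sum_{i=1}^n X_i^2>0 .
```

The second derivative is strictly positive provided not all $X_i=0$, so $\hat{\beta}$ is the global minimizer.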

Problem 3.

2.12. The location model is $Y_i=\mu+e_i$ for $i=1, \ldots, n$ where the $e_i$ are iid with mean $E\left(e_i\right)=0$ and constant variance $\operatorname{VAR}\left(e_i\right)=\sigma^2$. The least squares estimator $\hat{\mu}$ of $\mu$ minimizes the least squares criterion $Q(\eta)=\sum_{i=1}^n\left(Y_i-\eta\right)^2$. To find the least squares estimator, perform the following steps.

a) Find the derivative $\frac{d}{d \eta} Q$, set the derivative equal to zero, and solve for $\eta$. Call the solution $\hat{\mu}$.
b) To show that the solution was indeed the global minimizer of $Q$, show that $\frac{d^2}{d \eta^2} Q>0$ for all real $\eta$. (Then the solution $\hat{\mu}$ is a local min and $Q$ is convex, so $\hat{\mu}$ is the global min.)
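For reference, the two steps run as follows (a sketch):

```latex
\frac{d}{d \eta} Q(\eta)=-2 \sum_{i=1}^n\left(Y_i-\eta\right)=0
\;\Longrightarrow\;
\hat{\mu}=\frac{1}{n} \sum_{i=1}^n Y_i=\bar{Y},
\qquad
\frac{d^2}{d \eta^2} Q(\eta)=2 n>0 .
```

Since the second derivative is positive for all real $\eta$, $Q$ is convex and the sample mean $\bar{Y}$ is the global minimizer.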

Problem 4.

2.14. Suppose that the regression model is $Y_i=10+2 X_{i 2}+\beta_3 X_{i 3}+e_i$ for $i=1, \ldots, n$ where the $e_i$ are iid $N\left(0, \sigma^2\right)$ random variables. The least squares criterion is $Q\left(\eta_3\right)=\sum_{i=1}^n\left(Y_i-10-2 X_{i 2}-\eta_3 X_{i 3}\right)^2$. Find the least squares estimator $\hat{\beta}_3$ of $\beta_3$ by setting the first derivative $\frac{d}{d \eta_3} Q\left(\eta_3\right)$ equal to zero. Show that your $\hat{\beta}_3$ is the global minimizer of the least squares criterion $Q$ by showing that the second derivative $\frac{d^2}{d \eta_3^2} Q\left(\eta_3\right)>0$ for all values of $\eta_3$.



Problem 1.

11.1. Suppose $Y_i=\boldsymbol{x}_i^T \boldsymbol{\beta}+\epsilon_i$ where the errors are independent $N\left(0, \sigma^2\right)$. Then the likelihood function is
$$
L\left(\boldsymbol{\beta}, \sigma^2\right)=\left(2 \pi \sigma^2\right)^{-n / 2} \exp \left(\frac{-1}{2 \sigma^2}|\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta}|^2\right) .
$$
a) Since the least squares estimator $\hat{\boldsymbol{\beta}}$ minimizes $|\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta}|^2$, show that $\hat{\boldsymbol{\beta}}$ is the MLE of $\boldsymbol{\beta}$.
b) Then find the MLE $\hat{\sigma}^2$ of $\sigma^2$.
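A sketch of the argument (standard MLE calculus, not the book's wording): for fixed $\sigma^2$ the log likelihood decreases in $|\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta}|^2$, so it is maximized at the least squares estimator $\hat{\boldsymbol{\beta}}$; then profile over $\sigma^2$.

```latex
\begin{aligned}
\log L\left(\boldsymbol{\beta}, \sigma^2\right)
 &=-\frac{n}{2} \log \left(2 \pi \sigma^2\right)-\frac{1}{2 \sigma^2}|\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta}|^2, \\
\frac{\partial}{\partial \sigma^2} \log L\left(\hat{\boldsymbol{\beta}}, \sigma^2\right)
 &=-\frac{n}{2 \sigma^2}+\frac{|\boldsymbol{y}-\boldsymbol{X} \hat{\boldsymbol{\beta}}|^2}{2 \sigma^4}=0
 \;\Longrightarrow\;
 \hat{\sigma}^2=\frac{|\boldsymbol{y}-\boldsymbol{X} \hat{\boldsymbol{\beta}}|^2}{n} .
\end{aligned}
```

A second-derivative (or boundary) check confirms this critical point is the maximizer.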

Problem 2.

11.3. Suppose $Y_i=\boldsymbol{x}_i^T \boldsymbol{\beta}+\epsilon_i$ where the errors are independent $N\left(0, \sigma^2 / w_i\right)$ where $w_i>0$ are known constants. Then the likelihood function is
$$
L\left(\boldsymbol{\beta}, \sigma^2\right)=\left(\prod_{i=1}^n \sqrt{w_i}\right)\left(\frac{1}{\sqrt{2 \pi}}\right)^n \frac{1}{\sigma^n} \exp \left(\frac{-1}{2 \sigma^2} \sum_{i=1}^n w_i\left(y_i-\boldsymbol{x}_i^T \boldsymbol{\beta}\right)^2\right) .
$$
a) Suppose that $\hat{\boldsymbol{\beta}}_W$ minimizes $\sum_{i=1}^n w_i\left(y_i-\boldsymbol{x}_i^T \boldsymbol{\beta}\right)^2$. Show that $\hat{\boldsymbol{\beta}}_W$ is the MLE of $\boldsymbol{\beta}$.
b) Then find the MLE $\hat{\sigma}^2$ of $\sigma^2$.

Problem 3.

11.2. Suppose $Y_i=\boldsymbol{x}_i^T \boldsymbol{\beta}+e_i$ where the errors are iid double exponential $(0, \sigma)$ where $\sigma>0$. Then the likelihood function is
$$
L(\boldsymbol{\beta}, \sigma)=\frac{1}{2^n} \frac{1}{\sigma^n} \exp \left(\frac{-1}{\sigma} \sum_{i=1}^n\left|Y_i-\boldsymbol{x}_i^T \boldsymbol{\beta}\right|\right) .
$$
Suppose that $\tilde{\boldsymbol{\beta}}$ is a minimizer of $\sum_{i=1}^n\left|Y_i-\boldsymbol{x}_i^T \boldsymbol{\beta}\right|$.
a) By direct maximization, show that $\tilde{\boldsymbol{\beta}}$ is an MLE of $\boldsymbol{\beta}$ regardless of the value of $\sigma$.
b) Find an MLE of $\sigma$ by maximizing
$$
L(\sigma) \equiv L(\tilde{\boldsymbol{\beta}}, \sigma)=\frac{1}{2^n} \frac{1}{\sigma^n} \exp \left(\frac{-1}{\sigma} \sum_{i=1}^n\left|Y_i-\boldsymbol{x}_i^T \tilde{\boldsymbol{\beta}}\right|\right) .
$$
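For part b), writing $S=\sum_{i=1}^n\left|Y_i-\boldsymbol{x}_i^T \tilde{\boldsymbol{\beta}}\right|$, one possible route is:

```latex
\begin{aligned}
\log L(\sigma) &=-n \log 2-n \log \sigma-\frac{S}{\sigma}, \\
\frac{d}{d \sigma} \log L(\sigma) &=-\frac{n}{\sigma}+\frac{S}{\sigma^2}=0
 \;\Longrightarrow\; \hat{\sigma}=\frac{S}{n} .
\end{aligned}
```

The derivative is positive for $\sigma<S / n$ and negative for $\sigma>S / n$, so $\hat{\sigma}=S / n$ is the global maximizer.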

Problem 4.

11.8. Let $Y \sim N\left(\mu, \sigma^2\right)$ so that $E(Y)=\mu$ and $\operatorname{Var}(Y)=\sigma^2=E\left(Y^2\right)-$ $[E(Y)]^2$. If $k \geq 2$ is an integer, then
$$
E\left(Y^k\right)=(k-1) \sigma^2 E\left(Y^{k-2}\right)+\mu E\left(Y^{k-1}\right) .
$$
Let $Z=(Y-\mu) / \sigma \sim N(0,1)$. Hence $\mu_k=E\left[(Y-\mu)^k\right]=\sigma^k E\left(Z^k\right)$. Use this fact and the above recursion relationship $E\left(Z^k\right)=(k-1) E\left(Z^{k-2}\right)$ to find a) $\mu_3$ and b) $\mu_4$.
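The recursion can be coded directly. A minimal sketch (the function name is my own):

```python
# Standard normal moments via the recursion E(Z^k) = (k-1) E(Z^{k-2}),
# with base cases E(Z^0) = 1 and E(Z^1) = 0, then mu_k = sigma^k E(Z^k).
def std_normal_moment(k):
    """E(Z^k) for Z ~ N(0,1), computed by the recursion."""
    if k == 0:
        return 1.0
    if k == 1:
        return 0.0
    return (k - 1) * std_normal_moment(k - 2)

sigma = 2.0
mu3 = sigma**3 * std_normal_moment(3)   # central third moment
mu4 = sigma**4 * std_normal_moment(4)   # central fourth moment
print(mu3, mu4)                          # 0.0 and 3*sigma**4 = 48.0
```

The recursion gives $E\left(Z^3\right)=0$ and $E\left(Z^4\right)=3$, so $\mu_3=0$ and $\mu_4=3 \sigma^4$.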
