Statistics Assignment Help | STAT501 Regression Analysis

Statistics-lab™ offers assignment, exam, and tutoring help for psu.edu STAT501 Regression Analysis.

STAT501 Regression Analysis Course Overview

Credits: 3
This graduate-level course offers an introduction to regression analysis. A researcher is often interested in using sample data to investigate relationships, with the ultimate goal of creating a model to predict a future value of some dependent variable. The process of finding the mathematical model that best fits the data involves regression analysis.
STAT 501 is an applied linear regression course that emphasizes data analysis and interpretation. Generally, statistical regression is a collection of methods for determining and using models that explain how a response variable (dependent variable) relates to one or more explanatory variables (predictor variables).

PREREQUISITES 

This graduate level course covers the following topics:

  • Understanding the context for simple linear regression.
  • How to evaluate simple linear regression models
  • How a simple linear regression model is used to estimate and predict likely values
  • Understanding the assumptions that need to be met for a simple linear regression model to be valid
  • How multiple predictors can be included into a regression model
  • Understanding the assumptions that need to be met when multiple predictors are included in the regression model for the model to be valid
  • How a multiple linear regression model is used to estimate and predict likely values
  • Understanding how categorical predictors can be included into a regression model
  • How to transform data in order to deal with problems identified in the regression model
  • Strategies for building regression models
  • Distinguishing between outliers and influential data points and how to deal with these
  • Handling problems typically encountered in regression contexts
  • Alternative methods for estimating a regression line besides using ordinary least squares
  • Understanding regression models in time dependent contexts
  • Understanding regression models in non-linear contexts

STAT501 Regression Analysis HELP (EXAM HELP, ONLINE TUTOR)

Problem 1.

Exercise 1 (Should you regress $\boldsymbol{Y}$ on $\boldsymbol{X}$ or vice-versa?) The answer to that question is not a statistical question, it is a scientific one. Do you have a theory that makes one variable dependent, and the other independent? The statistical question is what difference does it make?
Suppose your model is
$$
Y=\beta_0 1+\beta_1 X+\varepsilon
$$
Let $\hat{\beta}_1$ be the least squares estimate of $\beta_1$ for this model.
Now suppose you try the model
$$
X=\alpha_0+\alpha_1 Y+\eta .
$$
Let $\hat{\alpha}_1$ be the least squares estimate of $\alpha_1$.

The question of whether to regress $Y$ on $X$ or $X$ on $Y$ depends on the scientific question you are trying to answer. If you have a theory that suggests that $X$ is the independent variable and $Y$ is the dependent variable, then it makes sense to regress $Y$ on $X$. Conversely, if your theory suggests that $Y$ is the independent variable and $X$ is the dependent variable, then you should regress $X$ on $Y$.

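From a statistical point of view, however, the two regressions generally give different fitted lines. Write $S_{XY}=\sum_i (X_i-\bar{X})(Y_i-\bar{Y})$, $S_{XX}=\sum_i (X_i-\bar{X})^2$, and $S_{YY}=\sum_i (Y_i-\bar{Y})^2$. The least squares estimates are $\hat{\beta}_1 = S_{XY}/S_{XX}$ and $\hat{\alpha}_1 = S_{XY}/S_{YY}$. The two slopes therefore always share the sign of $S_{XY}$, and their product is the squared sample correlation: $\hat{\beta}_1 \hat{\alpha}_1 = r^2 \le 1$. Consequently $\hat{\alpha}_1 = 1/\hat{\beta}_1$ only when the points lie exactly on a line ($r^2 = 1$).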

This is because the two fits minimize different quantities. The model $Y=\beta_0 + \beta_1 X + \varepsilon$ is fitted by minimizing the squared vertical deviations of $Y$ from the line, and it describes how the mean of $Y$ changes with $X$. The model $X=\alpha_0 + \alpha_1 Y + \eta$ is fitted by minimizing the squared horizontal deviations of $X$ from the line, and it describes how the mean of $X$ changes with $Y$. Unless the data are perfectly linear, these are two different lines.

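Note that multicollinearity is not a concern in either regression. Multicollinearity refers to strong correlation among several predictors, and each model here has a single predictor. A high correlation between $X$ and $Y$ does not destabilize the estimates; on the contrary, the closer the squared sample correlation is to 1, the closer the two fitted lines are to each other.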

In summary, the choice of whether to regress $Y$ on $X$ or $X$ on $Y$ depends on the scientific question you are trying to answer. Statistically, the two regressions give different fitted lines unless the points lie exactly on a line, so the direction of the regression should be fixed by which variable you want to explain or predict.
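The relationship between the two slopes can be checked numerically. The following is a minimal sketch using only the Python standard library; the data values and the helper name `ols_slope` are hypothetical and chosen purely for illustration. It computes the slope of each regression and confirms that their product equals the squared sample correlation, so that $\hat{\alpha}_1$ is not simply $1/\hat{\beta}_1$.

```python
from statistics import mean


def ols_slope(x, y):
    """Least squares slope from regressing y on x with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx


# Hypothetical data, not taken from the exercise.
X = [1, 2, 3, 4, 5]
Y = [4, 2, 8, 6, 10]

beta1_hat = ols_slope(X, Y)   # slope when Y is regressed on X
alpha1_hat = ols_slope(Y, X)  # slope when X is regressed on Y

# The product of the two slopes is the squared sample correlation.
r_squared = beta1_hat * alpha1_hat

print("beta1_hat  =", beta1_hat)
print("alpha1_hat =", alpha1_hat)
print("1/beta1_hat =", 1 / beta1_hat)
print("r^2 =", r_squared)
```

For these data the two slopes have the same sign, but `alpha1_hat` differs from `1 / beta1_hat` because the points do not lie exactly on a line.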

Problem 2.

Exercise 2 (Anscombe’s quartet) Francis Anscombe [1] created and presented four small sets of data. You can find them on the course web site at http://www.math.caltech.edu/%7E2015-16/2term/ma003/Data/Anscombe1.txt, Anscombe2, Anscombe3, and Anscombe4. Each data set has 11 observations on two variates labeled $X$ and $Y$. (Each file has a header line.)

  1. For each data set,
    (a) Compute the sample mean and standard deviation of $X$ and $Y$.
    (b) Regress $Y$ on $X$ and a constant term. Explain the computations you are doing. (If you omit this information, we shall be unable to award partial credit if your calculations are incorrect.)
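Part (a) can be sketched in Python as follows. This is only an illustration under stated assumptions: the files are assumed to be whitespace-delimited with the $X$ column first and the $Y$ column second, the helper name `read_xy` is hypothetical, and the in-memory sample below stands in for a real file with just three made-up rows plus a header.

```python
import io
from statistics import mean, stdev


def read_xy(stream):
    """Read two numeric columns (X then Y), skipping the header line.

    Assumes whitespace-delimited text; adjust the split if the real
    files use a different delimiter.
    """
    rows = [line.split() for line in stream if line.strip()]
    xs = [float(row[0]) for row in rows[1:]]  # rows[0] is the header
    ys = [float(row[1]) for row in rows[1:]]
    return xs, ys


# Stand-in for an opened data file: a header line and three rows.
sample = io.StringIO("X Y\n10 8.04\n8 6.95\n13 7.58\n")
xs, ys = read_xy(sample)

x_mean, y_mean = mean(xs), mean(ys)
x_sd, y_sd = stdev(xs), stdev(ys)  # sample standard deviation (n - 1)

print("mean X:", x_mean, "sd X:", x_sd)
print("mean Y:", y_mean, "sd Y:", y_sd)
```

With a downloaded file, the same function could be called as `read_xy(open("Anscombe1.txt"))`, where the local file name is again an assumption.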

The requested computations for the data sets in Anscombe’s quartet are as follows.

For the Anscombe1 data set:

(a) Sample mean and standard deviation of $X$ and $Y$:

\begin{align*}
\bar{X} &= 9.0, \quad \bar{Y} \approx 7.5 \\
s_X &\approx 3.3166, \quad s_Y \approx 2.0316
\end{align*}

(b) Regression of $Y$ on $X$ and a constant term: The linear regression model is given by:

$$
Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \qquad i=1,\ldots,11
$$

where $\epsilon_i$ are random errors with zero mean and constant variance. The least squares estimates of the regression coefficients $\beta_0$ and $\beta_1$ are given by:

\begin{align*}
\hat{\beta}_1 &= \frac{\sum_{i=1}^{11}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{11}(X_i - \bar{X})^2} \approx 0.5 \\
\hat{\beta}_0 &= \bar{Y} - \hat{\beta}_1 \bar{X} \approx 3
\end{align*}

Thus, the estimated regression line is:

$$
\hat{Y} = 3 + 0.5 X
$$
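The computation for the first data set can be reproduced with a short Python sketch. Since the course file is not reproduced here, the values below are the standard published values of Anscombe's first data set, and it is an assumption that the course file contains the same numbers. The variable names are illustrative.

```python
from statistics import mean, stdev

# Standard published values of Anscombe's first data set
# (assumed to match the course file Anscombe1.txt).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

# Part (a): sample means and sample standard deviations (n - 1 divisor).
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)

# Part (b): least squares slope and intercept for Y = b0 + b1 * X.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

print(f"mean X = {x_bar:.4f}, mean Y = {y_bar:.4f}")
print(f"sd X   = {s_x:.4f}, sd Y   = {s_y:.4f}")
print(f"slope  = {b1:.4f}, intercept = {b0:.4f}")
```

The slope is the ratio of the centered cross-product sum to the centered sum of squares of $X$, and the intercept forces the fitted line through the point $(\bar{X}, \bar{Y})$, matching the formulas above.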

For the Anscombe2 data set:

(a) Sample mean and standard deviation of $X$ and $Y$:

\begin{align*}
\bar{X} &= 9.0, \quad \bar{Y} \approx 7.5 \\
s_X &\approx 3.3166, \quad s_Y \approx 2.0317
\end{align*}

(b) Regression of $Y$ on $X$ and a constant term: The linear regression model is given by:

$$
Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \qquad i=1,\ldots,11
$$

where $\epsilon_i$ are random errors with zero mean and constant variance. The least squares estimates of the regression coefficients $\beta_0$ and $\beta_1$ are given by:

\begin{align*}
\hat{\beta}_1 &= \frac{\sum_{i=1}^{11}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{11}(X_i - \bar{X})^2} \approx 0.5 \\
\hat{\beta}_0 &= \bar{Y} - \hat{\beta}_1 \bar{X} \approx 3
\end{align*}

Thus, the estimated regression line is:

$$
\hat{Y} = 3 + 0.5 X
$$

For the Anscombe3 data set:

(a) Sample mean and standard deviation of $X$ and $Y$:

\begin{align*}
\bar{X} &= 9.0, \quad \bar{Y} = 7.5 \\
s_X &\approx 3.3166, \quad s_Y \approx 2.0304
\end{align*}

(b) Regression of $Y$ on $X$ and a constant term: The linear regression model is given by:

$$
Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \qquad i=1,\ldots,11
$$
