## The Quadratic Model in Two or More $X$ Variables

statistics-lab™ supports your studies abroad. We have built a reputation for reliable, high-quality, and original Statistics help with Regression Analysis; our experts have extensive experience with Regression Analysis coursework of every kind.


The general quadratic response surface as given in the introduction to this chapter is $f\left(x_1, x_2\right)=\beta_0+\beta_1 x_1+\beta_2 x_1^2+\beta_3 x_2+\beta_4 x_2^2+\beta_5 x_1 x_2$. An example for particular choices of the coefficients $\beta$ is shown in Figure 9.3. Notice that the response function is curved, not planar, and is, therefore, more realistic. The term “response surface” is sometimes used instead of “response function” in the case of two or more $X$ variables; the “surface” term is explained by the appearance of the graph of the function in Figure 9.3.
In addition to modeling and testing for curvature in higher dimensional space, quadratic models are also useful for identifying an optimal combination of $X$ values that maximizes or minimizes the response function; see the “rsm” package of $\mathrm{R}$ for more information. While quadratic models are more flexible (and therefore more realistic) than planar models, they can have poor extrapolation properties and are often less realistic than the similarly flexible, curved class of response surfaces known as neural network regression models. In Chapter 17, we compare polynomial regression models with neural network regression models.
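Finding such an optimum amounts to setting both partial derivatives of the fitted quadratic surface to zero. The following sketch, not taken from the book, fits the full quadratic model with `lm` to simulated data and solves the resulting 2×2 linear system (the rsm package automates this step); all variable names and coefficient values here are illustrative.

```r
# A sketch, not from the book: fit the full quadratic surface to simulated
# data and solve for its stationary point. All names/numbers are illustrative.
set.seed(1)
n  <- 200
x1 <- runif(n, -2, 2); x2 <- runif(n, -2, 2)
# True surface 10 - (x1-1)^2 - (x2+1)^2 + 0.5*x1*x2 has its maximum
# at (x1, x2) = (0.8, -0.8) once the cross term is accounted for.
y <- 10 - (x1 - 1)^2 - (x2 + 1)^2 + 0.5*x1*x2 + rnorm(n, 0, 0.5)
fit <- lm(y ~ x1 + I(x1^2) + x2 + I(x2^2) + x1:x2)
b <- coef(fit)
# Setting both partial derivatives to zero gives a 2x2 linear system:
#   b1 + 2*b2*x1 + b5*x2 = 0 ,  b3 + b5*x1 + 2*b4*x2 = 0
A <- matrix(c(2*b["I(x1^2)"], b["x1:x2"],
              b["x1:x2"],     2*b["I(x2^2)"]), 2, 2)
opt <- solve(A, -c(b["x1"], b["x2"]))
round(opt, 2)   # close to (0.8, -0.8)
```

With a well-estimated surface, the stationary point recovered this way is close to the true optimum of the data-generating surface.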

## Interaction (or Moderator) Analysis

The commonly-used interaction model is a special case of the general quadratic model, involving the interaction term but no quadratic terms. When performing interaction analysis, you typically will assume the following conditional mean function:
$$\mathrm{E}\left(Y \mid X_1=x_1, X_2=x_2\right)=\beta_0+\beta_1 x_1+\beta_2 x_2+\beta_3 x_1 x_2$$
A slight modification of the Product Complexity example provides a case study in which interaction is needed. Suppose you measure $Y=$ Intent to Purchase a Luxury Product, say expensive jewelry, using a survey of consumers. You also measure the attractiveness $\left(X_1\right)$ of a web design used to display and promote the product, measured on a scale from 1 to 10, with $10=$ most attractive design, and the person's income $\left(X_2\right)$, measured on a scale from 1 to 5, with $5=$ most wealthy.

Figure 9.4 shows an example of how this conditional mean function might look. Like the quadratic response surface, it is a curved function in space, not a plane. But note in particular that the effect of $X_1$, Attractiveness of Web Design, on $Y=$ Intent to Purchase, depends on the value of $X_2$, Income: For consumers with the lowest income, $X_2=1$, the slice of the surface corresponding to $X_2=1$ is nearly flat as a function of $X_1=$ Attractiveness of Web Design. That is to say, for people with the lowest income, Attractiveness of Web Design has little effect on Intent to Purchase this luxury product. No surprise! They do not have enough money to purchase luxury items, so the web design is mostly irrelevant to them. On the other hand, for people with the highest income $\left(X_2=5\right)$, the slice of the surface corresponding to $X_2=5$ increases substantially as a function of $X_1=$ Attractiveness of Web Design. Thus, this single model states both (i) that Attractiveness of Web Design $\left(X_1\right)$ has little effect on Intention to Purchase a Luxury Product for people with little money, and (ii) that Attractiveness of Web Design $\left(X_1\right)$ has a substantial effect on Intention to Purchase a Luxury Product for people with lots of money.
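The moderation is easy to see algebraically: rewriting the conditional mean as $\mathrm{E}(Y \mid x_1, x_2)=\left(\beta_0+\beta_2 x_2\right)+\left(\beta_1+\beta_3 x_2\right) x_1$ shows that the slope on $X_1$ is itself a linear function of $X_2$. A minimal sketch with hypothetical coefficients (our own illustrative numbers, not estimates from the book's Figure 9.4):

```r
# Hypothetical interaction-model coefficients (illustrative only)
b0 <- 1.0; b1 <- 0.05; b2 <- 0.2; b3 <- 0.15
# Slope of E(Y | x1, x2) with respect to x1, as a function of x2 (Income)
slope_x1 <- function(x2) b1 + b3 * x2
slope_x1(1)   # lowest income:  0.05 + 0.15*1 = 0.20 (nearly flat)
slope_x1(5)   # highest income: 0.05 + 0.15*5 = 0.80 (steep)
```

With these numbers, the Web Design effect is four times larger for the wealthiest respondents than for the poorest, which is exactly the pattern the text describes.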


## The Adjusted R-Squared Statistic


Recall that, in the classical model, $\Omega^2=1-\sigma^2 / \sigma_Y^2$, and that the standard $R^2$ statistic replaces the two variances with their maximum likelihood estimates. Recall also that maximum likelihood estimates of variance are biased. With a larger number of predictor variables (i.e., larger $k$), the estimate $\hat{\sigma}^2=\mathrm{SSE} / n$ becomes increasingly biased downward, implying in turn that the ordinary $R^2$ becomes increasingly biased upward.
Replacing the two variances with their unbiased estimates gives the adjusted $R^2$ statistic:
$$R_a^2=1-\frac{\mathrm{SSE} /(n-k-1)}{\mathrm{SST} /(n-1)}$$
The adjusted $R^2$ statistic is still biased as an estimator of $\Omega^2=1-\sigma^2 / \sigma_Y^2$ because of Jensen’s inequality, but it is less biased than the ordinary $R^2$ statistic. You can interpret the adjusted $R^2$ statistic in the same way as the ordinary one.

Which estimate is best, adjusted $R^2$ or ordinary $R^2$? You guessed it: use simulation to find out. Despite its reduced bias, the adjusted $R^2$ is not necessarily closer to the true $\Omega^2$, as simulations will show. In addition, the adjusted $R^2$ statistic can be less than 0.0, which is clearly undesirable. The ordinary $R^2$, like the estimand $\Omega^2$, always lies between 0 and 1 (inclusive).
The following R code locates these $R^2$ statistics in the `lm` output, and computes them "by hand" as well, using the model where Car Sales is predicted using a quadratic function of Interest Rate.
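The Car Sales data are not reproduced in this excerpt, so the sketch below uses simulated stand-in data (variable names are ours): it recomputes both statistics from SSE and SST and checks them against the `lm` summary.

```r
# Stand-in for the book's Car Sales example: compute R-squared and
# adjusted R-squared "by hand" and check against summary(lm(...)).
set.seed(123)
n     <- 40
rate  <- runif(n, 2, 10)
sales <- 100 - 8*rate + 0.4*rate^2 + rnorm(n, 0, 5)   # simulated data
fit <- lm(sales ~ rate + I(rate^2))
k   <- 2                                # number of predictor terms
SSE <- sum(resid(fit)^2)
SST <- sum((sales - mean(sales))^2)
R2     <- 1 - SSE/SST
R2.adj <- 1 - (SSE/(n - k - 1)) / (SST/(n - 1))
s <- summary(fit)
c(R2, s$r.squared)           # identical
c(R2.adj, s$adj.r.squared)   # identical
```

The "by hand" values match `s$r.squared` and `s$adj.r.squared` exactly, since those summary components are defined by the same formulas.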

## The $F$ Test

See the $\mathrm{R}$ output a few lines above: Underneath the $R^2$ statistic is the $F$-statistic. This statistic is related to the $R^2$ statistic in that it is also a function of SST and SSE (review Figure 8.2). It is given by
$$F=\frac{(\mathrm{SST}-\mathrm{SSE}) / k}{\mathrm{SSE} /(n-k-1)}$$
If you add the line `((SST-SSE)/2)/(SSE/(n-3))` to the R code above, you will get the reported $F$-statistic, although with more decimals: 62.21945.

With a little algebra, you can relate the $F$-statistic directly to the $R^2$ statistic, showing that for fixed $k$ and $n$, larger $R^2$ corresponds to larger $F$ :
$$F=\frac{n-k-1}{k} \times \frac{R^2}{1-R^2}$$
Self-study question: Why is the equation relating $F$ to $R^2$ true?
The $F$-statistic is used to test the global null hypothesis $\mathrm{H}_0: \beta_1=\beta_2=\ldots=\beta_k=0$, which states that none of the regression variables $X_1, X_2, \ldots, X_k$ is related to $Y$. Under the classical model where $\mathrm{H}_0: \beta_1=\beta_2=\ldots=\beta_k=0$ is true, the $F$-statistic has a precise and well-known distribution:
$$F \sim F_{k, n-k-1}$$
where $F_{k, n-k-1}$ is the $F$ distribution with $k$ numerator degrees of freedom and $n-k-1$ denominator degrees of freedom.
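Both formulas for $F$ are easy to verify numerically. A sketch on simulated data (not the book's example): compute $F$ from SST and SSE, again from the $R^2$ identity, compare with the value `lm` reports, and get the $p$-value from `pf()`.

```r
# Verify the two F formulas against lm, on simulated data
set.seed(1)
n <- 50; k <- 2
x <- runif(n); x2 <- x^2
y <- 1 + 2*x + rnorm(n)
fit <- lm(y ~ x + x2)
SSE <- sum(resid(fit)^2)
SST <- sum((y - mean(y))^2)
F1 <- ((SST - SSE)/k) / (SSE/(n - k - 1))   # definition
R2 <- 1 - SSE/SST
F2 <- ((n - k - 1)/k) * R2/(1 - R2)         # R-squared identity
p  <- 1 - pf(F1, k, n - k - 1)              # upper-tail p-value
c(F1, F2, summary(fit)$fstatistic[["value"]])   # all three agree
```

The agreement of `F1` and `F2` is exact algebra, not an approximation: substituting $R^2=(\mathrm{SST}-\mathrm{SSE})/\mathrm{SST}$ into the identity reproduces the definition term by term.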


## Multiple Regression from the Matrix Point of View

In the case of simple regression, you saw that the OLS estimate of the slope has a simple form: it is the estimated covariance of the $(X, Y)$ distribution divided by the estimated variance of the $X$ distribution, or $\hat{\beta}_1=\hat{\sigma}_{x y} / \hat{\sigma}_x^2$. There is no such simple formula in multiple regression. Instead, you must use matrix algebra, involving matrix multiplication and matrix inverses. If you are unfamiliar with basic matrix algebra, including multiplication, addition, subtraction, transpose, the identity matrix, and the matrix inverse, you should take some time now to get acquainted with those particular concepts before reading on. (Perhaps you can locate a "matrix algebra for beginners" type of web page.) Done? Ok, read on.
Our first use of matrix algebra in regression is to give a concise representation of the regression model. Multiple regression models refer to $n$ observations and $k$ variables, both of which can be in the thousands or even millions. The following matrix form of the model provides a very convenient shorthand for all this information:
$$\boldsymbol{Y}=\mathbf{X} \boldsymbol{\beta}+\boldsymbol{\varepsilon}$$
This concise form covers all $n$ observations and all $k$ of the $X$ variables in one simple equation. Note that there are boldface non-italic terms and boldface italic terms in the expression. To make the material easier to read, we use the convention that boldface non-italic means a matrix, while boldface italic refers to a vector, which is a matrix with a single column. Thus $\boldsymbol{Y}$, $\boldsymbol{\beta}$, and $\boldsymbol{\varepsilon}$ are vectors (single-column matrices), while $\mathbf{X}$ is a matrix having multiple columns.

## The Least Squares Estimates in Matrix Form

One use of matrix algebra is to display the model for all $n$ observations and all $X$ variables succinctly, as shown above. Another use is to identify the OLS estimates of the $\beta$'s. There is simply no way to display the OLS estimates other than by using matrix algebra, as follows:
$$\hat{\boldsymbol{\beta}}=\left(\mathbf{X}^{\mathrm{T}} \mathbf{X}\right)^{-1} \mathbf{X}^{\mathrm{T}} \boldsymbol{Y}$$
(The "$\mathrm{T}$" symbol denotes the transpose of the matrix.) To see why the OLS estimates have this matrix representation, recall that in the simple, classical regression model, the maximum likelihood (ML) estimates must minimize the sum of squared "errors," called SSE.
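The matrix formula is easy to check numerically. A minimal sketch with simulated data: build the design matrix with a leading column of ones, apply $\left(\mathbf{X}^{\mathrm{T}} \mathbf{X}\right)^{-1} \mathbf{X}^{\mathrm{T}} \boldsymbol{Y}$ directly, and compare with `coef(lm(...))`.

```r
# Direct check of the matrix OLS formula against lm, on simulated data
set.seed(7)
n  <- 30
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 2 + 3*x1 - 1.5*x2 + rnorm(n)
X  <- cbind(1, x1, x2)                     # design matrix: intercept column first
beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y
cbind(matrix = as.vector(beta.hat), lm = as.vector(coef(lm(y ~ x1 + x2))))
```

(In production code, `solve(t(X) %*% X)` is avoided in favor of the numerically stabler QR decomposition that `lm` itself uses, but the two give the same answer here.)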
The same is true in multiple regression: the ML estimates must minimize the function
$$\operatorname{SSE}\left(\beta_0, \beta_1, \ldots, \beta_k\right)=\sum_{i=1}^n\left\{y_i-\left(\beta_0+\beta_1 x_{i 1}+\cdots+\beta_k x_{i k}\right)\right\}^2$$
In the case of two $X$ variables $(k=2)$, you are to choose $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\beta}_2$ that define the plane, $f\left(x_1, x_2\right)=\hat{\beta}_0+\hat{\beta}_1 x_1+\hat{\beta}_2 x_2$, such as the one shown in Figure 6.3, that minimizes the sum of squared vertical deviations from the 3-dimensional point cloud $\left(x_{i 1}, x_{i 2}, y_i\right)$, $i=1,2, \ldots, n$. Figure 7.1 illustrates the concept.

## Evaluating the Linearity Assumption Using Hypothesis Testing Methods

Here, we will get slightly ahead of the flow of the book, because multiple regression is covered in the next chapter. A simple, powerful way to test for curvature is to use a multiple regression model that includes a quadratic term. The quadratic regression model is given by:
$$Y=\beta_0+\beta_1 X+\beta_2 X^2+\varepsilon$$
This model assumes that, if there is curvature, then it takes a quadratic form. The logic for making this assumption is given by Taylor's Theorem, which states that many types of curved functions are well approximated by quadratic functions.

Testing methods require restricted (null) and unrestricted (alternative) models. Here, the null model enforces the restriction that $\beta_2=0$; thus the null model states that the mean response is a linear (not curved) function of $x$. So-called "insignificance" (determined historically by $p>0.05$) of the estimate of $\beta_2$ means that the evidence of curvature in the observed data, as indicated by a non-zero estimate of $\beta_2$ or by a curved LOESS fit, is explainable by chance alone under the linear model.
"Significance" (determined historically by $p<0.05$) means that such evidence of curvature is not easily explained by chance alone under the linear model. But you should not take the result of this $p$-value based test as a "recipe" for model construction. If "significant," you should not automatically assume a curved model. Instead, you should ask, "Is the curvature dramatic enough to warrant the additional modeling complexity?" and "Do the predictions differ much, whether you use a model for curvature or the ordinary linear model?" If the answers to those questions are "No," then you should use the linear model anyway, even if it was "rejected" by the $p$-value based test. In addition, models employing curvature (particularly quadratics) are notoriously poor at the extremes of the $x$-range(s). So again, you can easily prefer the linear model, even if the curvature is "significant" $(p<0.05)$.

## Testing for Curvature with the Production Cost Data

The following R code illustrates the method.

```r
ProdC = read.table("https://raw.githubusercontent.com/andrea2719/URA-DataSets/master/ProdC.txt")
attach(ProdC)
plot(Widgets, Cost); abline(lsfit(Widgets, Cost))
Widgets.squared = Widgets^2
fit.quad = lm(Cost ~ Widgets + Widgets.squared); summary(fit.quad)
lines(spline(Widgets, predict(fit.quad)), col = "gray", lty = 2)
```

Figure 4.3 shows both the linear and quadratic (curved) fit to the data. Since the linear and quadratic fits are so similar, it (again) appears that there is no need to model the curvature explicitly in this example. Relevant lines from the summary of the fit are as follows:

```
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)     4.564e+02  7.493e+02   0.609    0.546
Widgets         9.149e-01  1.290e+00   0.709    0.483
Widgets.squared 2.923e-04  5.322e-04   0.549    0.586

Residual standard error: 241.3 on 37 degrees of freedom
Multiple R-squared: 0.7987,    Adjusted R-squared: 0.7878
F-statistic: 73.42 on 2 and 37 DF,  p-value: 1.318e-13
```

Notice the $p$-value for testing the $\beta_2=0$ restriction: since the $p$-value is 0.586, the difference between the coefficient 0.0002923 (2.923e-04) and 0.0 is explainable by chance alone. That is, even if the process were truly linear (i.e., even if $\beta_2=0$), you would often see quadratic coefficient estimates $\left(\hat{\beta}_2\right)$ as large as 0.0002923 when you fit a quadratic model to similar data. If this is confusing to you, just run a simulation from a similar linear process (where $\beta_2=0$), and fit a quadratic model. You will see a non-zero $\hat{\beta}_2$ in every simulated data set, and most will be within 2 standard errors of 0.0 (the $\hat{\beta}_2$ above is $T=0.549$ standard errors from 0.0).
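Here is one way to run that suggested simulation, as a sketch: the x values are stand-ins on roughly the Widgets scale and the parameter values are illustrative, not the book's. Generate repeatedly from a truly linear process, fit the quadratic each time, and record the $p$-value for $\hat{\beta}_2$. About 95% of the $p$-values exceed 0.05, exactly as theory says they should under the null.

```r
# Simulate from a truly linear process and fit a quadratic each time.
# x values and parameters are illustrative stand-ins, not the ProdC data.
set.seed(42)
n <- 40
x <- runif(n, 700, 1700)                  # roughly the Widgets range
pvals <- replicate(2000, {
  y <- 50 + 1.5*x + rnorm(n, 0, 250)      # linear truth: beta2 = 0
  summary(lm(y ~ x + I(x^2)))$coefficients["I(x^2)", "Pr(>|t|)"]
})
mean(pvals > 0.05)   # close to 0.95
```

Every replication produces a nonzero $\hat{\beta}_2$, yet the estimate is "insignificant" in about 95% of them: nonzero curvature estimates of this size are simply what chance produces under a linear process.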
## Simulation Study to Understand the Null Distribution of the $T$ Statistic

The following R code generates the data under the null model. It is the same model used in the SSN/Height example above, but re-written in a way that makes the constraint $\beta_1=0$ explicit. Because we did not use set.seed as in the previous code, the output is random.
Your results will differ, but that is good, because the point here is to understand randomness.

```r
n = 100
beta0 = 70; beta1 = 0    # The null model; true beta1 = 0
ssn = sample(0:9, 100, replace = T)
height = beta0 + beta1*ssn + rnorm(100, 0, 4)
ssn.data = data.frame(ssn, height)
fit.1 = lm(height ~ ssn, data = ssn.data)
summary(fit.1)
```

This code gives the following output (yours will vary by randomness):

```
Call:
lm(formula = height ~ ssn, data = ssn.data)

Residuals:
    Min      1Q  Median      3Q     Max
-9.4952 -2.8261 -0.3936  2.2521 11.6764

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 69.77372    0.71865  97.089   <2e-16 ***
ssn          0.01915    0.14336   0.134    0.894
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.155 on 98 degrees of freedom
Multiple R-squared: 0.000182,   Adjusted R-squared: -0.01002
F-statistic: 0.01784 on 1 and 98 DF,  p-value: 0.894
```

In our simulation, the estimate $\hat{\beta}_1=0.01915$ is $T=0.134$ standard errors from zero, and you know that this difference is explained by chance alone because the data are simulated from the null model where $\beta_1=0$.

## The $p$-Value

In the example above, the thresholds that determine which real $T$ values are explainable by chance alone are the numbers that put $95\%$ of the $T$ values that are explained by chance alone between them; these are $-1.9845$ and $+1.9845$ in the case of the $T_{98}$ distribution.
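You can verify those $\pm 1.9845$ thresholds by simulation (a sketch extending the code above; the replication count is our choice): generate many null data sets, collect the $T$ statistics for $\hat{\beta}_1$, and check that about 95% of them fall between $\pm\,$`qt(0.975, 98)`.

```r
# Null distribution of the T statistic: about 95% of null-model T values
# fall between the t thresholds -1.9845 and +1.9845 (df = 98).
set.seed(1)
Tstats <- replicate(2000, {
  ssn    <- sample(0:9, 100, replace = TRUE)
  height <- 70 + 0*ssn + rnorm(100, 0, 4)   # null model: true beta1 = 0
  summary(lm(height ~ ssn))$coefficients["ssn", "t value"]
})
qt(0.975, df = 98)                   # the 1.9845 threshold
mean(abs(Tstats) < qt(0.975, 98))    # close to 0.95
```

The simulated coverage is close to 0.95 because, under the classical null model, the $T$ statistic follows the $T_{98}$ distribution exactly.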
If the observed$T$statistic falls outside that range, then we can say that the difference between$\hat{\beta}_1$and 0 is not easily explained by chance alone. See Figure 3.7 again. Notice that there is$5 \%$total probability outside the \pm 1.9845 range, simply because there is$95 \%$probability inside the range. Now, if the$T$statistic falls inside the$95 \%$range, then there has to be more than$5 \%$total probability outside the$\pm T$range. See Figure 3.7 again, and suppose$T=1.7$, which is inside the range. Then there has to be more than$5 \%$probability outside the \pm 1.7 range, right? See Figure 3.7 again, and locate \pm 1.7 on the graph. Make sure you understand this; it is not hard at all. Do not just read the words, because then you will not understand. Instead, look at Figure 3.7, put your finger on the graph at 1.7 , and think about the area outside the \pm 1.7 range. It is more than 0.05 , do you see? Now, suppose$T=2.5$, and look at Figure 3.7 again. Then there has to be less than$5 \%$probability outside the \pm 2.5 range, right? See Figure 3.7 again, and locate \pm 2.5 on the graph. Make sure you understand this; it is not hard at all. Look at the graph! Do not just read the words! Instead, put your finger on the graph at 2.5 and think about the area outside the \pm 2.5 range. It is less than 0.05 , do you see? # 回归分析代写 ## 统计代写|回归分析作业代写Regression Analysis代考|Simulation Study to Understand the Null Distribution of the$T$Statistic 下面的$\mathrm{R}$代码生成null模型下的数据。它与上面的SSN/Height示例中使用的模型相同，但是以一种使约束$\beta_1=0$显式的方式重新编写。因为我们没有用集合。与前面的代码一样，输出是随机的。你的结果可能会有所不同，但这很好，因为这里的重点是理解随机性。$\mathrm{n}=100$beta$0=70 ;$betal$=0 \quad #$null模型;True betal = 0 ssn = sample$(0: 9,100$, replace=T) 高度= beta + beta *ssn + rnorm$(100,0,4)$ssn. 
ssn.Data = Data .frame (ssn, height) 适合。$1=$lm(高度ssn，数据=ssn.data) 总结(适合)1$)\mathrm{n}=100$beta$0=70 ;$betal$=0 \quad #$null模型;真betal$=0$SSN$=$样例$(0: 9,100$，替换$=T)$身高$=$beta$0+\operatorname{beta} 1 * \operatorname{ssn}+\operatorname{rnorm}(100,0,4)$ssn. ssn.数据=数据。框架(ssn, height) 适合。1 =$\operatorname{lm}($height ssn, datasssn。数据) 摘要(fit. 1) 这段代码给出了以下输出(你的输出会随随机而变化): 卡尔:$\operatorname{lm}$(公式$=$高度$\sim$ssn，数据$=$ssn。数据) 残差: 最小值$1 Q$中值$3 Q$最大值$-9.4952-2.8261-0.3936 \quad 2.252111 .6764$系数: 估计std误差$t$值$\operatorname{Pr}(>|t|)$(截音)$69.77372 \quad 0.71865 \quad 97.089<2 \mathrm{e}-16 \star \star $$\operatorname{ssn} 0.019150 .14336 \quad 0.134 \quad 0.894代码:0“0.001”“0.01”*“0.05”。0.1 ‘ 1 剩余标准误差:在98自由度上为4.155 多元r平方:0.000182，调整r平方:0.01002 f统计量:0.01784对1和98 \mathrm{DF}, p值:0.894 在我们的模拟中，估计值\hat{\beta}_1=0.01915是T=0.134离零的标准误差，并且您知道这种差异完全是偶然的，因为数据是从null模型模拟的，其中\beta_1=0。 ## 统计代写|回归分析作业代写Regression Analysis代考|The p-Value 在上面的例子中，确定哪些真实的T值只能由偶然解释的阈值是将只能由偶然解释的T值中的95 \%放在它们之间的数字;在T_{98}分布的情况下，它们是-1.9845和+1.9845。如果观察到的T统计值超出了这个范围，那么我们可以说\hat{\beta}_1和0之间的差异不容易单独用偶然来解释。 再次参见图3.7。注意，在\pm 1.9845范围之外有5 \%总概率，因为在范围内有95 \%概率。现在，如果T统计值落在95 \%范围内，那么在\pm T范围外的总概率必须大于5 \%。再次参见图3.7，假设T=1.7在范围内。那么在\pm 1.7范围之外的概率必须大于5 \%，对吧?再次参见图3.7，并在图中找到\pm 1.7。确保你理解了这一点;这一点也不难。不要只看文字，因为那样你不会明白。相反，请查看图3.7，将手指放在1.7处的图形上，并考虑\pm 1.7范围之外的区域。它大于0.05，你看到了吗? 现在，假设T=2.5，再次查看图3.7。那么在\pm 2.5范围之外的概率必须小于5 \%，对吧?再次参见图3.7，在图中找到\pm 2.5。确保你理解了这一点;这一点也不难。看这个图表!不要只看单词!相反，将手指放在2.5的图形上，并考虑\pm 2.5范围之外的区域。它小于0.05，你看到了吗? 统计代写请认准statistics-lab™. 
statistics-lab™为您的留学生涯保驾护航。 ## 随机过程代考 在概率论概念中，随机过程随机变量的集合。 若一随机系统的样本点是随机函数，则称此函数为样本函数，这一随机系统全部样本函数的集合是一个随机过程。 实际应用中，样本函数的一般定义在时间域或者空间域。 随机过程的实例如股票和汇率的波动、语音信号、视频信号、体温的变化，随机运动如布朗运动、随机徘徊等等。 ## 贝叶斯方法代考 贝叶斯统计概念及数据分析表示使用概率陈述回答有关未知参数的研究问题以及统计范式。后验分布包括关于参数的先验分布，和基于观测数据提供关于参数的信息似然模型。根据选择的先验分布和似然模型，后验分布可以解析或近似，例如，马尔科夫链蒙特卡罗 (MCMC) 方法之一。贝叶斯统计概念及数据分析使用后验分布来形成模型参数的各种摘要，包括点估计，如后验平均值、中位数、百分位数和称为可信区间的区间估计。此外，所有关于模型参数的统计检验都可以表示为基于估计后验分布的概率报表。 ## 广义线性模型代考 广义线性模型（GLM）归属统计学领域，是一种应用灵活的线性回归模型。该模型允许因变量的偏差分布有除了正态分布之外的其它分布。 statistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。 ## 机器学习代写 随着AI的大潮到来，Machine Learning逐渐成为一个新的学习热点。同时与传统CS相比，Machine Learning在其他领域也有着广泛的应用，因此这门学科成为不仅折磨CS专业同学的“小恶魔”，也是折磨生物、化学、统计等其他学科留学生的“大魔王”。学习Machine learning的一大绊脚石在于使用语言众多，跨学科范围广，所以学习起来尤其困难。但是不管你在学习Machine Learning时遇到任何难题，StudyGate专业导师团队都能为你轻松解决。 ## 多元统计分析代考 基础数据: N 个样本， P 个变量数的单样本，组成的横列的数据表 变量定性: 分类和顺序；变量定量：数值 数学公式的角度分为: 因变量与自变量 ## 时间序列分析代写 随机过程，是依赖于参数的一组随机变量的全体，参数通常是时间。 随机变量是随机现象的数量表现，其时间序列是一组按照时间发生先后顺序进行排列的数据点序列。通常一组时间序列的时间间隔为一恒定值（如1秒，5分钟，12小时，7天，1年），因此时间序列可以作为离散时间数据进行分析处理。研究时间序列数据的意义在于现实中，往往需要研究某个事物其随时间发展变化的规律。这就需要通过研究该事物过去发展的历史记录，以得到其自身发展的规律。 ## 回归分析代写 多元回归分析渐进（Multiple Regression Analysis Asymptotics）属于计量经济学领域，主要是一种数学上的统计分析方法，可以分析复杂情况下各影响因素的数学关系，在自然科学、社会和经济学等多个领域内应用广泛。 ## MATLAB代写 MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 
用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。 ## 统计代写|回归分析作业代写Regression Analysis代考|Unbiasedness of OLS Estimates Assuming the Classical Model: A Simulation Study 如果你也在 怎样代写回归分析Regression Analysis这个学科遇到相关的难题，请随时右上角联系我们的24/7代写客服。 回归分析是一种强大的统计方法，允许你检查两个或多个感兴趣的变量之间的关系。虽然有许多类型的回归分析，但它们的核心都是考察一个或多个自变量对因变量的影响。 statistics-lab™ 为您的留学生涯保驾护航 在代写回归分析Regression Analysis方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写回归分析Regression Analysis代写方面经验极为丰富，各种代写回归分析Regression Analysis相关的作业也就用不着说。 ## 统计代写|回归分析作业代写Regression Analysis代考|Unbiasedness of OLS Estimates Assuming the Classical Model: A Simulation Study To start a simulation study, you must specify the model and its parameter values, which in the case of the classical model will be the \mathrm{N}\left(\beta_0+\beta_1 x, \sigma^2\right) probability distribution, along with the three parameters \left(\beta_0, \beta_1, \sigma\right). These parameters are unknown, so just pick any values that make sense. No matter what values you pick for those parameters, the estimates you get are (i) random, and (ii) when unbiased, neither systematically above nor below those parameter values, in an average sense. In reality, Nature picks the actual values of the parameters \left(\beta_0, \beta_1, \sigma\right), and you do not know their values. In simulation studies, you pick the values \left(\beta_0, \beta_1, \sigma\right). The estimates \left(\hat{\beta}_0, \hat{\beta}_1\right., and \left.\hat{\sigma}\right) target those particular values, but with error that you know precisely because you know both the estimates and the true values. In the real world, with your real (not simulated) data, your estimates \hat{\beta}_0, \hat{\beta}_1, and \hat{\sigma} also target the true values \beta_0, \beta_1, and \sigma, but since you do not know the true values for your real data, you also do not know the error. 
Simulation allows you to understand this error, so you can better understand how your estimates $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\sigma}$ relate to Nature's true values $\beta_0$, $\beta_1$, and $\sigma$.

In the Production Cost example, the values $\beta_0=55$, $\beta_1=1.5$, $\sigma^2=250^2$ produce data that look reasonably similar to the actual data, as shown in Chapter 1. So let's pick those values for the simulation. No matter which values you pick for your simulation parameters $\beta_0$, $\beta_1$, and $\sigma$, the statistical estimates $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\sigma}$ "target" those values.

To make the abstractions concrete and understandable, run the following simulation code, which produces data exactly as indicated in Table 3.3.

```r
beta0 <- 55; beta1 <- 1.5; sigma <- 250   # Nature's parameters
Widgets <- c(1500, 800, 1500, 1400, 900, 800, 1400, 1400, 1300, 1400, 700,
             1000, 1200, 1200, 900, 1200, 1700, 1600, 1200, 1400, 1400, 1000,
             1200, 800, 1000, 1400, 1400, 1500, 1500, 1600, 1700, 900, 800, 1300,
             1000, 1600, 900, 1300, 1600, 1000)
n <- length(Widgets)
# Examples of potentially observable data sets:
Sim.Cost <- beta0 + beta1*Widgets + rnorm(n, 0, sigma)
head(cbind(Widgets, Sim.Cost))
```

## 统计代写|回归分析作业代写Regression Analysis代考|Biasedness of OLS Estimates When the Classical Model Is Wrong

Unbiasedness of the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ also implies unbiasedness of the OLS-estimated conditional mean value, $\hat{\mu}_x=\hat{\beta}_0+\hat{\beta}_1 x$, when the classical model is valid.
But when the classical model does not correspond to Nature's model, the OLS-estimated conditional mean value $\hat{\mu}_x=\hat{\beta}_0+\hat{\beta}_1 x$ can be biased. Motivated by the Product Complexity example of Figure 1.16, where $Y=$ Preference and $X=$ Complexity, suppose that Nature's mean function is not linear, but instead a curved function $\mathrm{E}(Y \mid X=x)=f(x)$. But you do not know Nature's ways, so you assume the classical model $Y \mid X=x \sim \mathrm{N}\left(\beta_0+\beta_1 x, \sigma^2\right)$. Then your OLS-estimated conditional mean value, $\hat{\mu}_x=\hat{\beta}_0+\hat{\beta}_1 x$, is biased.

Suppose in particular that Nature's data-generating process is $Y \mid X=x \sim \mathrm{N}\left(7+x-0.03 x^2, 10^2\right)$, where $X \sim \mathrm{N}\left(15,5^2\right)$. A scatterplot of $n=1,000$ data values from this process is shown in the left panel of Figure 3.2, with OLS line and LOESS fit superimposed. Notice that the LOESS fit looks more like the true, quadratic function than the incorrect linear function. The right panel of Figure 3.2 shows that the OLS estimates $\hat{\mu}_{15}=\hat{\beta}_0+\hat{\beta}_1(15)$, based on samples of size $n=1,000$, are biased (low) estimates of the true mean.
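This bias is easy to check by simulation. The following R sketch (seed and sample size are assumptions for illustration) generates data from Nature's quadratic process above, fits the incorrect linear model by OLS, and compares the fitted mean at $x=15$ to the true mean $7+15-0.03(15^2)=15.25$:

```r
set.seed(123)                             # assumed seed, for reproducibility only
n <- 1000
x <- rnorm(n, 15, 5)                      # X ~ N(15, 5^2)
y <- 7 + x - 0.03*x^2 + rnorm(n, 0, 10)   # Nature's quadratic mean function
fit <- lm(y ~ x)                          # the (incorrectly) assumed linear model
mu.hat.15 <- unname(predict(fit, newdata = data.frame(x = 15)))
true.mu.15 <- 7 + 15 - 0.03*15^2          # = 15.25
c(estimate = mu.hat.15, truth = true.mu.15)
```

Averaging `mu.hat.15` over many simulated samples reproduces the systematic (low) bias shown in the right panel of Figure 3.2.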
## 统计代写|回归分析作业代写Regression Analysis代考|Estimating Regression Models via Maximum Likelihood

Table 1.3 of Chapter 1 is reproduced here as Table 2.1. This table shows how you should view regression data.

Table 2.1 also shows you how to get the likelihood function and the associated maximum likelihood estimates: just plug the observed $y_i$ values into the conditional distributions and multiply them. Table 2.2 shows how you can obtain the likelihood function.
Now, under particular regression assumptions, each distribution $p\left(y \mid X=x_i\right)$ is a function of parameters $\boldsymbol{\theta}$, where $\boldsymbol{\theta}$ is a vector (or list) containing all the $\beta$'s and other parameters such as $\sigma$. Displaying this dependence explicitly, Table 2.2 becomes Table 2.3. By assuming conditional independence of the potentially observable $Y_i \mid X_i=x_i$ observations, you then can multiply the individual likelihoods to get the joint likelihood as follows:
$$L(\boldsymbol{\theta} \mid \text { data })=p\left(y_1 \mid X=x_1, \boldsymbol{\theta}\right) \times p\left(y_2 \mid X=x_2, \boldsymbol{\theta}\right) \times p\left(y_3 \mid X=x_3, \boldsymbol{\theta}\right) \times \ldots \times p\left(y_n \mid X=x_n, \boldsymbol{\theta}\right)$$
To estimate the parameters $\boldsymbol{\theta}$ via maximum likelihood, you must identify the specific $\hat{\boldsymbol{\theta}}$ that maximizes $L(\boldsymbol{\theta} \mid \text { data })$. The resulting values of the vector $\hat{\boldsymbol{\theta}}$ are called the maximum likelihood estimates.

## 统计代写|回归分析作业代写Regression Analysis代考|Maximum Likelihood in the Classical (Normally Distributed) Regression Model, Which Gives You Ordinary Least Squares

When you assume the classical regression model (see Section 1.7), where the distributions are all normal, homoscedastic, and linked to $x$ linearly, the probability distributions $p(y \mid X=x)$ that you assume to produce your $Y \mid X=x$ data are the $\mathrm{N}\left(\beta_0+\beta_1 x, \sigma^2\right)$ distributions. These distributions have mathematical form given by:
$$p(y \mid X=x, \theta)=p\left(y \mid X=x, \beta_0, \beta_1, \sigma\right)=\frac{1}{\sqrt{2 \pi} \sigma} \exp \left[-\frac{\left\{y-\left(\beta_0+\beta_1 x\right)\right\}^2}{2 \sigma^2}\right]$$
You need this mathematical form to construct the likelihood function. See Figure 1.7 (again!) for a graphic illustration of these models.
Notice that the parameter vector is $\theta=\left\{\beta_0, \beta_1, \sigma\right\}$; i.e., there are three unknown parameters of this model that you must estimate using maximum likelihood. Assuming conditional independence, the likelihood function is
$$
\begin{aligned}
L(\theta \mid \text { data })= & p\left(y_1 \mid X=x_1, \theta\right) \times p\left(y_2 \mid X=x_2, \theta\right) \times \ldots \times p\left(y_n \mid X=x_n, \theta\right) \\
= & \frac{1}{\sqrt{2 \pi} \sigma} \exp \left[-\frac{\left\{y_1-\left(\beta_0+\beta_1 x_1\right)\right\}^2}{2 \sigma^2}\right] \times \frac{1}{\sqrt{2 \pi} \sigma} \exp \left[-\frac{\left\{y_2-\left(\beta_0+\beta_1 x_2\right)\right\}^2}{2 \sigma^2}\right] \\
& \times \cdots \times \frac{1}{\sqrt{2 \pi} \sigma} \exp \left[-\frac{\left\{y_n-\left(\beta_0+\beta_1 x_n\right)\right\}^2}{2 \sigma^2}\right] \\
= & (2 \pi)^{-n / 2}\left(\sigma^2\right)^{-n / 2} \exp \left[-\frac{\sum_{i=1}^n\left\{y_i-\left(\beta_0+\beta_1 x_i\right)\right\}^2}{2 \sigma^2}\right]
\end{aligned}
$$
A technical note: In the random-$X$ case, the likelihood should also contain the multiplicative terms $p\left(x_1\right) \times p\left(x_2\right) \times \ldots \times p\left(x_n\right)$. But as long as $p(x)$ does not depend on the unknown parameters $\left(\beta_0, \beta_1, \sigma\right)$, these extra terms have no effect and are thus usually left off of the likelihood function.

Taking the logarithm of the likelihood function and simplifying, you get the log-likelihood function.
The log-likelihood function for the classical regression model is
$$L L(\theta \mid \text { data })=-\frac{n}{2} \ln (2 \pi)-\frac{n}{2} \ln \left(\sigma^2\right)-\frac{1}{2 \sigma^2} \sum_{i=1}^n\left\{y_i-\left(\beta_0+\beta_1 x_i\right)\right\}^2$$
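Maximizing the classical-model log-likelihood numerically reproduces the OLS estimates of $\beta_0$ and $\beta_1$, which is the sense in which maximum likelihood "gives you" ordinary least squares. A minimal R sketch (the simulated data, seed, and starting values are assumptions for illustration):

```r
set.seed(1)                              # assumed seed, illustration only
x <- runif(50, 0, 10)
y <- 2 + 3*x + rnorm(50, 0, 1.5)         # data from an assumed classical model

# Negative log-likelihood of the classical model (optim minimizes)
negLL <- function(par) {
  beta0 <- par[1]; beta1 <- par[2]
  sigma <- exp(par[3])                   # parameterize log(sigma) so sigma > 0
  -sum(dnorm(y, mean = beta0 + beta1*x, sd = sigma, log = TRUE))
}
mle <- optim(c(0, 0, 0), negLL)
ols <- coef(lm(y ~ x))
rbind(MLE = mle$par[1:2], OLS = unname(ols))   # the two rows agree closely
```

The agreement is up to numerical tolerance of the optimizer; the MLE of $\sigma$ differs slightly from the usual OLS estimate because the likelihood divides by $n$ rather than $n-2$.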


## 统计代写|线性回归分析代写linear regression analysis代考|Correct Functional Specification

The conditional mean function is $f(x)=\mathrm{E}(Y \mid X=x)$, the collection of means of the conditional distributions $p(y \mid x)$ (a different mean for every $x$ ), viewed as a function of $x$. The conditional mean function $f(x)$ is the deterministic portion of the more general regression model $Y \mid X=x \sim p(y \mid x)$.
Definition of the true conditional mean function
The true conditional mean function is given by $f(x)=\mathrm{E}(Y \mid X=x)$.
Note that the true conditional mean function is different from the true regression model, which was already given in Section 1.1, but is repeated here to make the distinction clear.
Definition of the true regression model
The true regression model is given by $Y \mid X=x \sim p(y \mid x)$.
When the distributions $p(y \mid x)$ are continuous, you can obtain the true conditional mean function from the true regression model via $\mathrm{E}(Y \mid X=x)=\int y p(y \mid x) d y$. However, you cannot obtain the true regression model from the true conditional mean function, for the simple reason that you cannot tell anything about a distribution from its mean. For example, even if you know that the mean of $Y$ is 10.0 (for any $X=x$ ), you still do not know anything about the distribution of $Y$ (normal, lognormal, Poisson, etc.), or even its variance.
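As a concrete (assumed) example: if $p(y \mid x)$ were the $\mathrm{N}(10, 2^2)$ density for some $x$, numerical integration recovers the conditional mean of 10, even though many very different distributions share that same mean:

```r
# E(Y | X = x) = integral of y * p(y|x) dy; check for an assumed N(10, 2^2) density
mu <- integrate(function(y) y * dnorm(y, mean = 10, sd = 2),
                lower = -Inf, upper = Inf)$value
mu   # approximately 10
```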
Whether you realize it or not, whenever you instruct the computer to analyze your regression data, you are making an assumption about the mean function. The correct functional specification assumption is simply the assumption that the mean function that you assume correctly specifies the true mean function of the data-generating process.
Correct functional specification assumption
The collection of true conditional means $f(x)=\mathrm{E}(Y \mid X=x)$ falls exactly on a function that is in the family of functions $f(x ; \boldsymbol{\beta})$ that you assume when you analyze your data, for some vector $\boldsymbol{\beta}$ of fixed, unknown parameters.

## 统计代写|线性回归分析代写linear regression analysis代考|Constant Variance (Homoscedasticity)

The correct functional specification assumption refers to the means of the conditional distributions $p(y \mid x)$, as a function of $x$. The constant variance assumption refers to the variances of the conditional distributions $p(y \mid x)$, as a function of $x$. Letting $\mu_x=\mathrm{E}(Y \mid X=x)$, these conditional variances are calculated as $\sigma_x^2=\operatorname{Var}(Y \mid X=x)=\int_{\text {all } y}\left(y-\mu_x\right)^2 p(y \mid x) d y$.
Like the conditional mean function, the conditional variance function can in reality have any functional form, linear, exponential, or any generic form whatsoever, provided that the function is non-negative. However, in the classical regression model, this function is assumed to have a very restrictive form: it is assumed to be a flat function that gives the same function value, regardless of the value of $x$.
The constant variance (homoscedasticity) assumption
The variances of the conditional distributions $p(y \mid x)$ are constant (i.e., they are all the same number, $\sigma^2$ ) for all specific values $X=x$.
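A violation of this assumption is easy to simulate; in the sketch below (all numbers are assumed for illustration), the conditional standard deviation grows with $x$, so the two conditional variances differ by a factor of roughly 100:

```r
set.seed(2)                                       # assumed seed, illustration only
x <- rep(c(1, 10), each = 5000)
y <- rnorm(length(x), mean = 2 + x, sd = 0.5*x)   # sd rises with x: heteroscedastic
tapply(y, x, var)                                 # conditional variances differ sharply
```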

Unlike the previous assumptions, which do not refer to any specific data set, this assumption refers to the data that you can collect, with observations $i=1,2, \ldots, n$.
Uncorrelated errors (or conditional independence) assumption
The potentially observable “error term,” $\varepsilon_i=Y_i-f\left(x_i ; \boldsymbol{\beta}\right)$, is uncorrelated with the potentially observable error $\varepsilon_j=Y_j-f\left(\boldsymbol{x}_j ; \boldsymbol{\beta}\right)$, for all sample pairs $(i, j), 1 \leq i, j \leq n$.
Alternatively, you can assume that the $Y_i$ are independent, given all the $X$ data. This alternative form is used in the construction of likelihood functions for maximum likelihood estimation and Bayesian analysis.



## 统计代写|线性回归分析代写linear regression analysis代考|Data Used in Regression Analysis

A typical data set used in simple regression analysis looks as shown in Table 1.2.
In Table 1.2, “Obs” refers to “observation number.” It is important to recognize what the observations refer to, because the regression model $p(y \mid x)$ describes variation in $Y$ at that level of observation, as discussed in the following examples. For example, with person-level data, each “Obs” is a different person, person 1, person 2, etc. The variation in $Y$ modeled by $p(y \mid x)$ refers to variation between people. For example, $p(y \mid X=27)$ might refer to the potentially observable variation in Assets among people who are 27 years old. With firm-level data, each “Obs” is a different firm, e.g. firm 1 is Pfizer, firm 2 is Microsoft, etc., and the variation in $Y$ modeled by $p(y \mid x)$ refers to variation between firms. For example, $p(y \mid X=2,000)$ might refer to the potentially observable variation in net worth among firms that have 2,000 employees.

As a reminder, it is essential for understanding this entire book to always remember the following two points:
- The regression model $p(y \mid x)$ does not come from the data.
- Rather, the regression model $p(y \mid x)$ is assumed to produce the data.
To illustrate the crucial concept that the regression model is a producer of data, consider the data set, available in R, called “EuStockMarkets,” having Daily Closing Prices of Major European Stock Indices, from 1991 to 1998. These data are time-series data, where the “Obs” are consecutive trading days.

```r
> data(EuStockMarkets)
> tail(EuStockMarkets[, 1:2])
            DAX SMI
[1855,] 5598.32
[1856,]
[1857,] 5285.78
[1858,] 5386.94
[1859,]
[1860,] 5473.72
```

## 统计代写|线性回归分析代写linear regression analysis代考|The Trashcan Experiment: Random- $X$ Versus Fixed- $X$

Here is something you can do (or at least imagine doing) with a group of people. You need a crumpled piece of paper (call it a “ball”), a tape measure, and a clean trashcan. Let each person attempt to throw the ball into the trashcan. The goal of the study is to identify the relationship between success at throwing the ball into the trash can $(Y)$, and distance from the trashcan $(X)$.

In a fixed-$X$ version of the experiment, place markers 5 feet, 10 feet, 15 feet, and 20 feet from the trashcan. Have all people attempt to throw the ball into the trashcan from all those distances. Here the $X$'s are fixed because they are known in advance. If you imagine doing another experiment just like this one (say, in a different class), then the $X$'s would be the same: 5, 10, 15, and 20.

In a random- $X$ version of the same experiment, you give a person the ball, then tell the person to pick a spot where he or she thinks the probability of making the shot might be around $50 \%$. Have the person attempt to throw the ball into the trashcan multiple times from that distance that he or she selected. Repeat for all people, letting each person pick where they want to stand. Here the X’s are random because they are not known in advance. If you imagine doing another experiment just like this one (say in a different class), then the X’s would be different because different people will choose different places to stand.

The fixed- $X$ version gives rise to experimental data. In experiments, the experimenter first sets the $X$ and then observes the $Y$. The random- $X$ version gives rise to observational data, where the X’s are simply observed, and not controlled by the researcher.
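The two designs can be mimicked in a quick simulation; the success-probability function and all numbers below are assumptions for illustration, not measurements:

```r
set.seed(3)                                   # assumed seed, illustration only
p.success <- function(d) plogis(3 - 0.3*d)    # assumed decline of success with distance

# Fixed-X: the distances are set in advance and would repeat in a replication
x.fixed  <- rep(c(5, 10, 15, 20), times = 10)
y.fixed  <- rbinom(length(x.fixed), 1, p.success(x.fixed))

# Random-X: each person picks a distance, so the X's vary across replications
x.random <- rnorm(40, mean = 10, sd = 2)
y.random <- rbinom(length(x.random), 1, p.success(x.random))
head(cbind(x.random, y.random))
```

Rerunning the fixed-$X$ block reuses the same four distances; rerunning the random-$X$ block draws new ones, which is exactly the distinction drawn above.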

Experimental data are the gold standard, because with observational data the observed effect of $X$ on $Y$ may not be a causal effect. With experimental data, the observed effect of $X$ on $Y$ can be more easily interpreted as a causal effect. Issues of causality are discussed in more detail in later chapters.



## 统计代写|线性回归分析代写linear regression analysis代考|What does an insignificant estimate tell you

The basic reason why the lack of evidence is not proof of non-existence is that there are alternative reasons for the lack of evidence. As mentioned earlier, when a jury in a criminal trial deliberates on whether a defendant is guilty, the jury members are not directed to conclude that the defendant has been proven innocent. Rather, they are supposed to determine whether there is significant evidence (beyond a reasonable doubt) that indicates the defendant was guilty. Thus, one reason why a defendant may be found “not guilty” is that there was not enough evidence.

The same concept is supposed to be used for statistical analysis. We are often testing whether a coefficient estimate is different from zero. Let’s say we are examining how class-size affects elementary-school students’ test scores, and let’s say that we find an insignificant estimate on the variable for class-size. In a study of mine (Arkes 2016), I list four general possible explanations for an insignificant estimate:

1. There is actually no effect of the explanatory variable on the outcome in the population.
2. There is an effect in one direction, but the model is unable to detect the effect due to a modeling problem (e.g., omitted-factors bias or measurement error – see Chapter 6) biasing the coefficient estimate in a direction opposite to the actual effect.
3. There is a small effect that cannot be detected with the available data due to inadequate power, i.e., not a large enough sample given the size of the effect.
4. There are varying effects in the population (or sample); some people's outcomes may be affected positively by the treatment, others' outcomes may be affected negatively, and others' outcomes may not be affected; and the estimated effect (which is the average effect) is insignificantly different from zero due to the positive and negative effects canceling each other out or being drowned out by those with zero effects.

So what can you conclude from the insignificant estimate on the class-size variable? You cannot conclude that class size does not affect test scores. Rather, as with the hot hand and the search for aliens, the interpretation should be: “There is no evidence that class-size affects test scores.”

Unfortunately, a very common mistake made in the research world is that the conclusion would be that there is no effect. This is important for issues such as whether there are side effects from pharmaceutical drugs or vaccines. The lack of evidence for a side effect does not mean that there is no effect, particularly if confidence intervals for the estimates include values that would represent meaningful side effects of the drug or vaccine.

All that said, there are sometimes cases in which an insignificant estimate has a $95 \%$ or $99 \%$ confidence interval with a fairly narrow range and outer boundary that, if the boundary were the true population parameter, it would be “practically insignificant” (see Section 5.3.9). If this were the case and the coefficient estimate were not subject to any meaningful bias, then it would be safe to conclude that “there is no meaningful effect.”
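The "narrow confidence interval" reasoning can be illustrated with simulated class-size data (the seed, effect size, and all other numbers below are assumptions for illustration only):

```r
set.seed(4)                                   # assumed seed, illustration only
class.size <- rnorm(500, mean = 25, sd = 4)
score <- rnorm(500, mean = 70 - 0.01*class.size, sd = 5)  # tiny assumed true effect
fit <- lm(score ~ class.size)
ci <- confint(fit, "class.size", level = 0.95)
ci
# If the whole interval sits inside a range judged practically negligible
# (say, +/- 0.5 test-score points per extra student), then "no meaningful
# effect" is a defensible conclusion; otherwise only "no evidence of an effect."
```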

## 统计代写|线性回归分析代写linear regression analysis代考|Statistical significance is not the goal

As we conduct research, our ultimate goal should be to advance knowledge. Our goal should not be to find a statistically-significant estimate. Advancing knowledge occurs by conducting objective and honest research.

A statistically insignificant coefficient estimate on a key-explanatory variable is just as valid as a significant coefficient estimate. The problem, many believe, is that an insignificant estimate may not provide as much information as a significant estimate. As described in the previous section, an insignificant estimate does not necessarily mean that there is no meaningful relationship, and so it could have multiple possible interpretations. If the appropriate confidence intervals for the coefficient were narrow (which would indicate adequate power), the methods were convincing for ruling out modeling problems, and the effects would likely go in just one direction, then it would be more reasonable to conclude that an insignificant estimate indicates there is no meaningful effect of the treatment. But meeting all those conditions is rare, and so there are multiple possible conclusions that cannot be distinguished.

As mentioned in the previous section, a statistically-significant estimate could also be subject to the various alternative interpretations that apply to insignificant estimates. But these are often ignored; to most people, they do not seem as important as long as there is statistical significance.

Statistical significance is valued more, perhaps, because it is evidence confirming, to some extent, the researcher’s theory and/or hypothesis. I conducted a quick, informal review of recent issues of leading economic, financial, and education journals. As it has been historically, almost all empirical studies had statistically-significant coefficient estimates on the key-explanatory variable. Indeed, I had a difficult time finding an insignificant estimate. This suggests that the pattern continues that journals are more likely to publish studies with significant estimates on the key-explanatory variables.

The result of statistical significance being valued more is that it incentivizes researchers to make statistical significance the goal of research. This can lead to p-hacking, which involves changing the set of control variables, the method (e.g., Ordinary Least Squares (OLS) vs. an alternative method, such as in Chapters 8 and 9), the sample requirements, or how the variables (including the outcome) are defined until one achieves a p-value below a major threshold. (I describe p-hacking in more detail in Section 13.3.)
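A stylized simulation shows why p-hacking inflates false positives. It rests on a simplifying (and generous) assumption that each alternative specification tried yields an approximately independent test of a true null hypothesis; the number of specifications and simulations below are arbitrary choices for illustration:

```python
import random

random.seed(0)

n_sims = 2000   # simulated "research projects"
n_specs = 10    # specifications tried per project
z_crit = 1.96   # 5% two-sided critical value

false_positives = 0
for _ in range(n_sims):
    # Under the null, each specification's z-statistic is standard normal.
    z_stats = [random.gauss(0, 1) for _ in range(n_specs)]
    if any(abs(z) > z_crit for z in z_stats):
        false_positives += 1  # at least one "significant" estimate found

# With 10 independent tries, the chance of at least one significant result
# is roughly 1 - 0.95**10 ≈ 0.40, far above the nominal 5% rate.
print(f"Share of projects with a significant result: {false_positives / n_sims:.2f}")
```

In practice, alternative specifications are correlated rather than independent, so the inflation is smaller than this sketch suggests, but the direction of the problem is the same.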

It is unfortunate that insignificant estimates are not accepted more. But, hopefully, this book will be another stepping stone for the movement to be more accepting of insignificant estimates. I personally trust insignificant estimates more than significant estimates (except for the hot hand in basketball).

The bottom line is that, as we conduct research, we should be guided by proper modeling strategies and not by what the results are saying.


## 统计代写|线性回归分析代写linear regression analysis代考|What does an insignificant estimate tell you

1. The explanatory variable truly has no effect on the outcome in the population.
2. There is an effect in one direction, but modeling problems (e.g., omitted-factors bias or measurement error; see Chapter 6) bias the coefficient estimate in the direction opposite to the actual effect, so the model is unable to detect the effect.
3. Due to inadequate power (i.e., a sample that is not large enough given the effect size), a small effect cannot be detected with the available data.
4. The effect varies across the population (or sample): some people's outcomes may be affected positively by the treatment, others negatively, and others not at all; because the positive and negative effects cancel each other out or are drowned out by zero effects, the estimated (average) effect is not significantly different from zero.
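The last interpretation above, offsetting positive and negative effects, can be illustrated with a small simulation (the effect sizes, noise level, and sample size are arbitrary illustrative choices):

```python
import random
import statistics

random.seed(1)

# Illustrative sketch: the treatment helps half the population and hurts
# the other half by the same amount, so the average effect is near zero
# even though every individual effect is large.
n = 10000
effects = [+2.0 if i % 2 == 0 else -2.0 for i in range(n)]  # heterogeneous effects
outcomes = [e + random.gauss(0, 1) for e in effects]        # effect plus noise

avg = statistics.mean(outcomes)
print(f"Average treatment effect estimate: {avg:.3f}")  # close to zero
```

An estimate near zero here would likely be statistically insignificant, yet "no effect for anyone" would be exactly the wrong conclusion.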
