## 数学代写|基础数据分析代写Elementary data Analysis代考|STAT1350

statistics-lab™ 为您的留学生涯保驾护航 在代写基础数据分析Elementary data Analysis方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写基础数据分析Elementary data Analysis代写方面经验极为丰富，各种基础数据分析Elementary data Analysis相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## Estimating the Regression Function

We want to find the regression function $\mu(x)=\mathbb{E}[Y \mid X=x]$, and what we’ve got is a big set of training examples, of pairs $\left(x_1, y_1\right),\left(x_2, y_2\right), \ldots\left(x_n, y_n\right)$. What should we do?
If $X$ takes on only a finite set of values, then a simple strategy is to use the conditional sample means:
$$\widehat{\mu}(x)=\frac{1}{\#\left\{i: x_i=x\right\}} \sum_{i: x_i=x} y_i$$
By the same kind of law-of-large-numbers reasoning as before, we can be confident that $\widehat{\mu}(x) \rightarrow \mathbb{E}[Y \mid X=x]$.
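
The conditional sample mean can be sketched in a few lines of Python. This is only an illustration: the function name `conditional_sample_means` and the toy data are assumptions made here, not part of the original text.

```python
from collections import defaultdict

def conditional_sample_means(xs, ys):
    """Return {x: mean of the y_i whose x_i equals x}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in zip(xs, ys):
        sums[x] += y
        counts[x] += 1  # running value of #{i : x_i = x}
    return {x: sums[x] / counts[x] for x in sums}

# Hypothetical toy data: X takes only the values 0, 1 and 2.
xs = [0, 1, 1, 2, 0, 2, 2]
ys = [1.0, 2.0, 4.0, 5.0, 3.0, 7.0, 9.0]

mu_hat = conditional_sample_means(xs, ys)
print(mu_hat)
```

The estimate is defined only at values of $x$ that actually occur in the sample, which is why the approach needs $X$ to take values in a finite set.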

Unfortunately, this only works when $X$ takes values in a finite set. If $X$ is continuous, then in general the probability of our getting a sample at any particular value is zero, as is the probability of getting multiple samples at exactly the same value of $x$. This is a basic issue with estimating any kind of function from data: the function will always be undersampled, and we need to fill in between the values we see. We also need to somehow take into account the fact that each $y_i$ is a sample from the conditional distribution of $Y \mid X=x_i$, and generally not equal to $\mathbb{E}\left[Y \mid X=x_i\right]$. So any kind of function estimation is going to involve interpolation, extrapolation, and smoothing.

Different methods of estimating the regression function – different regression methods, for short – involve different choices about how we interpolate, extrapolate and smooth. This involves our making a choice about how to approximate $\mu(x)$ by a limited class of functions which we know (or at least hope) we can estimate. There is no guarantee that our choice leads to a good approximation in the case at hand, though it is sometimes possible to say that the approximation error will shrink as we get more and more data. This is an extremely important topic and deserves an extended discussion, coming next.

## The Bias-Variance Tradeoff

Suppose that the true regression function is $\mu(x)$, but we use the function $\hat{\mu}$ to make our predictions. Let’s look at the mean squared error at $X=x$ in a slightly different way than before, which will make it clearer what happens when we can’t use $\mu$ to make predictions. We’ll begin by expanding $(Y-\widehat{\mu}(x))^2$, since the MSE at $x$ is just the expectation of this.
$$\begin{aligned} &(Y-\widehat{\mu}(x))^2 \\ &\quad=(Y-\mu(x)+\mu(x)-\widehat{\mu}(x))^2 \\ &\quad=(Y-\mu(x))^2+2(Y-\mu(x))(\mu(x)-\widehat{\mu}(x))+(\mu(x)-\widehat{\mu}(x))^2 \end{aligned}$$
Eq. 1.17 tells us that $Y-\mu(X)=\eta$, a random variable which has expectation zero (and is uncorrelated with $X$). When we take the expectation of Eq. 1.20, nothing happens to the last term (since it doesn’t involve any random quantities); the middle term goes to zero (because $\mathbb{E}[Y-\mu(X)]=\mathbb{E}[\eta]=0$), and the first term becomes the variance of $\eta$. This depends on $x$, in general, so let’s call it $\sigma_x^2$. We have
$$\operatorname{MSE}(\widehat{\mu}(x))=\sigma_x^2+(\mu(x)-\widehat{\mu}(x))^2$$
The $\sigma_x^2$ term doesn’t depend on our prediction function, just on how hard it is, intrinsically, to predict $Y$ at $X=x$. The second term, though, is the extra error we get from not knowing $\mu$. (Unsurprisingly, ignorance of $\mu$ cannot improve our predictions.) This is our first bias-variance decomposition: the total MSE at $x$ is decomposed into the square of a bias $\mu(x)-\widehat{\mu}(x)$, the amount by which our predictions are systematically off, and a variance $\sigma_x^2$, the unpredictable, “statistical” fluctuation around even the best prediction.
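
A small simulation can make the decomposition concrete. The sketch below fixes a single point $x$ and assumes, purely for illustration, that $\mu(x)=2.0$, that the noise is Gaussian with standard deviation $0.5$, and that our fixed prediction is $\widehat{\mu}(x)=2.3$; none of these numbers come from the text.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable

mu_x = 2.0       # assumed true regression value mu(x) at the chosen x
sigma_x = 0.5    # assumed noise standard deviation at that x
mu_hat_x = 2.3   # a fixed, deliberately wrong prediction at x

n = 200_000
# Draw Y given X = x as mu(x) + eta, with eta Gaussian of mean zero,
# and record the squared error of the fixed prediction each time.
sq_errors = [(rng.gauss(mu_x, sigma_x) - mu_hat_x) ** 2 for _ in range(n)]
mse_simulated = sum(sq_errors) / n

# Variance term plus squared bias, as in the decomposition above.
mse_formula = sigma_x ** 2 + (mu_x - mu_hat_x) ** 2

print(f"simulated MSE:      {mse_simulated:.4f}")
print(f"sigma_x^2 + bias^2: {mse_formula:.4f}")
```

With these assumed values the formula gives $0.25+0.09=0.34$, and the simulated average should land close to that, up to Monte Carlo error.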

All of the above assumes that $\hat{\mu}$ is a single fixed function. In practice, of course, $\widehat{\mu}$ is something we estimate from earlier data. But if those data are random, the exact regression function we get is random too; let’s call this random function $\hat{M}_n$, where the subscript reminds us of the finite amount of data we used to estimate it. What we have analyzed is really $\operatorname{MSE}\left(\widehat{M}_n(x) \mid \hat{M}_n=\widehat{\mu}\right)$, the mean squared error conditional on a particular estimated regression function.

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## The Regression Function

Of course, it’s not very useful to predict just one number for a variable. Typically, we have lots of variables in our data, and we believe they are related somehow. For example, suppose that we have data on two variables, $X$ and $Y$, which might look like Figure 1.1. The feature $Y$ is what we are trying to predict, a.k.a. the dependent variable or output or response, and $X$ is the predictor or independent variable or covariate or input. $Y$ might be something like the profitability of a customer and $X$ their credit rating, or, if you want a less mercenary example, $Y$ could be some measure of improvement in blood cholesterol and $X$ the dose taken of a drug. Typically we won’t have just one input feature $X$ but rather many of them, but that gets harder to draw and doesn’t change the points of principle.

Figure 1.2 shows the same data as Figure 1.1, only with the sample mean added on. This clearly tells us something about the data, but also it seems like we should be able to do better – to reduce the average error – by using $X$, rather than by ignoring it.

Let’s say that we want our prediction to be a function of $X$, namely $f(X)$. What should that function be, if we still use mean squared error? We can work this out by using the law of total expectation, i.e., the fact that $\mathbb{E}[U]=\mathbb{E}[\mathbb{E}[U \mid V]]$ for any random variables $U$ and $V$.
$$\begin{aligned} \operatorname{MSE}(f) &=\mathbb{E}\left[(Y-f(X))^2\right] \\ &=\mathbb{E}\left[\mathbb{E}\left[(Y-f(X))^2 \mid X\right]\right] \\ &=\mathbb{E}\left[\mathbb{V}[Y \mid X]+(\mathbb{E}[Y-f(X) \mid X])^2\right] \end{aligned}$$
When we want to minimize this, the first term inside the expectation doesn’t depend on our prediction, and the second term looks just like our previous optimization only with all expectations conditional on $X$, so for our optimal function $\mu(x)$ we get
$$\mu(x)=\mathbb{E}[Y \mid X=x]$$
In other words, the (mean-squared) optimal conditional prediction is just the conditional expected value. The function $\mu(x)$ is called the regression function. This is what we would like to know when we want to predict $Y$.
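
As a rough numerical check of this optimality claim, the sketch below simulates data from an assumed model, $\mu(x)=x^2$ with $X$ uniform on $[-1,1]$ and Gaussian noise of standard deviation $0.1$, and compares the MSE of three predictors. The model and the two competing predictors are illustrative choices, not taken from the text.

```python
import random

rng = random.Random(1)   # fixed seed so the run is repeatable

def mu(x):
    return x ** 2   # assumed true regression function for this toy example

n = 100_000
xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
ys = [mu(x) + rng.gauss(0.0, 0.1) for x in xs]

def mse(f):
    """Average squared error of the predictor f over the simulated pairs."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / n

y_bar = sum(ys) / n
mse_regression = mse(mu)                    # predict with E[Y | X = x]
mse_constant = mse(lambda x: y_bar)         # ignore X, predict the overall mean
mse_line = mse(lambda x: y_bar + 0.5 * x)   # an arbitrary other function of X

print(mse_regression, mse_constant, mse_line)
```

Under these assumptions the regression function's MSE should sit near the noise variance, $0.01$, while both competitors do noticeably worse.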

## Some Disclaimers

It’s important to be clear on what is and is not being assumed here. Talking about $X$ as the “independent variable” and $Y$ as the “dependent” one suggests a causal model, which we might write
$$Y \leftarrow \mu(X)+\epsilon$$
where the direction of the arrow, $\leftarrow$, indicates the flow from causes to effects, and $\epsilon$ is some noise variable. If the gods of inference are very, very kind, then $\epsilon$ would have a fixed distribution, independent of $X$, and we could without loss of generality take it to have mean zero. (“Without loss of generality” because if it has a non-zero mean, we can incorporate that into $\mu(X)$ as an additive constant.) However, no such assumption is required to get Eq. 1.15. It works when predicting effects from causes, or the other way around when predicting (or “retrodicting”) causes from effects, or indeed when there is no causal relationship whatsoever between $X$ and $Y$. It is always true that
$$Y \mid X=\mu(X)+\eta(X)$$
where $\eta(X)$ is a random variable with expected value 0, but, as the notation indicates, the distribution of this variable generally depends on $X$.

It’s also important to be clear that finding the regression function to be a constant, $\mu(x)=\mu_0$ for all $x$, does not mean that $X$ and $Y$ are statistically independent. If they are independent, then the regression function is a constant, but turning this around is the logical fallacy of “affirming the consequent”.
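
A tiny exact example shows how a constant regression function can coexist with dependence. The joint distribution below is a hypothetical construction for illustration: $X$ is 1 or 3 with equal probability, and, given $X$, $Y$ is $+X$ or $-X$ with equal probability.

```python
from fractions import Fraction

# Hypothetical joint distribution: X is 1 or 3 with equal probability,
# and Y = +X or -X with equal probability given X.
half = Fraction(1, 2)
joint = {(x, s * x): half * half for x in (1, 3) for s in (1, -1)}

def p_x(x):
    """Marginal probability that X equals x."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def p_y(y):
    """Marginal probability that Y equals y."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_mean_y(x):
    """Exact conditional expectation E[Y | X = x]."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / p_x(x)

regression = {x: cond_mean_y(x) for x in (1, 3)}

# Independence would require P(X=x, Y=y) = P(X=x) P(Y=y) for every pair;
# a single failure on the support is enough to rule it out.
independent = all(p == p_x(x) * p_y(y) for (x, y), p in joint.items())

print(regression, independent)
```

Here $\mathbb{E}[Y \mid X=x]=0$ for both values of $x$, yet knowing $X$ pins down $|Y|$ exactly, so the two variables are far from independent.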

## Guessing the Value of a Random Variable

We have a quantitative, numerical variable, which we’ll imaginatively call $Y$. We’ll suppose that it’s a random variable, and try to predict it by guessing a single value for it. (Other kinds of predictions are possible – we might guess whether $Y$ will fall within certain limits, or the probability that it does so, or even the whole probability distribution of $Y$. But some lessons we’ll learn here will apply to these other kinds of predictions as well.) What is the best value to guess? More formally, what is the optimal point forecast for $Y$?

To answer this question, we need to pick a function to be optimized, which should measure how good our guesses are, or equivalently how bad they are, i.e., how big an error we’re making. A reasonable starting point is the mean squared error:
$$\operatorname{MSE}(m) \equiv \mathbb{E}\left[(Y-m)^2\right]$$

So we’d like to find the value $\mu$ where $\operatorname{MSE}(m)$ is smallest.
$$\begin{aligned} \operatorname{MSE}(m) &=\mathbb{E}\left[(Y-m)^2\right] \\ &=(\mathbb{E}[Y-m])^2+\mathbb{V}[Y-m] \\ &=(\mathbb{E}[Y-m])^2+\mathbb{V}[Y] \\ &=(\mathbb{E}[Y]-m)^2+\mathbb{V}[Y] \\ \frac{d \mathrm{MSE}}{d m} &=-2(\mathbb{E}[Y]-m)+0 \\ \left.\frac{d \mathrm{MSE}}{d m}\right|_{m=\mu} &=0 \\ 2(\mathbb{E}[Y]-\mu) &=0 \\ \mu &=\mathbb{E}[Y] \end{aligned}$$
So, if we gauge the quality of our prediction by mean-squared error, the best prediction to make is the expected value.
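
The same conclusion can be checked numerically on a toy distribution. In this sketch the four values in `ys` are hypothetical and are treated as equally likely outcomes of $Y$, so expectations become plain averages; the grid of candidate guesses is likewise an arbitrary choice.

```python
ys = [1.0, 2.0, 3.0, 6.0]   # hypothetical values of Y, treated as equally likely

def mse(m):
    """Mean squared error of the single guess m under the toy distribution."""
    return sum((y - m) ** 2 for y in ys) / len(ys)

grid = [i * 0.5 for i in range(13)]   # candidate guesses 0.0, 0.5, ..., 6.0
best_m = min(grid, key=mse)           # guess with the smallest MSE on the grid
mean_y = sum(ys) / len(ys)            # expected value under the toy distribution

print(best_m, mean_y, mse(best_m))
```

The grid minimizer coincides with the mean, and the MSE there equals the variance of the toy distribution, matching the decomposition $\operatorname{MSE}(m)=(\mathbb{E}[Y]-m)^2+\mathbb{V}[Y]$.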

## Estimating the Expected Value

Of course, to make the prediction $\mathbb{E}[Y]$ we would have to know the expected value of $Y$. Typically, we do not. However, if we have sampled values, $y_1, y_2, \ldots y_n$, we can estimate the expectation from the sample mean:
$$\widehat{\mu} \equiv \frac{1}{n} \sum_{i=1}^n y_i$$
If the samples are independent and identically distributed (IID), then the law of large numbers tells us that
$$\widehat{\mu} \rightarrow \mathbb{E}[Y]=\mu$$
and algebra with variances (Exercise 1) tells us something about how fast the convergence is, namely that the squared error will typically be about $\mathbb{V}[Y] / n$.
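
The $\mathbb{V}[Y] / n$ rate can be seen in a quick simulation. The sketch assumes, for illustration only, that $Y$ is uniform on $[0,1]$ (so $\mathbb{E}[Y]=1/2$ and $\mathbb{V}[Y]=1/12$) and uses an arbitrary sample size and number of repetitions.

```python
import random

rng = random.Random(2)   # fixed seed so the run is repeatable

n = 25          # sample size used for each estimate
reps = 10_000   # number of independent samples drawn
true_mean = 0.5         # E[Y] for Y ~ Uniform(0, 1)
true_var = 1.0 / 12.0   # V[Y] for Y ~ Uniform(0, 1)

total = 0.0
for _ in range(reps):
    sample_mean = sum(rng.random() for _ in range(n)) / n
    total += (sample_mean - true_mean) ** 2   # squared error of this estimate
avg_sq_error = total / reps

print(f"average squared error: {avg_sq_error:.6f}")
print(f"V[Y] / n:              {true_var / n:.6f}")
```

The two printed numbers should agree to within a few percent; changing `n` and rerunning shows the squared error shrinking roughly in proportion to $1/n$.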
