## Interpretations of uncertainty

## General remarks

We can now consider some issues involved in formulating and comparing the different approaches.

In some respects the Bayesian formulation is the simpler and in other respects the more difficult. Once a likelihood and a prior are specified to a reasonable approximation all problems are, in principle at least, straightforward. The resulting posterior distribution can be manipulated in accordance with the ordinary laws of probability. The difficulties centre on the concepts underlying the definition of the probabilities involved and then on the numerical specification of the prior to sufficient accuracy.

Sometimes, as in certain genetical problems, it is reasonable to think of $\theta$ as generated by a stochastic mechanism. There is no dispute that the Bayesian approach is at least part of a reasonable formulation and solution in such situations. In other cases, to use the formulation in a literal way we have to regard probability as measuring uncertainty in a sense not necessarily directly linked to frequencies. We return to this issue later. Another possible justification of some Bayesian methods is that they provide an algorithm for extracting from the likelihood some procedures whose fundamental strength comes from frequentist considerations. This can be regarded, in particular, as supporting a broad class of procedures, known as shrinkage methods, including ridge regression.
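As a hedged illustration of how a Bayesian calculation can reproduce a shrinkage method, consider a standard textbook setting that is not part of the discussion above: a linear model $Y=X\beta+\varepsilon$ with $\varepsilon \sim N(0, \sigma^{2} I)$ and a prior $\beta \sim N(0, \tau^{2} I)$, with $\sigma^{2}$ and $\tau^{2}$ assumed known. The symbols $X$, $\beta$, $\tau^{2}$ and $\kappa$ are introduced here only for this sketch. The posterior density of $\beta$ is then normal, and its mean (which is also its mode) is

$$\hat{\beta}=\left(X^{T} X+\kappa I\right)^{-1} X^{T} y, \quad \kappa=\sigma^{2} / \tau^{2},$$

which is the ridge regression estimate with penalty $\kappa$. A diffuse prior ($\tau^{2} \rightarrow \infty$) gives $\kappa \rightarrow 0$ and recovers least squares, whereas a concentrated prior shrinks the estimate strongly towards zero.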

The emphasis in this book is quite often on the close relation between answers possible from different approaches. This does not imply that the different views never conflict. Also the differences of interpretation between different numerically similar answers may be conceptually important.

## Broad roles of probability

A recurring theme in the discussion so far has concerned the broad distinction between the frequentist and the Bayesian formalization and meaning of probability. Kolmogorov’s axiomatic formulation of the theory of probability largely decoupled the issue of meaning from the mathematical aspects; his axioms were, however, firmly rooted in a frequentist view, although towards the end of his life he became concerned with a different interpretation based on complexity. But in the present context meaning is crucial.

There are two ways in which probability may be used in statistical discussions. The first is phenomenological, to describe in mathematical form the empirical regularities that characterize systems containing haphazard variation. This typically underlies the formulation of a probability model for the data, in particular leading to the unknown parameters which are regarded as a focus of interest. The probability of an event $\mathcal{E}$ is an idealized limiting proportion of times in which $\mathcal{E}$ occurs in a large number of repeat observations on the system under the same conditions. In some situations the notion of a large number of repetitions can be reasonably closely realized; in others, as for example with economic time series, the notion is a more abstract construction. In both cases the working assumption is that the parameters describe features of the underlying data-generating process divorced from special essentially accidental features of the data under analysis.

That first phenomenological notion is concerned with describing variability. The second role of probability is in connection with uncertainty and is thus epistemological. In the frequentist theory we adapt the frequency-based view of probability, using it only indirectly to calibrate the notions of confidence intervals and significance tests. In most applications of the Bayesian view we need an extended notion of probability as measuring the uncertainty of $\mathcal{E}$ given $\mathcal{F}$, where now $\mathcal{E}$, for example, is not necessarily the outcome of a random system, but may be a hypothesis or indeed any feature which is unknown to the investigator. In statistical applications $\mathcal{E}$ is typically some statement about the unknown parameter $\theta$ or more specifically about the parameter of interest $\psi$. The present discussion is largely confined to such situations. The issue of whether a single number could usefully encapsulate uncertainty about the correctness of, say, the Fundamental Theory underlying particle physics is far outside the scope of the present discussion. It could, perhaps, be applied to a more specific question such as a prediction of the Fundamental Theory: will the Higgs boson have been discovered by 2010?

One extended notion of probability aims, in particular, to address the point that in interpretation of data there are often sources of uncertainty additional to those arising from narrow-sense statistical variability. In the frequentist approach these aspects, such as possible systematic errors of measurement, are addressed qualitatively, usually by formal or informal sensitivity analysis, rather than incorporated into a probability assessment.

## Frequentist interpretation of upper limits

First we consider the frequentist interpretation of upper limits obtained, for example, from a suitable pivot. We take the simplest example, Example 1.1, namely the normal mean when the variance is known, but the considerations are fairly general. The upper limit
$$\bar{y}+k_{c}^{*} \sigma_{0} / \sqrt{n},$$
derived here from the probability statement
$$P\left(\mu<\bar{Y}+k_{c}^{*} \sigma_{0} / \sqrt{n}\right)=1-c,$$
is a particular instance of a hypothetical long run of statements a proportion $1-c$ of which will be true, always, of course, assuming our model is sound. We can, at least in principle, make such a statement for each $c$ and thereby generate a collection of statements, sometimes called a confidence distribution. There is no restriction to a single $c$, so long as some compatibility requirements hold.

Because this has the formal properties of a distribution for $\mu$ it was called by R. A. Fisher the fiducial distribution and sometimes the fiducial probability distribution. A crucial question is whether this distribution can be interpreted and manipulated like an ordinary probability distribution. The position is:

• a single set of limits for $\mu$ from some data can in some respects be considered just like a probability statement for $\mu$;
• such probability statements cannot in general be combined or manipulated by the laws of probability to evaluate, for example, the chance that $\mu$ exceeds some given constant, for example zero. This is clearly illegitimate in the present context.

That is, as a single statement a $1-c$ upper limit has the evidential force of a statement of a unique event within a probability system. But the rules for manipulating probabilities in general do not apply. The limits are, of course, directly based on probability calculations.
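The long-run reading of the upper limit can be checked by simulation. The following is a minimal sketch, assuming the constant in the upper limit is the upper $c$ point of the standard normal distribution; the function names, the true mean, and the sample size are hypothetical choices made only for illustration.

```python
import random
from statistics import NormalDist, fmean

def upper_limit(sample, sigma0, c):
    """1 - c upper limit for a normal mean when the standard deviation sigma0 is known."""
    k_star = NormalDist().inv_cdf(1.0 - c)  # assumed: upper c point of N(0, 1)
    return fmean(sample) + k_star * sigma0 / len(sample) ** 0.5

def coverage(mu, sigma0, n, c, n_rep=20000, seed=1):
    """Proportion of repetitions in which the statement mu < upper limit is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        sample = [rng.gauss(mu, sigma0) for _ in range(n)]
        if mu < upper_limit(sample, sigma0, c):
            hits += 1
    return hits / n_rep

if __name__ == "__main__":
    for c in (0.01, 0.05, 0.25):
        print(c, coverage(mu=3.0, sigma0=2.0, n=10, c=c))
```

Each estimated proportion should be close to $1-c$. The simulation calibrates the procedure over hypothetical repetitions; it does not turn the limits for a single data set into a probability distribution for $\mu$ that can be manipulated freely.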

Nevertheless the treatment of the confidence interval statement about the parameter as if it is in some respects like a probability statement contains the important insights that, in inference for the normal mean, the unknown parameter is more likely to be near the centre of the interval than near the end-points and that, provided the model is reasonably appropriate, if the mean is outside the interval it is not likely to be far outside.

A more emphatic demonstration that the sets of upper limits defined in this way do not determine a probability distribution is to show that in general there is an inconsistency if such a formal distribution determined in one stage of analysis is used as input into a second stage of probability calculation. We shall not give details; see Note 5.2.

The following example illustrates in very simple form the care needed in passing from assumptions about the data, given the model, to inference about the model, given the data, and in particular the false conclusions that can follow from treating such statements as probability statements.

## Some more general frequentist developments

## Exponential family problems

Some limited but important classes of problems have a formally exact solution within the frequentist approach. We begin with two situations involving exponential family distributions. For simplicity we suppose that the parameter $\psi$ of interest is one-dimensional.

We start with a full exponential family model in which $\psi$ is a linear combination of components of the canonical parameter. After linear transformation of the parameter and canonical statistic we can write the density of the observations in the form
$$m(y) \exp \left\{s_{\psi} \psi+s_{\lambda}^{T} \lambda-k(\psi, \lambda)\right\},$$
where $\left(s_{\psi}, s_{\lambda}\right)$ is a partitioning of the sufficient statistic corresponding to the partitioning of the parameter. For inference about $\psi$ we prefer a pivotal distribution not depending on $\lambda$. It can be shown via the mathematical property of completeness, essentially that the parameter space is rich enough, that separation from $\lambda$ can be achieved only by working with the conditional distribution of the data given $S_{\lambda}=s_{\lambda}$. That is, we evaluate the conditional distribution of $S_{\psi}$ given $S_{\lambda}=s_{\lambda}$ and use the resulting distribution to find which values of $\psi$ are consistent with the data at various levels.

Many of the standard procedures dealing with simple problems about Poisson, binomial, gamma and normal distributions can be found by this route. We illustrate the arguments sketched above by a number of problems connected with binomial distributions, noting that the canonical parameter of a binomial distribution with probability $\pi$ is $\phi=\log \{\pi /(1-\pi)\}$. This has some advantages in interpretation and also disadvantages. When $\pi$ is small, the parameter is essentially equivalent to $\log \pi$, meaning that differences in canonical parameter are equivalent to ratios of probabilities.
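The remark about small $\pi$ can be seen numerically. The sketch below, with hypothetical probabilities chosen only for illustration, compares the difference of two canonical parameters with the log ratio of the corresponding probabilities.

```python
import math

def logit(p):
    """Canonical parameter of the binomial: log{p / (1 - p)}."""
    return math.log(p / (1.0 - p))

def compare(p1, p2):
    """Return (difference of canonical parameters, log ratio of probabilities)."""
    return logit(p1) - logit(p2), math.log(p1 / p2)

if __name__ == "__main__":
    for p1, p2 in [(0.002, 0.001), (0.02, 0.01), (0.4, 0.2)]:
        diff, log_ratio = compare(p1, p2)
        print(f"pi1={p1}, pi2={p2}: logit diff={diff:.4f}, log ratio={log_ratio:.4f}")
```

The two quantities differ by $\log \{(1-\pi_{2}) /(1-\pi_{1})\}$, which is negligible when both probabilities are small and appreciable otherwise.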

## Transformation models

A different kind of reduction is possible for models such as the location model which preserve a special structure under a family of transformations.

Example 4.10. Location model. We return to Example 1.8, the location model. The likelihood is $\prod g\left(y_{k}-\mu\right)$, which can be rearranged in terms of the order statistics $y_{(l)}$, i.e., the observed values arranged in increasing order of magnitude. These in general form the sufficient statistic for the model, minimal except for the normal distribution. Now the random variables corresponding to the differences between order statistics, specified, for example, as
$$a=\left(y_{(2)}-y_{(1)}, \ldots, y_{(n)}-y_{(1)}\right)=\left(a_{2}, \ldots, a_{n}\right),$$
have a distribution not depending on $\mu$ and hence are ancillary; inference about $\mu$ is thus based on the conditional distribution of, say, $y_{(1)}$ given $A=a$. The choice of $y_{(1)}$ is arbitrary because any estimate of location, for example the mean, is $y_{(1)}$ plus a function of $a$, and the latter is not random after conditioning. Now the marginal density of $A$ is
$$\int g\left(y_{(1)}-\mu\right) g\left(y_{(1)}+a_{2}-\mu\right) \cdots g\left(y_{(1)}+a_{n}-\mu\right) d y_{(1)} .$$
The integral with respect to $y_{(1)}$ is equivalent to one integrated with respect to $\mu$ so that, on reexpressing the integrand in terms of the likelihood for the unordered variables, the density of $A$ is the function necessary to normalize the likelihood to integrate to one with respect to $\mu$. That is, the conditional density of $Y_{(1)}$, or of any other measure of location, is determined by normalizing the likelihood function and regarding it as a function of $y_{(1)}$ for given $a$. This implies that $p$-values and confidence limits for $\mu$ result from normalizing the likelihood and treating it like a distribution of $\mu$.

This is an exceptional situation in frequentist theory. In general, likelihood is a point function of its argument and integrating it over sets of parameter values is not statistically meaningful. In a Bayesian formulation it would correspond to combination with a uniform prior but no notion of a prior distribution is involved in the present argument.
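The recipe for the location model, normalize the likelihood in $\mu$ and read off limits, can be sketched numerically. The code below is an illustration under stated assumptions: the integral is replaced by a Riemann sum on an equally spaced grid, and the function names, the grid, and the Cauchy and normal choices of $g$ are hypothetical.

```python
import math

def normalized_likelihood(y, log_g, grid):
    """Normalize prod g(y_k - mu) over an equally spaced grid of mu values."""
    log_lik = [sum(log_g(yk - mu) for yk in y) for mu in grid]
    peak = max(log_lik)  # subtract the peak for numerical stability
    lik = [math.exp(v - peak) for v in log_lik]
    step = grid[1] - grid[0]
    total = sum(lik) * step  # Riemann-sum approximation to the integral over mu
    return [v / total for v in lik]

def limit(grid, density, level):
    """Smallest grid value of mu at which the cumulated density reaches `level`."""
    step = grid[1] - grid[0]
    cum = 0.0
    for mu, d in zip(grid, density):
        cum += d * step
        if cum >= level:
            return mu
    return grid[-1]

def log_cauchy(z):
    """Log density of the standard Cauchy distribution."""
    return -math.log(math.pi * (1.0 + z * z))

def log_normal(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

if __name__ == "__main__":
    y = [0.3, -1.2, 0.8, 2.1, 0.5]
    grid = [-10.0 + 0.001 * i for i in range(20001)]
    for name, log_g in [("normal", log_normal), ("Cauchy", log_cauchy)]:
        dens = normalized_likelihood(y, log_g, grid)
        print(name, limit(grid, dens, 0.025), limit(grid, dens, 0.975))
```

With `log_normal` the limits should agree, up to grid error, with the usual $\bar{y} \pm 1.96 / \sqrt{n}$, which gives a check on the sketch. With `log_cauchy` no such closed form is available and the normalized likelihood is the practical route.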

The ancillary statistics allow testing conformity with the model. For example, to check on the functional form of $g(.)$ it would be best to work with the differences of the ordered values from the mean, $\left(y_{(l)}-\bar{y}\right)$ for $l=1, \ldots, n$. These are functions of $a$ as previously defined. A test statistic could be set up informally sensitive to departures in distributional form from those implied by $g(.)$. Or $\left(y_{(l)}-\bar{y}\right)$ could be plotted against $G^{-1}\{(l-1 / 2) /(n+1)\}$, where $G(.)$ is the cumulative distribution function corresponding to the density $g(.)$.
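The plotting coordinates described above can be computed directly. This is a minimal sketch that uses the plotting positions exactly as written in the text; the function name is hypothetical, and the standard normal inverse distribution function from the Python standard library stands in for $G^{-1}$ purely as an example.

```python
from statistics import NormalDist, fmean

def probability_plot_points(y, inv_cdf):
    """Pair G^{-1}{(l - 1/2)/(n + 1)} with y_(l) - ybar for l = 1, ..., n."""
    n = len(y)
    ybar = fmean(y)
    ordered = sorted(y)
    return [
        (inv_cdf((l - 0.5) / (n + 1)), ordered[l - 1] - ybar)
        for l in range(1, n + 1)
    ]

if __name__ == "__main__":
    sample = [2.3, 0.1, -0.7, 1.4, 0.9, -1.8, 0.2]
    for x, resid in probability_plot_points(sample, NormalDist().inv_cdf):
        print(f"{x:8.3f} {resid:8.3f}")
```

If the assumed form of $g(.)$ is adequate, the points should lie roughly on a straight line; systematic curvature suggests a departure in distributional form.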

## Some further Bayesian examples

In principle the prior density in a Bayesian analysis is an insertion of additional information and the form of the prior should be dictated by the nature of that evidence. It is useful, at least for theoretical discussion, to look at priors which lead to mathematically tractable answers. One such form, useful usually only for nuisance parameters, is to take a distribution with finite support, in particular a two-point prior. This has in one dimension three adjustable parameters, the position of two points and a probability, and for some limited purposes this may be adequate. Because the posterior distribution remains concentrated on the same two points computational aspects are much simplified.

We shall not develop that idea further and turn instead to other examples of parametric conjugate priors which exploit the consequences of exponential family structure as exemplified in Section 2.4.

Example 4.12. Normal variance. Suppose that $Y_{1}, \ldots, Y_{n}$ are independently normally distributed with known mean, taken without loss of generality to be zero, and unknown variance $\sigma^{2}$. The likelihood is, except for a constant factor,
$$\frac{1}{\sigma^{n}} \exp \left\{-\Sigma y_{k}^{2} /\left(2 \sigma^{2}\right)\right\} .$$
The canonical parameter is $\phi=1 / \sigma^{2}$ and simplicity of structure suggests taking $\phi$ to have a prior gamma density which it is convenient to write in the form
$$\pi\left(\phi ; g, n_{\pi}\right)=g(g \phi)^{n_{\pi} / 2-1} e^{-g \phi} / \Gamma\left(n_{\pi} / 2\right),$$

defined by two quantities assumed known. One is $n_{\pi}$, which plays the role of an effective sample size attached to the prior density, by analogy with the form of the chi-squared density with $n$ degrees of freedom. The second defining quantity is $g$. Transformed into a distribution for $\sigma^{2}$ it is often called the inverse gamma distribution. Also $E_{\pi}(\Phi)=n_{\pi} /(2 g)$.

On multiplying the likelihood by the prior density, the posterior density of $\Phi$ is proportional to
$$\phi^{\left(n+n_{\pi}\right) / 2-1} \exp \left[-\left\{\Sigma y_{k}^{2}+n_{\pi} / E_{\pi}(\Phi)\right\} \phi / 2\right] .$$
The posterior distribution is in effect found by treating
$$\left\{\Sigma y_{k}^{2}+n_{\pi} / E_{\pi}(\Phi)\right\} \Phi$$
as having a chi-squared distribution with $n+n_{\pi}$ degrees of freedom.

Formally, frequentist inference is based on the pivot $\Sigma Y_{k}^{2} / \sigma^{2}$, the pivotal distribution being the chi-squared distribution with $n$ degrees of freedom. There is formal, although of course not conceptual, equivalence between the two methods when $n_{\pi}=0$. This arises from the improper prior $d \phi / \phi$, equivalent to $d \sigma / \sigma$ or to a uniform improper prior for $\log \sigma$. That is, while there is never in this setting exact agreement between Bayesian and frequentist solutions, the latter can be approached as a limit as $n_{\pi} \rightarrow 0$.
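The conjugate update in Example 4.12 can be sketched in a few lines. The code assumes the gamma parameterization given above, so that the posterior for $\phi$ is a gamma distribution with shape $(n+n_{\pi})/2$ and rate $\{\Sigma y_{k}^{2}+n_{\pi}/E_{\pi}(\Phi)\}/2$; the function names and the data values are hypothetical.

```python
import random

def posterior_gamma_params(y, n_prior, prior_mean_phi):
    """Shape and rate of the gamma posterior for phi = 1 / sigma^2."""
    n = len(y)
    sum_sq = sum(v * v for v in y)
    shape = (n + n_prior) / 2.0
    rate = (sum_sq + n_prior / prior_mean_phi) / 2.0
    return shape, rate

def sample_phi(shape, rate, size, seed=0):
    """Draw from the posterior of phi."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so the scale is 1 / rate
    return [rng.gammavariate(shape, 1.0 / rate) for _ in range(size)]

if __name__ == "__main__":
    y = [1.0, -2.0, 0.5, 1.5]
    shape, rate = posterior_gamma_params(y, n_prior=2, prior_mean_phi=0.5)
    draws = sample_phi(shape, rate, size=50000)
    scaled = [2.0 * rate * phi for phi in draws]  # {sum y^2 + n_pi / E(Phi)} * Phi
    print(sum(scaled) / len(scaled))  # should be near n + n_pi
```

The scaled draws should behave like a chi-squared variable with $n+n_{\pi}$ degrees of freedom, so their average should be close to $n+n_{\pi}$.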

## More complicated situations

## General remarks

The previous frequentist discussion, especially in Chapter 3, yields a theoretical approach which is limited in two senses. It is restricted to problems with no nuisance parameters or ones in which elimination of nuisance parameters is straightforward. An important step in generalizing the discussion is to extend the notion of a Fisherian reduction. Then we turn to a more systematic discussion of the role of nuisance parameters.

By comparison, as noted previously in Section 1.5, a great formal advantage of the Bayesian formulation is that, once the formulation is accepted, all subsequent problems are computational and the simplifications consequent on sufficiency serve only to ease calculations.

## General Bayesian formulation

The argument outlined in Section 1.5 for inference about the mean of a normal distribution can be generalized as follows. Consider the model $f_{Y \mid \Theta}(y \mid \theta)$, where, because we are going to treat the unknown parameter as a random variable, we now regard the model for the data-generating process as a conditional density. Suppose that $\Theta$ has the prior density $f_{\Theta}(\theta)$, specifying the marginal distribution of the parameter, i.e., in effect the distribution $\Theta$ has when the observations $y$ are not available.

Given the data and the above formulation it is reasonable to assume that all information about $\theta$ is contained in the conditional distribution of $\Theta$ given $Y=y$. We call this the posterior distribution of $\Theta$ and calculate it by the standard laws of probability theory, as given in (1.12), by
$$f_{\Theta \mid Y}(\theta \mid y)=\frac{f_{Y \mid \Theta}(y \mid \theta) f_{\Theta}(\theta)}{\int_{\Omega_{\theta}} f_{Y \mid \Theta}(y \mid \phi) f_{\Theta}(\phi) d \phi} .$$
The main problem in computing this lies in evaluating the normalizing constant in the denominator, especially if the dimension of the parameter space is high. Finally, to isolate the information about the parameter of interest $\psi$, we marginalize the posterior distribution over the nuisance parameter $\lambda$. That is, writing $\theta=(\psi, \lambda)$, we consider
$$f_{\Psi \mid Y}(\psi \mid y)=\int f_{\Theta \mid Y}(\psi, \lambda \mid y) d \lambda .$$
The models and parameters for which this leads to simple explicit solutions are broadly those for which frequentist inference yields simple solutions.

Because in the formula for the posterior density the prior density enters both in the numerator and the denominator, formal multiplication of the prior density by a constant would leave the answer unchanged. That is, there is no need for the prior measure to be normalized to 1 so that, formally at least, improper, i.e. divergent, prior densities may be used, always provided proper posterior densities result. A simple example in the context of the normal mean, Section 1.5, is to take as prior density element $f_{M}(\mu) d \mu$ just $d \mu$. This could be regarded as the limit of the proper normal prior with variance $v$ taken as $v \rightarrow \infty$. Such limits raise few problems in simple cases, but in complicated multiparameter problems considerable care would be needed were such limiting notions contemplated. There results here a posterior distribution for the mean that is normal with mean $\bar{y}$ and variance $\sigma_{0}^{2} / n$, leading to posterior limits numerically identical to confidence limits.
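The flat-prior result for the normal mean can be checked by applying Bayes' theorem numerically. The sketch below is only an illustration: the integral in the denominator is replaced by a sum over an equally spaced grid, and the function names, data, and grid are hypothetical.

```python
import math

def grid_posterior(y, sigma0, prior, grid):
    """Posterior density of mu on an equally spaced grid via Bayes' theorem."""
    step = grid[1] - grid[0]
    log_post = [
        -sum((yk - mu) ** 2 for yk in y) / (2.0 * sigma0 ** 2) + math.log(prior(mu))
        for mu in grid
    ]
    peak = max(log_post)  # subtract the peak for numerical stability
    unnorm = [math.exp(v - peak) for v in log_post]
    norm = sum(unnorm) * step  # numerical stand-in for the denominator integral
    return [u / norm for u in unnorm]

def mean_and_variance(grid, density):
    """Mean and variance of a density tabulated on an equally spaced grid."""
    step = grid[1] - grid[0]
    m = sum(mu * d for mu, d in zip(grid, density)) * step
    v = sum((mu - m) ** 2 * d for mu, d in zip(grid, density)) * step
    return m, v

def flat_prior(mu):
    """Improper prior element d mu; any positive constant gives the same posterior."""
    return 1.0

if __name__ == "__main__":
    y = [4.1, 5.3, 4.7, 5.9]
    grid = [0.01 * i for i in range(1001)]  # mu from 0 to 10
    post = grid_posterior(y, sigma0=1.0, prior=flat_prior, grid=grid)
    print(mean_and_variance(grid, post))  # close to (ybar, sigma0^2 / n)
```

Replacing `flat_prior` by any other positive constant leaves the output unchanged, which is the point made above about the prior entering both numerator and denominator.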

In fact, with a scalar parameter it is possible in some generality to find a prior giving very close agreement with corresponding confidence intervals. With multidimensional parameters this is not in general possible and naive use of flat priors can lead to procedures that are very poor from all perspectives; see Example 5.5.

## Frequentist analysis

One approach to simple problems is essentially that of Section 2.5 and can be summarized, as before, in the Fisherian reduction:

• find the likelihood function;
• reduce to a sufficient statistic $S$ of the same dimension as $\theta$;
• find a function of $S$ that has a distribution depending only on $\psi$;
• place it in pivotal form or alternatively use it to derive $p$-values for null hypotheses;
• invert to obtain limits for $\psi$ at an arbitrary set of probability levels.

There is sometimes an extension of the method that works when the model is of the $(k, d)$ curved exponential family form. Then the sufficient statistic is of dimension $k$ greater than $d$, the dimension of the parameter space. We then proceed as follows:

• if possible, rewrite the $k$-dimensional sufficient statistic, when $k>d$, in the form $(S, A)$ such that $S$ is of dimension $d$ and $A$ has a distribution not depending on $\theta$;
• consider the distribution of $S$ given $A=a$ and proceed as before. The statistic $A$ is called ancillary.

There are limitations to these methods. In particular a suitable $A$ may not exist, and then one is driven to asymptotic, i.e., approximate, arguments for problems of reasonable complexity and sometimes even for simple problems.

We give some examples, the first of which is not of exponential family form.

Example 4.1. Uniform distribution of known range. Suppose that $\left(Y_{1}, \ldots, Y_{n}\right)$ are independently and identically distributed in the uniform distribution over $(\theta-1, \theta+1)$. The likelihood takes the constant value $2^{-n}$ provided the smallest and largest values $\left(y_{(1)}, y_{(n)}\right)$ lie within the range $(\theta-1, \theta+1)$ and is zero otherwise. The minimal sufficient statistic is of dimension 2, even though the parameter is only of dimension 1. The model is a special case of a location family and it follows from the invariance properties of such models that $A=Y_{(n)}-Y_{(1)}$ has a distribution independent of $\theta$.

This example shows the imperative of explicit or implicit conditioning on the observed value $a$ of $A$ in quite compelling form. If $a$ is approximately 2, only values of $\theta$ very close to $y^{*}=\left(y_{(1)}+y_{(n)}\right) / 2$ are consistent with the data. If, on the other hand, $a$ is very small, all values in the range of the common observed value $y^{*}$ plus and minus 1 are consistent with the data. In general, the conditional distribution of $Y^{*}$ given $A=a$ is found as follows. The joint density of $\left(Y_{(1)}, Y_{(n)}\right)$ is
$$n(n-1)\left(y_{(n)}-y_{(1)}\right)^{n-2} / 2^{n}$$
and the transformation to new variables $\left(Y^{*}, A=Y_{(n)}-Y_{(1)}\right)$ has unit Jacobian. Therefore the new variables $\left(Y^{*}, A\right)$ have density $n(n-1) a^{(n-2)} / 2^{n}$ defined over the triangular region $\left(0 \leq a \leq 2 ; \theta-1+a / 2 \leq y^{*} \leq \theta+1-a / 2\right)$ and density zero elsewhere. This implies that the conditional density of $Y^{*}$ given $A=a$ is uniform over the allowable interval $\theta-1+a / 2 \leq y^{*} \leq \theta+1-a / 2$.

Conditional confidence interval statements can now be constructed although they add little to the statement just made, in effect that every value of $\theta$ in the relevant interval is in some sense equally consistent with the data. The key point is that an interval statement assessed by its unconditional distribution could be formed that would give the correct marginal frequency of coverage but that would hide the fact that for some samples very precise statements are possible whereas for others only low precision is achievable.
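The role of conditioning in Example 4.1 can be illustrated by simulation. The sketch below assumes one particular conditional interval, the midrange plus and minus $(1-c)(1-a/2)$, which has conditional coverage $1-c$ because the midrange is uniform given $A=a$. This choice of interval, the function names, and the split of samples at $a=1$ are assumptions made for illustration only.

```python
import random

def consistent_interval(y):
    """Values of theta consistent with a sample from U(theta - 1, theta + 1)."""
    lo, hi = min(y), max(y)
    a = hi - lo  # observed value of the ancillary statistic A
    y_star = (lo + hi) / 2.0  # midrange
    half_width = 1.0 - a / 2.0
    return y_star - half_width, y_star + half_width, a

def conditional_interval(y, c):
    """Interval midrange +/- (1 - c)(1 - a/2), with coverage 1 - c given A = a."""
    lo, hi, a = consistent_interval(y)
    y_star = (lo + hi) / 2.0
    half = (1.0 - c) * (1.0 - a / 2.0)
    return y_star - half, y_star + half

def coverage_by_range(theta, n, c, n_rep=40000, seed=2):
    """Coverage of the conditional interval, split by small and large a."""
    rng = random.Random(seed)
    counts = {"small a": [0, 0], "large a": [0, 0]}
    for _ in range(n_rep):
        y = [rng.uniform(theta - 1.0, theta + 1.0) for _ in range(n)]
        left, right = conditional_interval(y, c)
        key = "small a" if max(y) - min(y) < 1.0 else "large a"
        counts[key][0] += left <= theta <= right
        counts[key][1] += 1
    return {k: hits / total for k, (hits, total) in counts.items() if total}

if __name__ == "__main__":
    print(consistent_interval([0.2, 0.9, 1.7]))
    print(coverage_by_range(theta=0.0, n=3, c=0.1))
```

The coverage should be close to $1-c$ in both groups, while the interval width $2(1-c)(1-a/2)$ varies greatly with $a$. An interval of fixed width calibrated only marginally would hide exactly this variation in achievable precision.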

## Relation with acceptance and rejection

There is a conceptual difference, but essentially no mathematical difference, between the discussion here and the treatment of testing as a two-decision problem, with control over the formal error probabilities. In this we fix in principle the probability of rejecting $H_{0}$ when it is true, usually denoted by $\alpha$, aiming to maximize the probability of rejecting $H_{0}$ when false. This approach demands the explicit formulation of alternative possibilities. Essentially it amounts to setting in advance a threshold for $p_{\mathrm{obs}}$. It is, of course, potentially appropriate when clear decisions are to be made, as for example in some classification problems. The previous discussion seems to match more closely scientific practice in these matters, at least for those situations where analysis and interpretation rather than decision-making are the focus.

That is, there is a distinction between the Neyman-Pearson formulation of testing regarded as clarifying the meaning of statistical significance via hypothetical repetitions and that same theory regarded as in effect an instruction on how to implement the ideas by choosing a suitable $\alpha$ in advance and reaching different decisions accordingly. The interpretation to be attached to accepting or rejecting a hypothesis is strongly context-dependent; the point at stake here, however, is more a question of the distinction between assessing evidence, as contrasted with deciding by a formal rule which of two directions to take.

## Formulation of alternatives and test statistics

As set out above, the simplest version of a significance test involves formulation of a null hypothesis $H_{0}$ and a test statistic $T$, large, or possibly extreme, values of which point against $H_{0}$. Choice of $T$ is crucial in specifying the kinds of departure from $H_{0}$ of concern. In this first formulation no alternative probability models are explicitly formulated; an implicit family of possibilities is specified via $T$. In fact many quite widely used statistical tests were developed in this way.

A second possibility is that the null hypothesis corresponds to a particular parameter value, say $\psi=\psi_{0}$, in a family of models and the departures of main interest correspond either, in the one-dimensional case, to one-sided alternatives $\psi>\psi_{0}$ or, more generally, to alternatives $\psi \neq \psi_{0}$. This formulation will suggest the most sensitive test statistic, essentially equivalent to the best estimate of $\psi$, and in the Neyman-Pearson formulation such an explicit formulation of alternatives is essential.

The approaches are, however, not quite as disparate as they may seem. Let $f_{0}(y)$ denote the density of the observations under $H_{0}$. Then we may associate with a proposed test statistic $T$ the exponential family
$$f_{0}(y) \exp \{t \theta-k(\theta)\},$$
where $k(\theta)$ is a normalizing constant. Then the test of $\theta=0$ most sensitive to these departures is based on $T$. Not all useful tests appear natural when viewed in this way, however; see, for instance, Example 3.5.

Many of the test procedures for examining model adequacy that are provided in standard software are best regarded as defined directly by the test statistic used rather than by a family of alternatives. In principle, as emphasized above, the null hypothesis is the conditional distribution of the data given the sufficient statistic for the parameters in the model. Then, within that null distribution, interesting directions of departure are identified.

The important distinction is between situations in which a whole family of distributions arises naturally as a base for analysis versus those where analysis is at a stage where only the null hypothesis is of explicit interest.

Tests where the null hypothesis itself is formulated in terms of arbitrary distributions, so-called nonparametric or distribution-free tests, illustrate the use of test statistics that are formulated largely or wholly informally, without specific probabilistically formulated alternatives in mind. To illustrate the arguments involved, consider initially a single homogeneous set of observations.

## Relation with interval estimation

While conceptually it may seem simplest to regard estimation with uncertainty as a simpler and more direct mode of analysis than significance testing, there are some important advantages, especially in dealing with relatively complicated problems, in arguing in the other direction. Essentially, confidence intervals, or more generally confidence sets, can be produced by testing consistency with every possible value in $\Omega_{\psi}$ and taking all those values not ‘rejected’ at level $c$, say, to produce a $1-c$ level interval or region. This procedure has the property that in repeated applications any true value of $\psi$ will be included in the region except in a proportion $c$ of cases. This can be done at various levels $c$, using the same form of test throughout.

Example 3.6. Ratio of normal means. Given two independent sets of random variables from normal distributions of unknown means $\mu_{0}, \mu_{1}$ and with known variance $\sigma_{0}^{2}$, we first reduce by sufficiency to the sample means $\bar{y}_{0}, \bar{y}_{1}$. Suppose that the parameter of interest is $\psi=\mu_{1} / \mu_{0}$. Consider the null hypothesis $\psi=\psi_{0}$. Then we look for a statistic with a distribution under the null hypothesis that does not depend on the nuisance parameter. Such a statistic is
$$\frac{\bar{Y}_{1}-\psi_{0} \bar{Y}_{0}}{\sigma_{0} \sqrt{\left(1 / n_{1}+\psi_{0}^{2} / n_{0}\right)}};$$
this has a standard normal distribution under the null hypothesis. This with $\psi_{0}$ replaced by $\psi$ could be treated as a pivot provided that we can treat $\bar{Y}_{0}$ as positive.

Note that, provided the two distributions have the same variance, a similar result with the Student $t$ distribution replacing the standard normal would apply if the variance were unknown and had to be estimated. To treat the probably more realistic situation where the two distributions have different and unknown variances requires the approximate techniques of Chapter 6.

We now form a $1-c$ level confidence region by taking all those values of $\psi_{0}$ that would not be ‘rejected’ at level $c$ in this test. That is, we take the set
$$\left\{\psi: \frac{\left(\bar{Y}_{1}-\psi \bar{Y}_{0}\right)^{2}}{\sigma_{0}^{2}\left(1 / n_{1}+\psi^{2} / n_{0}\right)} \leq k_{1 ; c}^{*}\right\},$$
where $k_{1 ; c}^{*}$ is the upper $c$ point of the chi-squared distribution with one degree of freedom.

Thus we find the limits for $\psi$ as the roots of a quadratic equation. If there are no real roots, all values of $\psi$ are consistent with the data at the level in question. If the numerator and especially the denominator are poorly determined, a confidence interval consisting of the whole line may be the only rational conclusion to be drawn and is entirely reasonable from a testing point of view, even though regarded from a confidence interval perspective it may, wrongly, seem like a vacuous statement.
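As a rough illustration of the quadratic argument, the following Python sketch solves for the set of $\psi$ satisfying the inequality above. The function name, the return convention and the numerical inputs are hypothetical choices made for this example, and the upper $c$ point of the chi-squared distribution with one degree of freedom is computed as the square of the standard normal upper $c/2$ point.

```python
import math
from statistics import NormalDist


def ratio_confidence_set(ybar0, ybar1, n0, n1, sigma0, c=0.05):
    """Solve (ybar1 - psi*ybar0)^2 <= k * sigma0^2 * (1/n1 + psi^2/n0) for psi.

    Returns (kind, lo, hi) where kind is "interval" (lo <= psi <= hi),
    "complement" (psi <= lo or psi >= hi) or "whole line".
    """
    # Upper c point of chi-squared(1) equals the squared normal upper c/2 point.
    k = NormalDist().inv_cdf(1 - c / 2) ** 2
    a = k * sigma0**2 / n0
    b = k * sigma0**2 / n1
    # Rearranged inequality: A*psi^2 + B*psi + C <= 0.
    A = ybar0**2 - a
    B = -2.0 * ybar0 * ybar1
    C = ybar1**2 - b
    if A == 0:
        raise ValueError("boundary case A == 0 is not handled in this sketch")
    disc = B**2 - 4.0 * A * C
    if disc <= 0:
        # No distinct real roots; this can only happen when A < 0,
        # and then every psi satisfies the inequality.
        return ("whole line", None, None)
    r1 = (-B - math.sqrt(disc)) / (2.0 * A)
    r2 = (-B + math.sqrt(disc)) / (2.0 * A)
    lo, hi = min(r1, r2), max(r1, r2)
    # A > 0: parabola opens upward, so the set lies between the roots.
    # A < 0: parabola opens downward, so the set lies outside the roots.
    return ("interval", lo, hi) if A > 0 else ("complement", lo, hi)


# Well-determined denominator: a finite interval around ybar1/ybar0.
print(ratio_confidence_set(10.0, 20.0, 25, 25, 1.0))
# Poorly determined numerator and denominator: the whole line.
print(ratio_confidence_set(0.1, 0.1, 25, 25, 1.0))
```

When the leading coefficient is negative but real roots exist, the sketch reports the complement of an interval, a case that also follows from the quadratic although the text does not discuss it separately.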


## Significance tests

## General remarks

So far, in our frequentist discussion we have summarized information about the unknown parameter $\psi$ by finding procedures that would give in hypothetical repeated applications upper (or lower) bounds for $\psi$ a specified proportion of times in a long run of repeated applications. This is close to but not the same as specifying a probability distribution for $\psi$; it avoids having to treat $\psi$ as a random variable, and moreover as one with a known distribution in the absence of the data.

Suppose now there is specified a particular value $\psi_{0}$ of the parameter of interest and we wish to assess the relation of the data to that value. Often the hypothesis that $\psi=\psi_{0}$ is called the null hypothesis and conventionally denoted by $H_{0}$. It may, for example, assert that some effect is zero or takes on a value given by a theory or by previous studies, although $\psi_{0}$ does not have to be restricted in that way.

There are at least six different situations in which this may arise, namely the following.

• There may be some special reason for thinking that the null hypothesis may be exactly or approximately true or strong subject-matter interest may focus on establishing that it is likely to be false.
• There may be no special reason for thinking that the null hypothesis is true but it is important because it divides the parameter space into two (or more) regions with very different interpretations. We are then interested in whether the data establish reasonably clearly which region is correct, for example it may establish the value of $\operatorname{sgn}\left(\psi-\psi_{0}\right)$.
• Testing may be a technical device used in the process of generating confidence intervals.
• Consistency with $\psi=\psi_{0}$ may provide a reasoned justification for simplifying an otherwise rather complicated model into one that is more transparent and which, initially at least, may be a reasonable basis for interpretation.
• Only the model when $\psi=\psi_{0}$ is under consideration as a possible model for interpreting the data and it has been embedded in a richer family just to provide a qualitative basis for assessing departure from the model.
• Only a single model is defined, but there is a qualitative idea of the kinds of departure that are of potential subject-matter interest.

The last two formulations are appropriate in particular for examining model adequacy.

From time to time in the discussion it is useful to use the short-hand description of $H_{0}$ as being possibly true. Now in statistical terms $H_{0}$ refers to a probability model and the very word ‘model’ implies idealization. With a very few possible exceptions it would be absurd to think that a mathematical model is an exact representation of a real system and in that sense all $H_{0}$ are defined within a system which is untrue. We use the term to mean that in the current state of knowledge it is reasonable to proceed as if the hypothesis is true. Note that an underlying subject-matter hypothesis such as that a certain environmental exposure has absolutely no effect on a particular disease outcome might indeed be true.

## Simple significance test

In the formulation of a simple significance test, we suppose available data $y$ and a null hypothesis $H_{0}$ that specifies the distribution of the corresponding random variable $Y$. In the first place, no other probabilistic specification is involved, although some notion of the type of departure from $H_{0}$ that is of subject-matter concern is essential.

The first step in testing $H_{0}$ is to find a distribution for observed random variables that has a form which, under $H_{0}$, is free of nuisance parameters, i.e., is completely known. This is trivial when there is a single unknown parameter whose value is precisely specified by the null hypothesis. Next find or determine a test statistic $T$, large (or extreme) values of which indicate a departure from the null hypothesis of subject-matter interest. Then if $t_{\mathrm{obs}}$ is the observed value of $T$ we define
$$p_{\mathrm{obs}}=P\left(T \geq t_{\mathrm{obs}}\right),$$
the probability being evaluated under $H_{0}$, to be the (observed) $p$-value of the test.

It is conventional in many fields to report only very approximate values of $p_{\text {obs }}$, for example that the departure from $H_{0}$ is significant just past the 1 per cent level, etc.

The hypothetical frequency interpretation of such reported significance levels is as follows. If we were to accept the available data as just decisive evidence against $H_{0}$, then we would reject the hypothesis when true a long-run proportion $p_{\mathrm{obs}}$ of times.

Put more qualitatively, we examine consistency with $H_{0}$ by finding the consequences of $H_{0}$, in this case a random variable with a known distribution, and seeing whether the prediction about its observed value is reasonably well fulfilled.
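A minimal sketch of the calculation of $p_{\mathrm{obs}}$ may help. It assumes, purely for illustration, independent normal observations with known standard deviation `sigma0`, a null hypothesis fixing the mean at `mu0`, and the sample mean as test statistic $T$; none of these choices comes from the text, and the data are made up.

```python
import math
from statistics import NormalDist, fmean


def p_value_normal_mean(y, mu0, sigma0):
    """One-sided p-value P(T >= t_obs; H0) with T the sample mean.

    Under H0 the sample mean is N(mu0, sigma0^2 / n), a completely
    known distribution, so no nuisance parameters remain.
    """
    n = len(y)
    t_obs = fmean(y)
    null_dist = NormalDist(mu=mu0, sigma=sigma0 / math.sqrt(n))
    return 1.0 - null_dist.cdf(t_obs)


# Hypothetical data; large values of the mean point against H0: mu = 0.
y = [0.2, -0.1, 0.4, 0.3, 0.2]
print(p_value_normal_mean(y, mu0=0.0, sigma0=0.5))
```

Only the null distribution of $T$ enters the calculation; the direction of departure of interest is encoded in the choice to look at the upper tail.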

We deal first with a very special case involving testing a null hypothesis that might be true to a close approximation.

## One- and two-sided tests

In many situations observed values of the test statistic in either tail of its distribution represent interpretable, although typically different, departures from $H_{0}$. The simplest procedure is then often to contemplate two tests, one for each tail, in effect taking the more significant, i.e., the smaller tail, as the basis for possible interpretation. Operational interpretation of the result as a hypothetical error rate is achieved by doubling the corresponding $p$, with a slightly more complicated argument in the discrete case.

More explicitly we argue as follows. With test statistic $T$, consider two $p$-values, namely
$$p_{\mathrm{obs}}^{+}=P\left(T \geq t ; H_{0}\right), \quad p_{\mathrm{obs}}^{-}=P\left(T \leq t ; H_{0}\right) .$$
In general the sum of these values is $1+P(T=t)$. In the two-sided case it is then reasonable to define a new test statistic
$$Q=\min \left(P_{\mathrm{obs}}^{+}, P_{\mathrm{obs}}^{-}\right).$$
The level of significance is
$$P\left(Q \leq q_{\mathrm{obs}} ; H_{0}\right).$$
In the continuous case this is $2 q_{\mathrm{obs}}$ because two disjoint events are involved. In a discrete problem it is $q_{\mathrm{obs}}$ plus the achievable $p$-value from the other tail of the distribution nearest to but not exceeding $q_{\mathrm{obs}}$. As has been stressed, the precise calculation of levels of significance is rarely if ever critical, so that the careful definition is more one of principle than of pressing applied importance. A more important point is that the definition is unaffected by a monotone transformation of $T$.
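The discrete rule can be made concrete with a small sketch. The binomial null distribution, the function names and the numerical tolerance `eps` below are assumptions introduced for illustration only; the logic follows the definition of $Q$ given above.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability P(T = k) for n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p(t_obs, n, p0, eps=1e-12):
    """Two-sided level P(Q <= q_obs; H0) for T ~ Binomial(n, p0)."""
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]
    upper = [sum(pmf[k:]) for k in range(n + 1)]      # P(T >= k)
    lower = [sum(pmf[: k + 1]) for k in range(n + 1)]  # P(T <= k)
    p_plus, p_minus = upper[t_obs], lower[t_obs]
    q_obs = min(p_plus, p_minus)
    # Achievable p-values in the tail opposite to the one giving q_obs.
    other_tail = lower if p_plus <= p_minus else upper
    achievable = [v for v in other_tail if v <= q_obs + eps]
    # Add the largest achievable value not exceeding q_obs (0 if none);
    # cap at 1 because the two tail events overlap when q_obs is large.
    return min(1.0, q_obs + max(achievable, default=0.0))


# Symmetric null: the two tails contribute equally.
print(two_sided_p(2, 10, 0.5))
# Asymmetric null: the other tail may contribute nothing at all.
print(two_sided_p(7, 10, 0.3))
```

In the asymmetric case no lower-tail $p$-value is as small as $q_{\mathrm{obs}}$, so the two-sided level coincides with the one-sided one rather than being doubled.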

In one sense very many applications of tests are essentially two-sided in that, even though initial interest may be in departures in one direction, it will rarely be wise to disregard totally departures in the other direction, even if initially they are unexpected. The interpretation of differences in the two directions may well be very different. Thus in the broad class of procedures associated with the linear model of Example 1.4, tests are sometimes based on the ratio of an estimated variance, expected to be large if real systematic effects are present, to an estimate essentially of error. A large ratio indicates the presence of systematic effects whereas a suspiciously small ratio suggests an inadequately specified model structure.


## Exponential family

We now consider a family of models that are important both because they have relatively simple properties and because they capture in one discussion many, although by no means all, of the inferential properties connected with important standard distributions, the binomial, Poisson and geometric distributions and the normal and gamma distributions, and others too.

Suppose that $\theta$ takes values in a well-behaved region in $R^{d}$, and not of lower dimension, and that we can find a $(d \times 1)$-dimensional statistic $s$ and a parameterization $\phi$, i.e., a $(1,1)$ transformation of $\theta$, such that the model has the form
$$m(y) \exp \left\{s^{T} \phi-k(\phi)\right\},$$
where $s=s(y)$ is a function of the data. Then $S$ is sufficient; subject to some important regularity conditions this is called a regular or full $(d, d)$ exponential family of distributions. The statistic $S$ is called the canonical statistic and $\phi$ the canonical parameter. The parameter $\eta=E(S ; \phi)$ is called the mean parameter. Because the stated function defines a distribution, it follows that
$$\begin{aligned} &\int m(y) \exp \left\{s^{T} \phi-k(\phi)\right\} d y=1, \\ &\int m(y) \exp \left\{s^{T}(\phi+p)-k(\phi+p)\right\} d y=1 . \end{aligned}$$
Hence, noting that $s^{T} p=p^{T} s$, we have from (2.11) that the moment generating function of $S$ is
$$E\left\{\exp \left(p^{T} S\right)\right\}=\exp \{k(\phi+p)-k(\phi)\}.$$
Therefore the cumulant generating function of $S$, defined as the log of the moment generating function, is
$$k(\phi+p)-k(\phi),$$
providing a fairly direct interpretation of the function $k(\cdot)$. Because the mean is given by the first derivatives of that generating function, we have that $\eta=\nabla k(\phi)$, where $\nabla$ is the gradient operator $\left(\partial / \partial \phi_{1}, \ldots, \partial / \partial \phi_{d}\right)^{T}$. See Note 2.3 for a brief account of both cumulant generating functions and the $\nabla$ notation.

Example 2.5. Binomial distribution. If $R$ denotes the number of successes in $n$ independent binary trials each with probability of success $\pi$, its density can be written
$$n ! /\{r !(n-r) !\}\, \pi^{r}(1-\pi)^{n-r}=m(r) \exp \left\{r \phi-n \log \left(1+e^{\phi}\right)\right\},$$
say, where $\phi=\log \{\pi /(1-\pi)\}$, often called the log odds, is the canonical parameter and $r$ the canonical statistic. Note that the mean parameter is $E(R)=n \pi$ and can be recovered also by differentiating $k(\phi)$.
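The two claims in Example 2.5 can be checked numerically. The sketch below is illustrative only: the function names, the finite-difference step `h` and the chosen values of $n$, $\pi$ and $p$ are assumptions. It compares the moment generating function computed directly from the binomial probabilities with $\exp\{k(\phi+p)-k(\phi)\}$, and the derivative of $k(\phi)=n \log(1+e^{\phi})$ with $n\pi$.

```python
import math


def k(phi, n):
    """Normalizing function k(phi) = n * log(1 + exp(phi)) of the binomial family."""
    return n * math.log1p(math.exp(phi))


def mgf_direct(p, n, pi):
    """E{exp(p R)} summed directly over the binomial probabilities."""
    return sum(
        math.comb(n, r) * pi**r * (1 - pi) ** (n - r) * math.exp(p * r)
        for r in range(n + 1)
    )


def mgf_from_k(p, n, phi):
    """E{exp(p R)} from the exponential family identity exp{k(phi + p) - k(phi)}."""
    return math.exp(k(phi + p, n) - k(phi, n))


def mean_from_k(phi, n, h=1e-6):
    """Mean parameter as the derivative of k, by a central difference."""
    return (k(phi + h, n) - k(phi - h, n)) / (2 * h)


n, pi = 12, 0.3
phi = math.log(pi / (1 - pi))  # canonical parameter (log odds)
print(mgf_direct(0.4, n, pi), mgf_from_k(0.4, n, phi))
print(mean_from_k(phi, n), n * pi)
```

The agreement is exact up to rounding for the moment generating function and up to the finite-difference error for the mean.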

## Choice of priors for exponential family problems

While in Bayesian theory choice of prior is in principle not an issue of achieving mathematical simplicity, nevertheless there are gains in using reasonably simple and flexible forms. In particular, if the likelihood has the full exponential family form
$$m(y) \exp \left\{s^{T} \phi-k(\phi)\right\},$$
a prior for $\phi$ proportional to
$$\exp \left\{s_{0}^{T} \phi-a_{0} k(\phi)\right\}$$
leads to a posterior proportional to
$$\exp \left\{\left(s+s_{0}\right)^{T} \phi-\left(1+a_{0}\right) k(\phi)\right\}.$$
Such a prior is called conjugate to the likelihood, or sometimes closed under sampling. The posterior distribution has the same form as the prior with $s_{0}$ replaced by $s+s_{0}$ and $a_{0}$ replaced by $1+a_{0}$.

Example 2.8. Binomial distribution (ctd). This continues the discussion of Example 2.5. If the prior for $\pi$ is proportional to
$$\pi^{r_{0}}(1-\pi)^{n_{0}-r_{0}}$$
i.e., is a beta distribution, then the posterior is another beta distribution corresponding to $r+r_{0}$ successes in $n+n_{0}$ trials. Thus both prior and posterior are beta distributions. It may help to think of $\left(r_{0}, n_{0}\right)$ as fictitious data! If the prior information corresponds to fairly small values of $n_{0}$ its effect on the conclusions will be small if the amount of real data is appreciable.
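The ‘fictitious data’ reading of Example 2.8 translates into a one-line update. In the sketch below the function names and the numerical values are hypothetical; the only facts used are that the kernel $\pi^{r}(1-\pi)^{n-r}$ is that of a beta distribution with parameters $(r+1, n-r+1)$, whose mean is $(r+1)/(n+2)$.

```python
def kernel(pi, r, n):
    """Unnormalized density pi^r (1 - pi)^(n - r) in pi."""
    return pi**r * (1 - pi) ** (n - r)


def beta_update(r0, n0, r, n):
    """Combine prior 'fictitious data' (r0, n0) with real data (r, n)."""
    return r + r0, n + n0


def beta_mean(r, n):
    """Mean of the beta distribution with kernel pi^r (1 - pi)^(n - r)."""
    return (r + 1) / (n + 2)


r0, n0 = 2, 4      # prior worth 2 successes in 4 fictitious trials
r, n = 30, 100     # observed data
r_post, n_post = beta_update(r0, n0, r, n)

# Prior kernel times likelihood kernel equals the posterior kernel pointwise.
for pi in (0.1, 0.3, 0.8):
    print(kernel(pi, r0, n0) * kernel(pi, r, n), kernel(pi, r_post, n_post))

# With n much larger than n0 the posterior mean stays close to the data-only value.
print(beta_mean(r_post, n_post), beta_mean(r, n))
```

The last line illustrates the closing remark of the example: a prior corresponding to a small $n_{0}$ moves the conclusion little when the amount of real data is appreciable.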

## Simple frequentist discussion

In Bayesian approaches sufficiency arises as a convenient simplification of the likelihood; whatever the prior the posterior is formed from the likelihood and hence depends on the data only via the sufficient statistic.

In frequentist approaches the issue is more complicated. Faced with a new model as the basis for analysis, we look for a Fisherian reduction, defined as follows:

• find the likelihood function;
• reduce to a sufficient statistic $S$ of the same dimension as $\theta$;
• find a function of $S$ that has a distribution depending only on $\psi$;
• invert that distribution to obtain limits for $\psi$ at an arbitrary set of probability levels;
• use the conditional distribution of the data given $S=s$ informally or formally to assess the adequacy of the formulation.

Immediate application is largely confined to regular exponential family models. While most of our discussion will centre on inference about the parameter of interest, $\psi$, the complementary role of sufficiency of providing an explicit base for model checking is in principle very important. It recognizes that our formulations are always to some extent provisional, and usually capable to some extent of empirical check; the universe of discussion is not closed. In general there is no specification of what to do if the initial formulation is inadequate but, while that might sometimes be clear in broad outline, it seems both in practice and in principle unwise to expect such a specification to be set out in detail in each application.

The next phase of the analysis is to determine how to use $s$ to answer more focused questions, for example about the parameter of interest $\psi$. The simplest possibility is that there are no nuisance parameters, just a single parameter $\psi$ of interest, and reduction to a single component $s$ occurs. We then have one observation on a known density $f_{S}(s ; \psi)$ and distribution function $F_{S}(s ; \psi)$. Subject to some monotonicity conditions which, in applications, are typically satisfied, the probability statement
$$P\left(S \leq a_{c}(\psi)\right)=F_{S}\left(a_{c}(\psi) ; \psi\right)=1-c$$
can be inverted for continuous random variables into
$$P\left\{\psi \leq b_{c}(S)\right\}=1-c.$$
Thus the statement on the basis of data $y$, yielding sufficient statistic $s$, that
$$\psi \leq b_{c}(s)$$
provides an upper bound for $\psi$, that is, a single member of a hypothetical long run of statements a proportion $1-c$ of which are true, generating a set of statements in principle at all values of $c$ in $(0,1)$.
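The long-run interpretation of the upper bound can be checked by simulation in a simple special case. The sketch below assumes, for illustration only, that $S$ is the mean of $n$ normal observations with unknown mean $\psi$ and known standard deviation `sigma0`, so that $b_{c}(s)=s+k_{c}\sigma_{0}/\sqrt{n}$ with $k_{c}$ the upper $c$ point of the standard normal distribution; the function names, seed and number of replications are arbitrary choices.

```python
import math
import random
from statistics import NormalDist


def upper_limit(s, n, sigma0, c):
    """Upper 1 - c limit b_c(s) for a normal mean with known sigma0."""
    k_c = NormalDist().inv_cdf(1 - c)
    return s + k_c * sigma0 / math.sqrt(n)


def coverage(psi, n, sigma0, c, reps=20000, seed=1):
    """Proportion of simulated statements psi <= b_c(S) that are true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # S is the sample mean, distributed N(psi, sigma0^2 / n).
        s = rng.gauss(psi, sigma0 / math.sqrt(n))
        if psi <= upper_limit(s, n, sigma0, c):
            hits += 1
    return hits / reps


print(upper_limit(1.8, 10, 1.0, 0.05))
print(coverage(2.0, 10, 1.0, 0.05))  # should be close to 0.95
```

Repeating the call with other values of `c` produces the family of statements at different levels that the text refers to.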


## Some concepts and simple applications

## Likelihood

The likelihood for the vector of observations $y$ is defined as
$$\operatorname{lik}(\theta ; y)=f_{Y}(y ; \theta),$$
considered in the first place as a function of $\theta$ for given $y$. Mostly we work with its logarithm $l(\theta ; y)$, often abbreviated to $l(\theta)$. Sometimes this is treated as a function of the random vector $Y$ rather than of $y$. The log form is convenient, in particular because $f$ will often be a product of component terms. Occasionally we work directly with the likelihood function itself. For nearly all purposes multiplying the likelihood formally by an arbitrary function of $y$, or equivalently adding an arbitrary such function to the log likelihood, would leave unchanged that part of the analysis hinging on direct calculations with the likelihood.

Any calculation of a posterior density, whatever the prior distribution, uses the data only via the likelihood. Beyond that, there is some intuitive appeal in the idea that differences in $l(\theta)$ measure the relative effectiveness of different parameter values $\theta$ in explaining the data. This is sometimes elevated into a principle called the law of the likelihood.

A key issue concerns the additional arguments needed to extract useful information from the likelihood, especially in relatively complicated problems possibly with many nuisance parameters. Likelihood will play a central role in almost all the arguments that follow.

## Sufficiency

The term statistic is often used (rather oddly) to mean any function of the observed random variable $Y$ or its observed counterpart. A statistic $S=S(Y)$ is called sufficient under the model if the conditional distribution of $Y$ given $S=s$ is independent of $\theta$ for all $s, \theta$. Equivalently
$$l(\theta ; y)=\log h(s, \theta)+\log m(y),$$
for suitable functions $h$ and $m$. The equivalence forms what is called the Neyman factorization theorem. The proof in the discrete case follows most explicitly by defining any new variable $W$, a function of $Y$, such that $Y$ is in $(1,1)$ correspondence with $(S, W)$, i.e., such that $(S, W)$ determines $Y$. The individual atoms of probability are unchanged by transformation. That is,
$$f_{Y}(y ; \theta)=f_{S, W}(s, w ; \theta)=f_{S}(s ; \theta) f_{W \mid S}(w ; s),$$
where the last term is independent of $\theta$ by definition. In the continuous case there is the minor modification that a Jacobian, not involving $\theta$, is needed when transforming from $Y$ to $(S, W)$. See Note 2.2.

We use the minimal form of $S$; i.e., extra components could always be added to any given $S$ and the sufficiency property retained. Such addition is undesirable and is excluded by the requirement of minimality. The minimal form always exists and is essentially unique.

Any Bayesian inference uses the data only via the minimal sufficient statistic. This is because the calculation of the posterior distribution involves multiplying the likelihood by the prior and normalizing. Any factor of the likelihood that is a function of $y$ alone will disappear after normalization.

In a broader context the importance of sufficiency can be considered to arise as follows. Suppose that instead of observing $Y=y$ we were equivalently to be given the data in two stages:

• first we observe $S=s$, an observation from the density $f_{S}(s ; \theta)$;
• then we are given the remaining data, in effect an observation from the density $f_{Y \mid S}(y ; s)$.

Now, so long as the model holds, the second stage is an observation on a fixed and known distribution which could as well have been obtained from a random number generator. Therefore $S=s$ contains all the information about $\theta$ given the model, whereas the conditional distribution of $Y$ given $S=s$ allows assessment of the model.
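The two-stage description can be verified exactly in a small discrete case. The sketch below assumes, as an illustration not taken from this section, $n$ independent binary trials with success probability `pi` and $S$ equal to the number of successes; it enumerates the conditional distribution of $Y$ given $S=s$ for two different values of `pi`.

```python
from itertools import product
from math import comb


def conditional_given_sum(n, s, pi):
    """Exact P(Y = y | S = s) for independent binary trials, S = sum of Y."""
    joint = {
        y: pi ** sum(y) * (1 - pi) ** (n - sum(y))
        for y in product((0, 1), repeat=n)
        if sum(y) == s
    }
    total = sum(joint.values())  # this is P(S = s)
    return {y: pr / total for y, pr in joint.items()}


n, s = 5, 2
for pi in (0.2, 0.7):
    cond = conditional_given_sum(n, s, pi)
    # Every arrangement with s successes has conditional probability 1 / C(n, s).
    print(pi, sorted(set(round(v, 12) for v in cond.values())), 1 / comb(n, s))
```

Because the second-stage distribution is the same whatever the value of `pi`, it could indeed have been produced by a random number generator, and it carries information only about the adequacy of the model.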

## Simple examples

Example 2.1. Exponential distribution (ctd). The likelihood for Example 1.6 is
$$\rho^{n} \exp \left(-\rho \Sigma y_{k}\right),$$
so that the log likelihood is
$$n \log \rho-\rho \Sigma y_{k},$$
and, assuming $n$ to be fixed, involves the data only via $\Sigma y_{k}$ or equivalently via $\bar{y}=\Sigma y_{k} / n$. By the factorization theorem the sum (or mean) is therefore sufficient. Note that had the sample size also been random the sufficient statistic would have been $\left(n, \Sigma y_{k}\right)$; see Example 2.4 for further discussion.
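A short numerical sketch of this point: two samples of the same size and with the same total give identical log likelihoods at every value of $\rho$. The function names and the data are invented for illustration.

```python
import math


def loglik_full(rho, y):
    """Log likelihood as a sum of log densities log(rho * exp(-rho * y_k))."""
    return sum(math.log(rho) - rho * yk for yk in y)


def loglik_sufficient(rho, n, total):
    """Same log likelihood written via the sufficient statistic: n log rho - rho * total."""
    return n * math.log(rho) - rho * total


y_a = [0.5, 1.5, 4.0]  # hypothetical sample, total 6.0
y_b = [2.0, 2.0, 2.0]  # different sample, same n and same total

for rho in (0.1, 0.5, 2.0):
    print(rho, loglik_full(rho, y_a), loglik_full(rho, y_b),
          loglik_sufficient(rho, 3, 6.0))
```

Any inference based on the likelihood alone therefore cannot distinguish the two samples; their differences matter only for checking the model.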

In this example the density of $S=\Sigma Y_{k}$ is $\rho(\rho s)^{n-1} e^{-\rho s} /(n-1)!$, a gamma distribution. It follows that $\rho S$ has a fixed distribution. It follows also that the joint conditional density of the $Y_{k}$ given $S=s$ is uniform over the simplex $0 \leq y_{k} \leq s ; \Sigma y_{k}=s$. This can be used to test the adequacy of the model.

Example 2.2. Linear model (ctd). A minimal sufficient statistic for the linear model, Example 1.4, consists of the least squares estimates and the residual sum of squares. This strong justification of the use of least squares estimates depends on writing the log likelihood in the form
$$-n \log \sigma-(y-z \beta)^{T}(y-z \beta) /\left(2 \sigma^{2}\right)$$
and then noting that
$$(y-z \beta)^{T}(y-z \beta)=(y-z \hat{\beta})^{T}(y-z \hat{\beta})+(\hat{\beta}-\beta)^{T}\left(z^{T} z\right)(\hat{\beta}-\beta),$$
in virtue of the equations defining the least squares estimates. This last identity has a direct geometrical interpretation. The squared norm of the vector defined by the difference between $Y$ and its expected value $z \beta$ is decomposed into a component defined by the difference between $Y$ and the estimated mean $z \hat{\beta}$ and an orthogonal component defined via $\hat{\beta}-\beta$. See Figure 1.1.

It follows that the log likelihood involves the data only via the least squares estimates and the residual sum of squares. Moreover, if the variance $\sigma^{2}$ were known, the residual sum of squares would be a constant term in the log likelihood and hence the sufficient statistic would be reduced to $\hat{\beta}$ alone.
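The decomposition of $(y-z \beta)^{T}(y-z \beta)$ used in this argument can be verified numerically. The following is a minimal sketch under stated assumptions: the design matrix, the parameter values and the helper names `total_ss` and `decomposed_ss` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical full-rank design matrix z (n x q) and simulated responses y.
n, q = 20, 3
z = rng.normal(size=(n, q))
beta_true = np.array([1.0, -2.0, 0.5])
y = z @ beta_true + rng.normal(scale=0.3, size=n)

# Least squares estimates and residual sum of squares.
beta_hat, *_ = np.linalg.lstsq(z, y, rcond=None)
rss = float((y - z @ beta_hat) @ (y - z @ beta_hat))


def total_ss(beta):
    """(y - z beta)^T (y - z beta) computed directly."""
    r = y - z @ beta
    return float(r @ r)


def decomposed_ss(beta):
    """RSS + (beta_hat - beta)^T (z^T z) (beta_hat - beta)."""
    d = beta_hat - beta
    return rss + float(d @ (z.T @ z) @ d)


beta_trial = np.array([0.0, 1.0, 2.0])  # an arbitrary parameter value
lhs = total_ss(beta_trial)
rhs = decomposed_ss(beta_trial)
print(lhs, rhs)
```

Since the two sides agree for any trial $\beta$, the data enter the log likelihood only through `beta_hat` and `rss`.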

This argument fails for a regression model nonlinear in the parameters, such as the exponential regression (1.5). In the absence of error the $n \times 1$ vector of observations then lies on a curved surface and while the least squares estimates are still given by orthogonal projection they satisfy nonlinear equations and the decomposition of the log likelihood which is the basis of the argument for sufficiency holds only as an approximation obtained by treating the curved surface as locally flat.

## Bayesian discussion

In the second approach to the problem we treat $\mu$ as having a probability distribution both with and without the data. This raises two questions: what is the meaning of probability in such a context, some extended or modified notion of probability usually being involved, and how do we obtain numerical values for the relevant probabilities? This is discussed further later, especially in Chapter 5. For the moment we assume some such notion of probability concerned with measuring uncertainty is available.

If indeed we can treat $\mu$ as the realized but unobserved value of a random variable $M$, all is in principle straightforward. By Bayes’ theorem, i.e., by simple laws of probability,
$$f_{M \mid Y}(\mu \mid y)=f_{Y \mid M}(y \mid \mu) f_{M}(\mu) / \int f_{Y \mid M}(y \mid \phi) f_{M}(\phi) d \phi .$$
The left-hand side is called the posterior density of $M$ and of the two terms in the numerator the first is determined by the model and the other, $f_{M}(\mu)$, forms the prior distribution summarizing information about $M$ not arising from $y$. Any method of inference treating the unknown parameter as having a probability distribution is called Bayesian or, in an older terminology, an argument of inverse probability. The latter name arises from the inversion of the order of target and conditioning events as between the model and the posterior density.
The intuitive idea is that in such cases all relevant information about $\mu$ is then contained in the conditional distribution of the parameter given the data, that this is determined by the elementary formulae of probability theory and that remaining problems are solely computational.

In our example suppose that the prior for $\mu$ is normal with known mean $m$ and variance $v$. Then the posterior density for $\mu$ is proportional to
$$\exp \left\{-\Sigma\left(y_{k}-\mu\right)^{2} /\left(2 \sigma_{0}^{2}\right)-(\mu-m)^{2} /(2 v)\right\}$$
considered as a function of $\mu$. On completing the square as a function of $\mu$, there results a normal distribution of mean and variance respectively
$$\begin{gathered} \frac{\bar{y} /\left(\sigma_{0}^{2} / n\right)+m / v}{1 /\left(\sigma_{0}^{2} / n\right)+1 / v}, \\ \frac{1}{1 /\left(\sigma_{0}^{2} / n\right)+1 / v}; \end{gathered}$$
for more details of the argument, see Note 1.5. Thus an upper limit for $\mu$ satisfied with posterior probability $1-c$ is
$$\frac{\bar{y} /\left(\sigma_{0}^{2} / n\right)+m / v}{1 /\left(\sigma_{0}^{2} / n\right)+1 / v}+k_{c}^{*} \sqrt{\frac{1}{1 /\left(\sigma_{0}^{2} / n\right)+1 / v}}.$$
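The posterior mean, variance and upper limit can be computed directly from these formulae. The sketch below uses only the Python standard library; the function names `normal_posterior` and `posterior_upper_limit`, the data and the prior settings are assumptions chosen for illustration, and $k_{c}^{*}$ is taken to satisfy $\Phi(k_{c}^{*})=1-c$.

```python
from statistics import NormalDist, fmean


def normal_posterior(ys, sigma0_sq, m, v):
    """Posterior mean and variance of mu for a N(m, v) prior and known variance sigma0_sq."""
    n = len(ys)
    ybar = fmean(ys)
    data_precision = n / sigma0_sq   # 1 / (sigma0^2 / n)
    prior_precision = 1.0 / v        # 1 / v
    post_var = 1.0 / (data_precision + prior_precision)
    post_mean = (ybar * data_precision + m * prior_precision) * post_var
    return post_mean, post_var


def posterior_upper_limit(ys, sigma0_sq, m, v, c):
    """Upper limit for mu holding with posterior probability 1 - c."""
    post_mean, post_var = normal_posterior(ys, sigma0_sq, m, v)
    k_c = NormalDist().inv_cdf(1.0 - c)  # Phi(k_c) = 1 - c
    return post_mean + k_c * post_var ** 0.5


ys = [4.8, 5.6, 5.1, 4.9, 5.3]  # hypothetical observations
post_mean, post_var = normal_posterior(ys, sigma0_sq=1.0, m=0.0, v=100.0)
upper = posterior_upper_limit(ys, sigma0_sq=1.0, m=0.0, v=100.0, c=0.025)
print(post_mean, post_var, upper)
```

The posterior mean is a precision-weighted average of $\bar{y}$ and $m$; as $v$ grows the prior term vanishes and the limit approaches $\bar{y}+k_{c}^{*} \sigma_{0} / \sqrt{n}$.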

## Some further discussion

We now give some more detailed discussion especially of Example 1.4 and outline a number of special models that illustrate important issues.

The linear model of Example 1.4 and methods of analysis of it stemming from the method of least squares are of much direct importance and also are the base of many generalizations. The central results can be expressed in matrix form centring on the least squares estimating equations
$$z^{T} z \hat{\beta}=z^{T} Y,$$
the vector of fitted values
$$\hat{Y}=z \hat{\beta},$$
and the residual sum of squares
$$\text { RSS }=(Y-\hat{Y})^{T}(Y-\hat{Y})=Y^{T} Y-\hat{\beta}^{T}\left(z^{T} z\right) \hat{\beta} .$$
Insight into the form of these results is obtained by noting that were it not for random error the vector $Y$ would lie in the space spanned by the columns of $z$, that $\hat{Y}$ is the orthogonal projection of $Y$ onto that space, defined thus by
$$z^{T}(Y-\hat{Y})=z^{T}(Y-z \hat{\beta})=0$$
and that the residual sum of squares is the squared norm of the component of $Y$ orthogonal to the columns of $z$. See Figure 1.1.
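These matrix results translate directly into a few lines of linear algebra. The sketch below assumes NumPy and a hypothetical design matrix with simulated data; it solves the estimating equations, forms the fitted values, and checks both expressions for the residual sum of squares together with the orthogonality condition.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design matrix: a column of 1s and one explanatory variable.
n = 12
z = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
Y = z @ np.array([2.0, 3.0]) + rng.normal(scale=0.5, size=n)

# Solve the estimating equations z^T z beta_hat = z^T Y.
beta_hat = np.linalg.solve(z.T @ z, z.T @ Y)

# Fitted values (orthogonal projection of Y onto the column space of z).
Y_hat = z @ beta_hat
residual = Y - Y_hat

# Two expressions for the residual sum of squares.
rss_direct = float(residual @ residual)
rss_formula = float(Y @ Y - beta_hat @ (z.T @ z) @ beta_hat)

# z^T (Y - Y_hat) should vanish up to rounding error.
orthogonality = z.T @ residual
print(rss_direct, rss_formula, orthogonality)
```

Solving the normal equations explicitly mirrors the text; for ill-conditioned $z$ a QR-based routine such as `np.linalg.lstsq` would usually be preferred in practice.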

There is a fairly direct generalization of these results to the nonlinear regression model of Example 1.5. Here if there were no error the observations would lie on the surface defined by the vector $\mu(\beta)$ as $\beta$ varies. Orthogonal projection involves finding the point $\mu(\hat{\beta})$ closest to $Y$ in the least squares sense, i.e., minimizing the sum of squares of deviations $\{Y-\mu(\beta)\}^{T}\{Y-\mu(\beta)\}$. The resulting equations defining $\hat{\beta}$ are best expressed by defining
$$z^{T}(\beta)=\nabla \mu^{T}(\beta),$$
where $\nabla$ is the $q \times 1$ gradient operator with respect to $\beta$, i.e., $\nabla^{T}=\left(\partial / \partial \beta_{1}, \ldots, \partial / \partial \beta_{q}\right)$. Thus $z(\beta)$ is an $n \times q$ matrix, reducing to the previous $z$ in the linear case. Just as the columns of $z$ define the linear model, the columns of $z(\beta)$ define the tangent space to the model surface evaluated at $\beta$. The least squares estimating equation is thus
$$z^{T}(\hat{\beta})\{Y-\mu(\hat{\beta})\}=0 .$$
The local linearization implicit in this is valuable for numerical iteration. One of the simplest special cases arises when $E\left(Y_{k}\right)=\beta_{0} \exp \left(-\beta_{1} z_{k}\right)$ and the geometry underlying the nonlinear least squares equations is summarized in Figure 1.2.
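One common way to exploit the local linearization is a Gauss-Newton iteration, sketched below for the special case $E\left(Y_{k}\right)=\beta_{0} \exp \left(-\beta_{1} z_{k}\right)$. This is an illustrative sketch, not a method prescribed by the source: the function names, starting values and simulated data are assumptions, NumPy is assumed available, and no step-size control is included, so convergence is not guaranteed from poor starting values.

```python
import numpy as np


def mu(beta, z):
    """Mean vector mu(beta) with components beta0 * exp(-beta1 * z_k)."""
    return beta[0] * np.exp(-beta[1] * z)


def tangent_matrix(beta, z):
    """n x 2 matrix z(beta) whose columns are the partial derivatives of mu(beta)."""
    e = np.exp(-beta[1] * z)
    return np.column_stack([e, -beta[0] * z * e])


def gauss_newton(Y, z, beta_start, n_iter=50, tol=1e-12):
    """Repeatedly solve the linear least squares problem in the tangent space."""
    beta = np.asarray(beta_start, dtype=float)
    for _ in range(n_iter):
        J = tangent_matrix(beta, z)
        step, *_ = np.linalg.lstsq(J, Y - mu(beta, z), rcond=None)
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta


rng = np.random.default_rng(2)
z = np.linspace(0.0, 4.0, 25)                      # hypothetical explanatory values
Y = mu(np.array([3.0, 0.7]), z) + rng.normal(scale=0.05, size=z.size)

beta_hat = gauss_newton(Y, z, beta_start=[2.0, 0.5])

# At convergence the estimating equation z^T(beta_hat){Y - mu(beta_hat)} = 0 should hold.
score = tangent_matrix(beta_hat, z).T @ (Y - mu(beta_hat, z))
print(beta_hat, score)
```

Each iteration is an ordinary linear least squares fit of the current residual on the columns of $z(\beta)$, which is the sense in which the curved surface is treated as locally flat.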

The simple examples used here in illustration have one component random variable attached to each observation and all random variables are mutually independent. In many situations random variation comes from several sources and random components attached to different component observations may not be independent, showing for example temporal or spatial dependence.

## Parameters

A central role is played throughout the book by the notion of a parameter vector, $\theta$. Initially this serves to index the different probability distributions making up the full model. If interest were exclusively in these probability distributions as such, any $(1,1)$ transformation of $\theta$ would serve equally well and the choice of a particular version would be essentially one of convenience. For most of the applications in mind here, however, the interpretation is via specific parameters and this raises the need both to separate parameters of interest, $\psi$, from nuisance parameters, $\lambda$, and to choose specific representations. In relatively complicated problems where several different research questions are under study different parameterizations may be needed for different purposes.

There are a number of criteria that may be used to define the individual component parameters. These include the following:

• the components should have clear subject-matter interpretations, for example as differences, rates of change or as properties such as in a physical context mass, energy and so on. If not dimensionless they should be measured on a scale unlikely to produce very large or very small values;
• it is desirable that this interpretation is retained under reasonable perturbations of the model;
• different components should not have highly correlated errors of estimation;
• statistical theory for estimation should be simple;
• if iterative methods of computation are needed then speedy and assured convergence is desirable.

The first criterion is of primary importance for parameters of interest, at least in the presentation of conclusions, but for nuisance parameters the other criteria are of main interest. There are considerable advantages in formulations leading to simple methods of analysis and judicious simplicity is a powerful aid to understanding, but for parameters of interest subject-matter meaning must have priority.

## Formulation of objectives

We can, as already noted, formulate possible objectives in two parts as follows.
Part I takes the family of models as given and aims to:

• give intervals or in general sets of values within which $\psi$ is in some sense likely to lie;
• assess the consistency of the data with a particular parameter value $\psi_{0}$;
• predict as yet unobserved random variables from the same random system that generated the data;
• use the data to choose one of a given set of decisions $\mathcal{D}$, requiring the specification of the consequences of various decisions.
Part II uses the data to examine the family of models via a process of model criticism. We return to this issue in Section 3.2.

We shall concentrate in this book largely but not entirely on the first two of the objectives in Part I, interval estimation and measuring consistency with specified values of $\psi$.

To an appreciable extent the theory of inference is concerned with generalizing to a wide class of models two approaches to these issues which will be outlined in the next section and with a critical assessment of these approaches.

## General remarks

Consider the first objective above, that of providing intervals or sets of values likely in some sense to contain the parameter of interest, $\psi$.

There are two broad approaches, called frequentist and Bayesian, respectively, both with variants. Alternatively the former approach may be said to be based on sampling theory and an older term for the latter is that it uses inverse probability. Much of the rest of the book is concerned with the similarities and differences between these two approaches. As a prelude to the general development we show a very simple example of the arguments involved.

We take for illustration Example 1.1, which concerns a normal distribution with unknown mean $\mu$ and known variance. In the formulation probability is used to model variability as experienced in the phenomenon under study and its meaning is as a long-run frequency in repetitions, possibly, or indeed often, hypothetical, of that phenomenon.

What can reasonably be said about $\mu$ on the basis of observations $y_{1}, \ldots, y_{n}$ and the assumptions about the model?

## Frequentist discussion

In the first approach we make no further probabilistic assumptions. In particular we treat $\mu$ as an unknown constant. Strong arguments can be produced for reducing the data to their mean $\bar{y}=\Sigma y_{k} / n$, which is the observed value of the corresponding random variable $\bar{Y}$. This random variable has under the assumptions of the model a normal distribution of mean $\mu$ and variance $\sigma_{0}^{2} / n$, so that in particular
$$P\left(\bar{Y}>\mu-k_{c}^{*} \sigma_{0} / \sqrt{n}\right)=1-c, \tag{1.9}$$
where, with $\Phi(\cdot)$ denoting the standard normal integral, $\Phi\left(k_{c}^{*}\right)=1-c$. For example with $c=0.025$, $k_{c}^{*}=1.96$. For a sketch of the proof, see Note 1.5. Thus the statement equivalent to (1.9) that
$$P\left(\mu<\bar{Y}+k_{c}^{*} \sigma_{0} / \sqrt{n}\right)=1-c,$$
can be interpreted as specifying a hypothetical long run of statements about $\mu$ a proportion $1-c$ of which are correct. We have observed the value $\bar{y}$ of the random variable $\bar{Y}$ and the statement
$$\mu<\bar{y}+k_{c}^{*} \sigma_{0} / \sqrt{n}$$
is thus one of this long run of statements, a specified proportion of which are correct. In the most direct formulation of this $\mu$ is fixed and the statements vary and this distinguishes the statement from a probability distribution for $\mu$. In fact a similar interpretation holds if the repetitions concern an arbitrary sequence of fixed values of the mean.
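The long-run interpretation can be illustrated by simulation: with $\mu$ held fixed, generate many samples, form the upper limit $\bar{y}+k_{c}^{*} \sigma_{0} / \sqrt{n}$ for each, and record the proportion of statements $\mu<\bar{y}+k_{c}^{*} \sigma_{0} / \sqrt{n}$ that are correct. The sketch below uses only the standard library; the function names, parameter values and number of repetitions are assumptions made for the example.

```python
import random
from statistics import NormalDist, fmean


def upper_limit(ys, sigma0, c):
    """Upper limit ybar + k_c * sigma0 / sqrt(n), with Phi(k_c) = 1 - c."""
    k_c = NormalDist().inv_cdf(1.0 - c)
    return fmean(ys) + k_c * sigma0 / len(ys) ** 0.5


def coverage(mu, sigma0, n, c, n_rep, seed=0):
    """Proportion of repetitions in which the statement mu < upper limit is correct."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        ys = [rng.gauss(mu, sigma0) for _ in range(n)]
        if mu < upper_limit(ys, sigma0, c):
            hits += 1
    return hits / n_rep


# Hypothetical settings: fixed mu, known sigma0, level c = 0.025.
estimated = coverage(mu=10.0, sigma0=2.0, n=15, c=0.025, n_rep=20000)
print(estimated)
```

The estimated proportion should be close to $1-c=0.975$, up to simulation error; here $\mu$ is fixed and it is the statements that vary from repetition to repetition.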

There are a large number of generalizations of this result, many underpinning standard elementary statistical techniques. For instance, if the variance $\sigma^{2}$ is unknown and estimated by $\Sigma\left(y_{k}-\bar{y}\right)^{2} /(n-1)$ in $(1.9)$, then $k_{c}^{*}$ is replaced by the corresponding point in the Student $t$ distribution with $n-1$ degrees of freedom.

There is no need to restrict the analysis to a single level $c$ and provided concordant procedures are used at the different $c$ a formal distribution is built up.
Arguments involving probability only via its (hypothetical) long-run frequency interpretation are called frequentist. That is, we define procedures for assessing evidence that are calibrated by how they would perform were they used repeatedly. In that sense they do not differ from other measuring instruments. We intend, of course, that this long-run behaviour is some assurance that with our particular data currently under analysis sound conclusions are drawn. This raises important issues of ensuring, as far as is feasible, the relevance of the long run to the specific instance.

## Preliminaries

## Starting point

We typically start with a subject-matter question. Data are or become available to address this question. After preliminary screening, checks of data quality and simple tabulations and graphs, more formal analysis starts with a provisional model. The data are typically split in two parts $(y: z)$, where $y$ is regarded as the observed value of a vector random variable $Y$ and $z$ is treated as fixed. Sometimes the components of $y$ are direct measurements of relevant properties on study individuals and sometimes they are themselves the outcome of some preliminary analysis, such as means, measures of variability, regression coefficients and so on. The set of variables $z$ typically specifies aspects of the system under study that are best treated as purely explanatory and whose observed values are not usefully represented by random variables. That is, we are interested solely in the distribution of outcome or response variables conditionally on the variables $z$; a particular example is where $z$ represents treatments in a randomized experiment.
We use throughout the notation that observable random variables are represented by capital letters and observations by the corresponding lower case letters.
A model, or strictly a family of models, specifies the density of $Y$ to be
$$f_{Y}(y: z ; \theta)$$

where $\theta \in \Omega_{\theta}$ is unknown. The distribution may depend also on design features of the study that generated the data. We typically simplify the notation to $f_{Y}(y ; \theta)$, although the explanatory variables $z$ are frequently essential in specific applications.
To choose the model appropriately is crucial to fruitful application.
We follow the very convenient, although deplorable, practice of using the term density both for continuous random variables and for the probability function of discrete random variables. The deplorability comes from the functions being dimensionally different, probabilities per unit of measurement in continuous problems and pure numbers in discrete problems. In line with this convention in what follows integrals are to be interpreted as sums where necessary. Thus we write
$$E(Y)=E(Y ; \theta)=\int y f_{Y}(y ; \theta) d y$$
for the expectation of $Y$, showing the dependence on $\theta$ only when relevant. The integral is interpreted as a sum over the points of support in a purely discrete case. Next, for each aspect of the research question we partition $\theta$ as $(\psi, \lambda)$, where $\psi$ is called the parameter of interest and $\lambda$ is included to complete the specification and commonly called a nuisance parameter. Usually, but not necessarily, $\psi$ and $\lambda$ are variation independent in that $\Omega_{\theta}$ is the Cartesian product $\Omega_{\psi} \times \Omega_{\lambda}$. That is, any value of $\psi$ may occur in connection with any value of $\lambda$. The choice of $\psi$ is a subject-matter question. In many applications it is best to arrange that $\psi$ is a scalar parameter, i.e., to break the research question of interest into simple components corresponding to strongly focused and incisive research questions, but this is not necessary for the theoretical discussion.

## Role of formal theory of inference

The formal theory of inference initially takes the family of models as given and the objective as being to answer questions about the model in the light of the data. Choice of the family of models is, as already remarked, obviously crucial but outside the scope of the present discussion. More than one choice may be needed to answer different questions.

A second and complementary phase of the theory concerns what is sometimes called model criticism, addressing whether the data suggest minor or major modification of the model or in extreme cases whether the whole focus of the analysis should be changed. While model criticism is often done rather informally in practice, it is important for any formal theory of inference that it embraces the issues involved in such checking.

## Some simple models

General notation is often not best suited to special cases and so we use more conventional notation where appropriate.

Example 1.1. The normal mean. Whenever it is required to illustrate some point in simplest form it is almost inevitable to return to the most hackneyed of examples, which is therefore given first. Suppose that $Y_{1}, \ldots, Y_{n}$ are independently normally distributed with unknown mean $\mu$ and known variance $\sigma_{0}^{2}$. Here $\mu$ plays the role of the unknown parameter $\theta$ in the general formulation. In one of many possible generalizations, the variance $\sigma^{2}$ also is unknown. The parameter vector is then $\left(\mu, \sigma^{2}\right)$. The component of interest $\psi$ would often be $\mu$ but could be, for example, $\sigma^{2}$ or $\mu / \sigma$, depending on the focus of subject-matter interest.

Example 1.2. Linear regression. Here the data are $n$ pairs $\left(y_{1}, z_{1}\right), \ldots,\left(y_{n}, z_{n}\right)$ and the model is that $Y_{1}, \ldots, Y_{n}$ are independently normally distributed with variance $\sigma^{2}$ and with
$$E\left(Y_{k}\right)=\alpha+\beta z_{k} .$$
Here typically, but not necessarily, the parameter of interest is $\psi=\beta$ and the nuisance parameter is $\lambda=\left(\alpha, \sigma^{2}\right)$. Other possible parameters of interest include the intercept at $z=0$, namely $\alpha$, and $-\alpha / \beta$, the intercept of the regression line on the $z$-axis.

Example 1.3. Linear regression in semiparametric form. In Example 1.2 replace the assumption of normality by an assumption that the $Y_{k}$ are uncorrelated with constant variance. This is semiparametric in that the systematic part of the variation, the linear dependence on $z_{k}$, is specified parametrically and the random part is specified only via its covariance matrix, leaving the functional form of its distribution open. A complementary form would leave the systematic part of the variation a largely arbitrary function and specify the distribution of error parametrically, possibly of the same normal form as in Example 1.2. This would lead to a discussion of smoothing techniques.

Example 1.4. Linear model. We have an $n \times 1$ vector $Y$ and an $n \times q$ matrix $z$ of fixed constants such that
$$E(Y)=z \beta, \quad \operatorname{cov}(Y)=\sigma^{2} I,$$
where $\beta$ is a $q \times 1$ vector of unknown parameters, $I$ is the $n \times n$ identity matrix and with, in the analogue of Example 1.2, the components independently normally distributed. Here $z$ is, in initial discussion at least, assumed of full rank $q<n$. A relatively simple but important generalization has $\operatorname{cov}(Y)=\sigma^{2} V$, where $V$ is a given positive definite matrix. There is a corresponding semiparametric version generalizing Example 1.3.

Both Examples 1.1 and 1.2 are special cases, in the former the matrix $z$ consisting of a column of 1s.
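The reduction to special cases can be seen by writing out the design matrices. In the sketch below, which assumes NumPy and uses hypothetical data, Example 1.1 corresponds to a single column of 1s, and Example 1.2 is taken to correspond to a column of 1s alongside the values $z_{k}$ (the second form is the natural reading of $E\left(Y_{k}\right)=\alpha+\beta z_{k}$ rather than something stated explicitly above). The helper name `least_squares` is an assumption for illustration.

```python
import numpy as np


def least_squares(z, y):
    """Least squares estimates for the linear model E(Y) = z beta."""
    beta_hat, *_ = np.linalg.lstsq(z, y, rcond=None)
    return beta_hat


y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])  # hypothetical responses
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # hypothetical values of z_k

# Example 1.1: z is a single column of 1s, so beta is the scalar mean mu.
z_mean = np.ones((y.size, 1))
mu_hat = least_squares(z_mean, y)

# Example 1.2: z has a column of 1s and a column of z_k, so beta = (alpha, beta).
z_line = np.column_stack([np.ones(y.size), x])
alpha_hat, beta_hat = least_squares(z_line, y)

print(mu_hat, alpha_hat, beta_hat)
```

With the column of 1s alone, the least squares estimate is the sample mean; with the second column added, it gives the intercept and slope of the fitted line.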

Example 1.5. Normal-theory nonlinear regression. Of the many generalizations of Examples 1.2 and 1.4, one important possibility is that the dependence on the parameters specifying the systematic part of the structure is nonlinear. For example, instead of the linear regression of Example 1.2 we might wish to consider
$$E\left(Y_{k}\right)=\alpha+\beta \exp \left(\gamma z_{k}\right)$$
