## Exponential family

We now consider a family of models that are important both because they have relatively simple properties and because they capture in one discussion many, although by no means all, of the inferential properties connected with important standard distributions: the binomial, Poisson and geometric distributions, the normal and gamma distributions, and others too.

Suppose that $\theta$ takes values in a well-behaved region in $\mathbb{R}^{d}$, and not of lower dimension, and that we can find a $(d \times 1)$-dimensional statistic $s$ and a parameterization $\phi$, i.e., a one-to-one transformation of $\theta$, such that the model has the form
$$m(y) \exp \left\{s^{T} \phi-k(\phi)\right\},$$
where $s=s(y)$ is a function of the data. Then $S$ is sufficient; subject to some important regularity conditions this is called a regular or full $(d, d)$ exponential family of distributions. The statistic $S$ is called the canonical statistic and $\phi$ the canonical parameter. The parameter $\eta=E(S ; \phi)$ is called the mean parameter. Because the stated function defines a distribution, it follows that
$$\begin{aligned} &\int m(y) \exp \left\{s^{T} \phi-k(\phi)\right\} d y=1, \\ &\int m(y) \exp \left\{s^{T}(\phi+p)-k(\phi+p)\right\} d y=1 . \end{aligned}$$
Hence, noting that $s^{T} p=p^{T} s$, we have from (2.11) that the moment generating function of $S$ is
$$E\left\{\exp \left(p^{T} S\right)\right\}=\exp \{k(\phi+p)-k(\phi)\}.$$
Therefore the cumulant generating function of $S$, defined as the log of the moment generating function, is
$$k(\phi+p)-k(\phi),$$
providing a fairly direct interpretation of the function $k(\cdot)$. Because the mean is given by the first derivatives of that generating function, we have that $\eta=\nabla k(\phi)$, where $\nabla$ is the gradient operator $\left(\partial / \partial \phi_{1}, \ldots, \partial / \partial \phi_{d}\right)^{T}$. See Note 2.3 for a brief account of both cumulant generating functions and the $\nabla$ notation.
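
The moment generating function identity and the relation $\eta=\nabla k(\phi)$ can be checked numerically. The sketch below uses the Poisson distribution, one of the standard families mentioned above, written with $m(y)=1/y!$, $\phi=\log \mu$ and $k(\phi)=e^{\phi}$. The function names, the parameter values and the truncation of the sum at 150 terms are illustrative assumptions, not part of the text.

```python
import math

def k(phi):
    """Poisson family: k(phi) = exp(phi), where phi = log(mean)."""
    return math.exp(phi)

def density(y, phi):
    """m(y) * exp{y*phi - k(phi)} with m(y) = 1/y!, computed on the log scale."""
    return math.exp(y * phi - k(phi) - math.lgamma(y + 1))

def mgf_by_summation(p, phi, terms=150):
    """E{exp(p S)} by direct summation over y = 0, 1, ..., terms - 1 (truncated)."""
    return sum(math.exp(p * y) * density(y, phi) for y in range(terms))

def mgf_from_k(p, phi):
    """The same quantity from the identity exp{k(phi + p) - k(phi)}."""
    return math.exp(k(phi + p) - k(phi))

# Illustrative values (assumptions): mean 2, so phi = log 2, and p = 0.3.
phi, p = math.log(2.0), 0.3
print(mgf_by_summation(p, phi), mgf_from_k(p, phi))

# Mean parameter eta = dk/dphi, approximated by a central difference.
h = 1e-6
eta = (k(phi + h) - k(phi - h)) / (2 * h)
print(eta)
```

The two printed moment generating function values should agree to rounding error, and the finite-difference derivative of $k$ should be close to the Poisson mean $e^{\phi}$.
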
Example 2.5. Binomial distribution. If $R$ denotes the number of successes in $n$ independent binary trials each with probability of success $\pi$, its density can be written
$$\frac{n !}{r !(n-r) !} \pi^{r}(1-\pi)^{n-r}=m(r) \exp \left\{r \phi-n \log \left(1+e^{\phi}\right)\right\},$$
say, where $\phi=\log \{\pi /(1-\pi)\}$, often called the log odds, is the canonical parameter and $r$ the canonical statistic. Note that the mean parameter is $E(R)=n \pi$ and can be recovered also by differentiating $k(\phi)$.
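
As a quick check of Example 2.5, the sketch below compares the direct binomial probabilities with the canonical form, taking $m(r)=n!/\{r!(n-r)!\}$ and $k(\phi)=n \log (1+e^{\phi})$, and recovers $n \pi$ by numerically differentiating $k$. The values $n=10$ and $\pi=0.3$ and the function names are assumptions chosen for illustration.

```python
import math

def k(phi, n):
    """k(phi) = n * log(1 + e^phi) for the binomial family."""
    return n * math.log1p(math.exp(phi))

def density_canonical(r, n, phi):
    """m(r) * exp{r*phi - k(phi)} with m(r) the binomial coefficient."""
    return math.comb(n, r) * math.exp(r * phi - k(phi, n))

def density_direct(r, n, pi):
    """The usual binomial probability of r successes in n trials."""
    return math.comb(n, r) * pi ** r * (1 - pi) ** (n - r)

# Illustrative values (assumptions, not from the text).
n, pi = 10, 0.3
phi = math.log(pi / (1 - pi))  # canonical parameter: the log odds

# Largest discrepancy between the two forms over r = 0, ..., n.
max_gap = max(
    abs(density_canonical(r, n, phi) - density_direct(r, n, pi))
    for r in range(n + 1)
)
print(max_gap)

# Mean parameter E(R) = dk/dphi, approximated by a central difference.
h = 1e-6
eta = (k(phi + h, n) - k(phi - h, n)) / (2 * h)
print(eta, n * pi)
```
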

## Choice of priors for exponential family problems

While in Bayesian theory choice of prior is in principle not an issue of achieving mathematical simplicity, nevertheless there are gains in using reasonably simple and flexible forms. In particular, if the likelihood has the full exponential family form
$$m(y) \exp \left\{s^{T} \phi-k(\phi)\right\},$$
a prior for $\phi$ proportional to
$$\exp \left\{s_{0}^{T} \phi-a_{0} k(\phi)\right\}$$
leads to a posterior proportional to
$$\exp \left\{\left(s+s_{0}\right)^{T} \phi-\left(1+a_{0}\right) k(\phi)\right\}.$$
Such a prior is called conjugate to the likelihood, or sometimes closed under sampling. The posterior distribution has the same form as the prior with $s_{0}$ replaced by $s+s_{0}$ and $a_{0}$ replaced by $1+a_{0}$.

Example 2.8. Binomial distribution (ctd). This continues the discussion of Example 2.5. If the prior for $\pi$ is proportional to
$$\pi^{r_{0}}(1-\pi)^{n_{0}-r_{0}}$$
i.e., is a beta distribution, then the posterior is another beta distribution corresponding to $r+r_{0}$ successes in $n+n_{0}$ trials. Thus both prior and posterior are beta distributions. It may help to think of $\left(r_{0}, n_{0}\right)$ as fictitious data! If the prior information corresponds to fairly small values of $n_{0}$ its effect on the conclusions will be small if the amount of real data is appreciable.
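
The "fictitious data" reading of Example 2.8 can be sketched as follows. A prior proportional to $\pi^{r_{0}}(1-\pi)^{n_{0}-r_{0}}$ is a beta distribution with shape parameters $(r_{0}+1, n_{0}-r_{0}+1)$, and updating amounts to adding the observed counts to the fictitious ones. The function names, the numerical values and the midpoint-rule check are assumptions made for illustration only.

```python
def update(r0, n0, r, n):
    """Conjugate update: add observed (r, n) to the fictitious data (r0, n0)."""
    return r0 + r, n0 + n

def beta_shape(r_total, n_total):
    """Shape parameters of the beta density proportional to pi^r (1 - pi)^(n - r)."""
    return r_total + 1, n_total - r_total + 1

def beta_mean(a, b):
    """Mean of a beta distribution with shape parameters (a, b)."""
    return a / (a + b)

def posterior_mean_numeric(r0, n0, r, n, grid=20000):
    """Posterior mean of pi by midpoint-rule integration of prior times likelihood."""
    num = den = 0.0
    for i in range(grid):
        x = (i + 0.5) / grid
        w = x ** (r + r0) * (1 - x) ** (n + n0 - r - r0)
        num += x * w
        den += w
    return num / den

# Illustrative values (assumptions): prior worth 2 successes in 4 trials,
# then 7 successes observed in 20 real trials.
r0, n0, r, n = 2, 4, 7, 20
r_post, n_post = update(r0, n0, r, n)
a, b = beta_shape(r_post, n_post)
print(r_post, n_post, beta_mean(a, b), posterior_mean_numeric(r0, n0, r, n))
```

The closed-form mean from the updated counts and the numerically integrated posterior mean should agree closely, which illustrates why the conjugate form is convenient.
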

## Simple frequentist discussion

In Bayesian approaches sufficiency arises as a convenient simplification of the likelihood; whatever the prior the posterior is formed from the likelihood and hence depends on the data only via the sufficient statistic.

In frequentist approaches the issue is more complicated. Faced with a new model as the basis for analysis, we look for a Fisherian reduction, defined as follows:

• find the likelihood function;
• reduce to a sufficient statistic $S$ of the same dimension as $\theta$;
• find a function of $S$ that has a distribution depending only on $\psi$;
• invert that distribution to obtain limits for $\psi$ at an arbitrary set of probability levels;
• use the conditional distribution of the data given $S=s$ informally or formally to assess the adequacy of the formulation.
Immediate application is largely confined to regular exponential family models. While most of our discussion will centre on inference about the parameter of interest, $\psi$, the complementary role of sufficiency of providing an explicit base for model checking is in principle very important. It recognizes that our formulations are always to some extent provisional, and usually capable to some extent of empirical check; the universe of discussion is not closed. In general there is no specification of what to do if the initial formulation is inadequate but, while that might sometimes be clear in broad outline, it seems both in practice and in principle unwise to expect such a specification to be set out in detail in each application.

The next phase of the analysis is to determine how to use $s$ to answer more focused questions, for example about the parameter of interest $\psi$. The simplest possibility is that there are no nuisance parameters, just a single parameter $\psi$ of interest, and reduction to a single component $s$ occurs. We then have one observation on a known density $f_{S}(s ; \psi)$ and distribution function $F_{S}(s ; \psi)$. Subject to some monotonicity conditions which, in applications, are typically satisfied, the probability statement
$$P\left(S \leq a_{c}(\psi)\right)=F_{S}\left(a_{c}(\psi) ; \psi\right)=1-c$$
can be inverted for continuous random variables into
$$P\left\{\psi \leq b_{c}(S)\right\}=1-c .$$
Thus the statement on the basis of data $y$, yielding sufficient statistic $s$, that
$$\psi \leq b_{c}(s)$$
provides an upper bound for $\psi$, that is, a single member of a hypothetical long run of statements a proportion $1-c$ of which are true, generating a set of statements in principle at all values of $c$ in $(0,1)$.
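
The long-run interpretation of the upper limit can be illustrated by simulation. The sketch below assumes a specific model that is not discussed in the text: $n$ independent normal observations with unknown mean $\psi$ and known standard deviation $\sigma$, so that the sample mean $S$ is sufficient and $b_{c}(s)=s+k_{c} \sigma / \sqrt{n}$ with $\Phi(k_{c})=1-c$. The function names, parameter values and seed are illustrative.

```python
import random
from statistics import NormalDist

def upper_limit(s, sigma, n, c):
    """b_c(s) = s + z * sigma / sqrt(n), where Phi(z) = 1 - c."""
    z = NormalDist().inv_cdf(1 - c)
    return s + z * sigma / n ** 0.5

def coverage(psi, sigma, n, c, reps=20000, seed=0):
    """Proportion of simulated statements psi <= b_c(S) that are true."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        s = rng.gauss(psi, se)  # draw the sample mean from its sampling distribution
        hits += psi <= upper_limit(s, sigma, n, c)
    return hits / reps

# Illustrative values (assumptions): psi = 1, sigma = 2, n = 25, c = 0.05.
print(upper_limit(1.2, 2.0, 25, 0.05))
print(coverage(1.0, 2.0, 25, 0.05))
```

Under these assumptions the simulated proportion should be close to $1-c$, up to Monte Carlo error.
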


