## Small Sample Refinement: Saddlepoint Approximations

Saddlepoint approximation is a technique for refined approximation of densities and distribution functions when the normal (or $\chi^{2}$) approximation is too crude. The method is typically used on distributions for

• Sums of random variables
• ML estimators
• Likelihood ratio test statistics

and for approximating structure functions. These approximations often have surprisingly small errors, and in some remarkable cases they yield exact results. We confine ourselves to the basics and consider only density approximation for canonical statistics $t$ (which involves the structure function $g(t)$), and for the MLE in any parameterization of the exponential family.

The saddlepoint approximation for probability densities (continuous distribution case) is due to Daniels (1954). Approximations for distribution functions, which are of particular interest for test statistics (for example the so-called Lugannani-Rice formula for the signed log-likelihood ratio), are a more complicated topic because they require approximating integrals of densities rather than just densities; see for example Jensen (1995).

Suppose we have a regular exponential family, with canonical statistic $\boldsymbol{t}$ (no index $n$ here) and canonical parameter $\boldsymbol{\theta}$. If the family is not regular, we restrict $\boldsymbol{\theta}$ to the interior of $\Theta$. Usually $\boldsymbol{t}=\boldsymbol{t}(\boldsymbol{y})$ has some sort of sum form, even though the number of terms may be small. We shall derive an approximation for the density $f\left(\boldsymbol{t} ; \boldsymbol{\theta}_{0}\right)$ of $\boldsymbol{t}$, for some specified $\boldsymbol{\theta}=\boldsymbol{\theta}_{0}$. One reason for wanting such an approximation in a given exponential family is the presence of the structure function $g(\boldsymbol{t})$ in $f\left(\boldsymbol{t} ; \boldsymbol{\theta}_{0}\right)$, which is the factor, often complicated, by which $f\left(\boldsymbol{t} ; \boldsymbol{\theta}_{0}\right)$ differs from $f\left(\boldsymbol{y} ; \boldsymbol{\theta}_{0}\right)$. Another reason is that we might want the distribution of the ML estimator, and $\boldsymbol{t}=\hat{\boldsymbol{\mu}}_{t}$ is one such MLE.
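For orientation, the approximation that such a derivation leads to can be sketched as follows. This is the standard form of the saddlepoint approximation in an exponential family, written under the assumption that $k$ denotes the dimension of $\boldsymbol{t}$, that $V_{t}(\boldsymbol{\theta})$ is the variance matrix of $\boldsymbol{t}$ under $\boldsymbol{\theta}$, and that $\hat{\boldsymbol{\theta}}(\boldsymbol{t})$ is the ML estimate given $\boldsymbol{t}$. The layout is illustrative and is not a quotation of the numbered formulas referred to in the text.

$$f\left(\boldsymbol{t} ; \boldsymbol{\theta}_{0}\right) \approx \frac{1}{(2 \pi)^{k / 2} \sqrt{\operatorname{det} V_{t}(\hat{\boldsymbol{\theta}}(\boldsymbol{t}))}} \frac{L\left(\boldsymbol{\theta}_{0}\right)}{L(\hat{\boldsymbol{\theta}}(\boldsymbol{t}))}, \qquad L(\boldsymbol{\theta})=\frac{e^{\boldsymbol{\theta}^{\top} \boldsymbol{t}}}{C(\boldsymbol{\theta})}$$

Since the structure function $g(\boldsymbol{t})$ is the same in numerator and denominator of the likelihood ratio, it cancels, which is why the right-hand side can be evaluated without knowing $g(\boldsymbol{t})$.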

## Extensions and further refinements

The same technique can, at least in principle, be used outside exponential families to approximate a density $f(t)$. The idea is to embed $f(t)$ in an exponential family by exponential tilting; see Example 2.12, with $f(t)$ in the role of its $f_{0}(y)$. For practical use, $f$ must have an explicit and manageable Laplace transform, since the norming constant $C(\boldsymbol{\theta})$ will be the Laplace transform of the density $f$, cf. (3.9).
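As a minimal sketch of the tilting idea, consider the sum of $n$ iid Uniform(0, 1) variables, which is not given as an exponential family but has the explicit Laplace transform $C(\theta)^{n}$ with $C(\theta)=(e^{\theta}-1)/\theta$. The example is an assumption chosen for illustration, as are all function names; the saddlepoint equation is solved by plain bisection, and the exact Irwin-Hall density is used only as a reference.

```python
import math

def cgf_uniform_sum(s, n):
    """Return K(s), K'(s), K''(s) for the sum of n iid Uniform(0, 1) variables,
    where K(s) = n * log((e^s - 1) / s) is the log Laplace transform."""
    if abs(s) < 1e-3:
        # Series expansion around s = 0 avoids cancellation in the closed forms.
        return n * (s / 2 + s * s / 24), n * (0.5 + s / 12), n / 12.0
    em1 = math.expm1(s)                                   # e^s - 1
    k0 = n * math.log(em1 / s)
    k1 = n * (1.0 + 1.0 / em1 - 1.0 / s)                  # e^s/(e^s-1) - 1/s
    k2 = n * (1.0 / (s * s) - (em1 + 1.0) / (em1 * em1))  # 1/s^2 - e^s/(e^s-1)^2
    return k0, k1, k2

def saddlepoint_density_uniform_sum(t, n):
    """Saddlepoint approximation to the density of the sum at t.
    The bisection bracket limits this sketch to roughly 0.01 < t/n < 0.99."""
    lo, hi = -100.0, 100.0
    for _ in range(200):                                  # K' is increasing in s
        mid = 0.5 * (lo + hi)
        if cgf_uniform_sum(mid, n)[1] < t:
            lo = mid
        else:
            hi = mid
    s_hat = 0.5 * (lo + hi)                               # solves K'(s) = t
    k0, _, k2 = cgf_uniform_sum(s_hat, n)
    return math.exp(k0 - s_hat * t) / math.sqrt(2.0 * math.pi * k2)

def irwin_hall_density(t, n):
    """Exact density of the sum of n iid Uniform(0, 1) variables, 0 < t < n."""
    total = sum((-1) ** j * math.comb(n, j) * (t - j) ** (n - 1)
                for j in range(int(math.floor(t)) + 1))
    return total / math.factorial(n - 1)

n = 5
for t in (1.0, 2.5, 4.0):
    ratio = saddlepoint_density_uniform_sum(t, n) / irwin_hall_density(t, n)
    print(f"t = {t:3.1f}  approx/exact = {ratio:.4f}")
```

With only five terms the approximation should already be within a few percent of the exact density, and the ratio differs between the centre and the tails, so here the relative error does depend on $t$.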
For iid samples of size $n$ one might guess that the error is expressed by a factor of magnitude $1+O(1 / \sqrt{n})$. It is in fact an order of magnitude better, $1+O(1 / n)$, and this holds uniformly on compact subsets of $\Theta$. The error factor can be further refined by using a more elaborate Edgeworth expansion instead of the simple normal density in the second line of (4.6). An Edgeworth expansion is typically excellent in the centre of the distribution, where it would be used here (for $\boldsymbol{\theta}=\hat{\boldsymbol{\theta}}(t)$). On the other hand, it can be terribly bad in the tails and is generally not advisable for a direct approximation of $f\left(t ; \boldsymbol{\theta}_{0}\right)$. See for example Jensen (1995, Ch. 2) for details and proofs. Such a refinement of the saddlepoint approximation was used by Martin-Löf to prove Boltzmann's theorem, see Chapter 6.

Further refinement is possible, at least in principle, through multiplication by a norming factor such that the density approximation becomes a proper probability density. Note that this factor will typically depend on $\boldsymbol{\theta}_{0}$.

Even if the relative approximation error is often remarkably small, it generally depends on the argument $\boldsymbol{t}$. For the distribution of the sample mean it is constant only for three univariate densities of continuous distributions: the normal density, the gamma density, and the inverse Gaussian density (Exercise 4.6). The representation is exact for the normal (see Exercise 4.4) and for the inverse Gaussian. For the gamma density the error factor is close to 1, representing Stirling's formula approximation of $n!$ (Section B.2.2). Correction by a norming factor, as mentioned in the previous item, would of course eliminate such a constant relative error, but would be pointless in this particular case.
$\Delta$
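The constant error factor in the gamma case can be checked numerically. The sketch below assumes iid exponential observations with a known rate, so that the sum is gamma distributed; the sample mean differs from the sum only by the scale factor $n$, which leaves the relative error unchanged. The function names are hypothetical and the saddlepoint is available in closed form here.

```python
import math

def saddlepoint_density_exp_sum(t, n, rate):
    """Saddlepoint approximation to the density of t = y_1 + ... + y_n
    for y_i iid exponential with the given rate."""
    # Cumulant generating function of the sum: K(s) = -n * log(1 - s / rate), s < rate.
    s_hat = rate - n / t                          # solves K'(s) = n / (rate - s) = t
    k_val = -n * math.log(1.0 - s_hat / rate)
    k_dd = n / (rate - s_hat) ** 2                # K''(s_hat) = t^2 / n
    return math.exp(k_val - s_hat * t) / math.sqrt(2.0 * math.pi * k_dd)

def exact_density_exp_sum(t, n, rate):
    """Exact Gamma(n, rate) density of the sum, computed on the log scale."""
    log_f = n * math.log(rate) + (n - 1) * math.log(t) - rate * t - math.lgamma(n)
    return math.exp(log_f)

def stirling_factor(n):
    """Gamma(n) divided by its Stirling approximation sqrt(2*pi) * n^(n - 1/2) * e^(-n)."""
    log_stirling = 0.5 * math.log(2.0 * math.pi) + (n - 0.5) * math.log(n) - n
    return math.exp(math.lgamma(n) - log_stirling)

n, rate = 5, 2.0
for t in (0.5, 1.0, 2.5, 6.0, 12.0):
    ratio = saddlepoint_density_exp_sum(t, n, rate) / exact_density_exp_sum(t, n, rate)
    print(f"t = {t:5.1f}  approx/exact = {ratio:.6f}")
print("Stirling factor:", round(stirling_factor(n), 6))

# The error factor behaves like 1 + 1/(12 n), in line with the O(1/n) claim.
for m in (5, 50, 500):
    print(m, m * (stirling_factor(m) - 1.0))
```

The ratio should be the same at every $t$ and equal to the Stirling factor, and $n$ times the excess over 1 should settle near $1/12$ as $n$ grows.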

## Saddlepoint approximation for the MLE

The saddlepoint approximation for the density $f\left(\hat{\psi} ; \psi_{0}\right)$ of the ML estimator $\hat{\psi}=\hat{\psi}(t)$ in any smooth parameterization of a regular exponential family is
$$f\left(\hat{\psi} ; \psi_{0}\right) \approx \frac{\sqrt{\operatorname{det} I(\hat{\psi})}}{(2 \pi)^{k / 2}} \frac{L\left(\psi_{0}\right)}{L(\hat{\psi})} \tag{4.9}$$
Here $I(\hat{\psi})$ is the Fisher information for the chosen parameter $\psi$, evaluated at $\hat{\psi}$, and $k$ is the dimension of $\psi$.

Proof. We transform from the density for $t=\hat{\mu}_{t}$ to the density for $\hat{\psi}(t)$. To achieve this we only have to see the effect of a variable transformation on the density. When we replace the variable $\hat{\mu}_{t}$ by $\hat{\psi}$, we must also multiply by the Jacobian determinant for the transformation, which has Jacobian matrix $\left(\frac{\partial \hat{\mu}_{t}}{\partial \hat{\psi}}\right)$. However, this is also precisely what is needed to change the square root of $\operatorname{det} I\left(\mu_{t}\right)=\operatorname{det} V_{t}^{-1}=1 / \operatorname{det} V_{t}$ to take account of the reparameterization; see the Reparameterization lemma, Proposition 3.14. Hence the form of formula (4.9) is invariant under reparameterizations.
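The formula and its invariance can be illustrated on a one-parameter example. The sketch below assumes $n$ iid exponential observations and applies the approximation both to the rate $\lambda$ and to the mean $\mu=1 / \lambda$; the model choice and all function names are assumptions made for illustration. The Fisher information is $n / \lambda^{2}$ in the rate parameterization and $n / \mu^{2}$ in the mean parameterization.

```python
import math

def log_lik(lam, n, t):
    """Log-likelihood of an exponential rate lam given n observations with sum t."""
    return n * math.log(lam) - lam * t

def p_star_rate(lam_hat, n, lam0):
    """sqrt(I(lam_hat) / (2*pi)) * L(lam0) / L(lam_hat) for the rate MLE (k = 1)."""
    t = n / lam_hat                                 # the sum that yields this MLE
    fisher = n / lam_hat ** 2                       # I(lam) evaluated at lam_hat
    log_ratio = log_lik(lam0, n, t) - log_lik(lam_hat, n, t)
    return math.sqrt(fisher / (2.0 * math.pi)) * math.exp(log_ratio)

def p_star_mean(mu_hat, n, mu0):
    """The same formula in the mean parameterization mu = 1 / lam."""
    t = n * mu_hat
    fisher = n / mu_hat ** 2                        # I(mu) evaluated at mu_hat
    log_ratio = log_lik(1.0 / mu0, n, t) - log_lik(1.0 / mu_hat, n, t)
    return math.sqrt(fisher / (2.0 * math.pi)) * math.exp(log_ratio)

def exact_density_rate(lam_hat, n, lam0):
    """Exact density of lam_hat = n / T with T ~ Gamma(n, lam0)."""
    t = n / lam_hat
    log_f_t = n * math.log(lam0) + (n - 1) * math.log(t) - lam0 * t - math.lgamma(n)
    return math.exp(log_f_t) * n / lam_hat ** 2     # Jacobian |dt / d lam_hat|

n, lam0 = 5, 2.0
for lam_hat in (0.8, 2.0, 5.0):
    approx = p_star_rate(lam_hat, n, lam0)
    ratio = approx / exact_density_rate(lam_hat, n, lam0)
    # Transforming the rate approximation to the mean scale uses
    # |d lam_hat / d mu_hat| = lam_hat^2 and should reproduce p_star_mean.
    via_jacobian = approx * lam_hat ** 2
    direct = p_star_mean(1.0 / lam_hat, n, 1.0 / lam0)
    print(f"lam_hat = {lam_hat:3.1f}  approx/exact = {ratio:.6f}  "
          f"mean scale: {via_jacobian:.6f} vs {direct:.6f}")
```

In this model the two mean-scale values should agree, which is the invariance stated in the proof, and the ratio to the exact density should be a constant slightly above 1.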

Proposition 4.7 works much more generally, as shown by Barndorff-Nielsen and others, and the formula is usually named Barndorff-Nielsen's formula, or the $p^{*}$ ('$p$-star') formula; see Reid (1988) and Barndorff-Nielsen and Cox (1994). Pawitan (2001, Sec. 9.8) refers to it as the 'magical formula'. The formula also holds after conditioning on distribution-constant (= ancillary) statistics (see Proposition 7.5 and Remark 7.6), and it holds exactly in transformation models, given the ancillary configuration statistic. It is more complicated to derive outside the exponential families, however, because we cannot then start from (4.7) with the canonical parameter $\boldsymbol{\theta}$ as the parameter of interest.

By analogy with the third item of Remark 4.6 about approximation (4.7), we may here refine (4.9) by replacing the approximate normalization constant $1 /(2 \pi)^{k / 2}$ by the exact one, which makes the density integrate to 1.
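A minimal sketch of this renormalization, assuming the MLE of an exponential rate from $n$ iid observations as the working example: the approximation is integrated numerically over $\hat{\lambda}$ and then divided by the result. The function names, the integration limits and the trapezoidal rule on a logarithmic grid are illustrative choices, not part of the original text.

```python
import math

def p_star_rate(lam_hat, n, lam0):
    """Unnormalized approximation (4.9) for the rate MLE, with I(lam) = n / lam^2."""
    t = n / lam_hat
    log_ratio = n * math.log(lam0 / lam_hat) - (lam0 - lam_hat) * t
    return math.sqrt(n / (2.0 * math.pi)) / lam_hat * math.exp(log_ratio)

def integrate_on_log_grid(func, lower, upper, num_steps=4000):
    """Trapezoidal rule for the integral of func over (lower, upper),
    carried out in the variable u = log(x) so that dx = x du."""
    a, b = math.log(lower), math.log(upper)
    h = (b - a) / num_steps
    total = 0.0
    for i in range(num_steps + 1):
        x = math.exp(a + i * h)
        weight = 0.5 if i in (0, num_steps) else 1.0
        total += weight * func(x) * x
    return total * h

def exact_density_rate(lam_hat, n, lam0):
    """Exact density of lam_hat = n / T with T ~ Gamma(n, lam0), for comparison."""
    t = n / lam_hat
    log_f_t = n * math.log(lam0) + (n - 1) * math.log(t) - lam0 * t - math.lgamma(n)
    return math.exp(log_f_t) * n / lam_hat ** 2

n, lam0 = 5, 1.0
# Limits chosen so that the neglected tail mass is negligible for these n and lam0.
norm_const = integrate_on_log_grid(lambda x: p_star_rate(x, n, lam0), 0.02, 500.0)
print("integral of the approximation:", norm_const)

lam_hat = 0.7
renormalized = p_star_rate(lam_hat, n, lam0) / norm_const
print("renormalized:", renormalized, " exact:", exact_density_rate(lam_hat, n, lam0))
```

Because the relative error is constant in this particular model, dividing by the integral should recover the exact density; in models where the error varies with the argument, renormalization would only be expected to reduce it.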

