## The Steepness Criterion

For the result of Proposition 3.13 to hold, it is not strictly necessary that the exponential family is regular. A necessary and sufficient condition for Proposition 3.13 to hold is that the model is steep (Barndorff-Nielsen). Steepness is a requirement on the log-likelihood, or equivalently on $\log C(\theta)$, that it must have infinite derivatives at all boundary points belonging to the canonical parameter space $\boldsymbol{\Theta}$. Thus, the gradient $\boldsymbol{\mu}_{t}(\boldsymbol{\theta})$ does not exist at boundary points. In other words, for a steep family there cannot be an observation $\boldsymbol{t}$ satisfying $\boldsymbol{t}=\boldsymbol{\mu}_{t}(\boldsymbol{\theta})$ at a boundary point $\boldsymbol{\theta}$. Conversely, suppose the family is not steep, and furthermore suppose $\operatorname{dim} \theta=1$, for simplicity of argument. Since the family is not steep, there is a boundary point where $\mu_{t}(\theta)$ is finite. Then, there must be possible $t$-values on both sides of this $\mu_{t}$ (otherwise we would have a one-point distribution at that point). Thus there are possible $t$-values outside the boundary value of $\mu_{t}$. This is inconsistent with the conclusion of Proposition 3.13, so steepness is required.

Note that regular families have no boundary points in $\Theta$, so for them the steepness requirement is vacuous. A precise definition of steepness must involve derivatives along different directions within $\boldsymbol{\Theta}$ towards a boundary point of $\Theta$, but we abstain from formulating a more precise version.

Examples of models that are steep but not regular are rare in applications. A simple theoretical construction is based on the $h$-function in (3.4): take a linear (i.e. $t(y)=y$) exponential family with $h(y)=1 / y^{2}$ for $y>1$. The family is not regular, because the norming constant $C(\theta)$ is finite for $\theta$ in the closed half-line $\theta \leq 0$, but it is steep: the left derivative of $C(\theta)$ (hence also of $\log C(\theta)$) at zero is infinite, because $1 / y$ is not integrable on $(1, \infty)$. If $h(y)$ is changed to $h(y)=1 / y^{3}$, the derivative of $C(\theta)$ approaches a finite constant as $\theta$ approaches 0, since $1 / y^{2}$ is integrable. Hence, $h(y)=1 / y^{3}$ generates a nonregular linear exponential family that is not even steep. The inverse Gaussian distribution is a steep model of some practical interest, see Exercise 3.4. The Strauss model for spatial variation, Section 13.1.2, has also been of applied concern; it is a nonregular model that is not even steep.
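
The contrast between $h(y)=1/y^{2}$ and $h(y)=1/y^{3}$ can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `simpson`, `tilted_integral` and `mean_value` are hypothetical, and the quadrature (substitution $y=e^{u}$ plus a truncated composite Simpson rule) is just one convenient choice. It evaluates $\mu_{t}(\theta)=C'(\theta)/C(\theta)$ as $\theta$ approaches the boundary point 0 from below.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def tilted_integral(power, theta):
    """Integral of y**(-power) * exp(theta * y) over y > 1, for theta <= 0.

    With y = exp(u) the integrand becomes exp((1 - power) * u + theta * exp(u)).
    """
    if theta > 0 or (theta == 0 and power <= 1):
        raise ValueError("integral diverges")
    # Truncate the u-range where the integrand has decayed to about exp(-50).
    u_max = math.log(1 + 50 / -theta) if theta < 0 else 50 / (power - 1)
    return simpson(
        lambda u: math.exp((1 - power) * u + theta * math.exp(u)), 0.0, u_max
    )

def mean_value(k, theta):
    """mu_t(theta) = C'(theta) / C(theta) for h(y) = y**(-k) on y > 1."""
    return tilted_integral(k - 1, theta) / tilted_integral(k, theta)

if __name__ == "__main__":
    for theta in (-1e-2, -1e-4, -1e-6):
        print(theta, mean_value(2, theta), mean_value(3, theta))
```

Under these assumptions the mean for $h(y)=1/y^{2}$ keeps growing (roughly like $\log(1/|\theta|)$) as $\theta \to 0$, whereas the mean for $h(y)=1/y^{3}$ stays bounded, which is the steep versus non-steep distinction described above.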

For models that are steep but not regular, boundary points of $\Theta$ do not satisfy the same regularity conditions as the interior points, but for Proposition 3.11 to hold we need only exclude possible boundary points. In Barndorff-Nielsen and Cox (1994), such families, that is, steep families with canonical parameter space restricted to the interior of the maximal $\Theta$, are called prime exponential models. Note that for outcomes of $\hat{\theta}(t)$ in Proposition 3.13, it is the interior of $\Theta$ that is in one-to-one correspondence with the interior of the convex hull of the possible $t$-values.

## Nonregular family, not even steep

Exponential tilting (Example 2.12) applied to a Pareto type distribution yields an exponential family with density
$$f(y ; \theta) \propto y^{-a-1} e^{\theta y}, \quad y>1,$$
where $a>1$ is assumed given, and the Pareto distribution is
$$f(y ; 0)=a y^{-a-1}, \quad y>1 .$$
(a) Show that the canonical parameter space is the closed half-line, $\theta \leq 0$.
(b) $E_{\theta}(t)=\mu_{t}(\theta)$ is not explicit. Nevertheless, show that its maximum must be $a /(a-1)>1$.
(c) Thus conclude that the likelihood equation for a single observation $y$ has no root if $y>a /(a-1)$. However, note that the likelihood actually has a maximum in $\Theta$ for any such $y$, namely $\hat{\theta}=0$.
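
The reasoning behind (c) can be outlined as follows. This is a sketch rather than a full solution; it uses only the standard facts that the score of a canonical family is $t-\mu_{t}(\theta)$ and that $\mu_{t}(\theta)$ is increasing in $\theta$ (its derivative is the variance of $t$), together with the value $\mu_{t}(0)=a/(a-1)$ from (b). For a single observation $y>a/(a-1)$ and any $\theta<0$,

$$\frac{d}{d \theta}\{\theta y-\log C(\theta)\}=y-\mu_{t}(\theta)>y-\mu_{t}(0)=y-\frac{a}{a-1}>0,$$

so the log-likelihood is strictly increasing on $\theta \leq 0$: it has no stationary point, and its maximum over $\Theta$ is attained at the boundary point $\hat{\theta}=0$.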

## Exercise 3.6 Legendre transform

The Legendre transform (or convex conjugate) of $\log C(\theta)$ has a role in understanding convexity-related properties of canonical exponential families, see Barndorff-Nielsen (1978). For a convex function $f(x)$ the transform $f^{\star}\left(x^{\star}\right)$ is defined by $f^{\star}\left(x^{\star}\right)=x^{T} x^{\star}-f(x)$, for the maximizing $x$, which must (of course) satisfy $D f(x)=x^{\star}$.
As technical training, show for $f(\theta)=\log C(\theta)$ that
(a) the transform is $f^{\star}\left(\mu_{t}\right)=\theta^{T} \mu_{t}-\log C(\theta)$, where $\theta=\hat{\theta}\left(\mu_{t}\right)$;
(b) $D f^{\star}\left(\mu_{t}\right)=\hat{\theta}\left(\mu_{t}\right)$ and $D^{2} f^{\star}\left(\mu_{t}\right)=V_{t}\left(\hat{\theta}\left(\mu_{t}\right)\right)^{-1}$;
(c) The Legendre transform of $f^{\star}\left(\mu_{t}\right)$ brings back $\log C\left(\hat{\theta}\left(\mu_{t}\right)\right)=\log C(\theta)$.
$\Delta$
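
The three statements can be checked numerically on a concrete family. The sketch below assumes the Poisson family, for which $\log C(\theta)=e^{\theta}$ and $\mu_{t}(\theta)=V_{t}(\theta)=e^{\theta}$. This choice is an illustration and is not taken from the exercise, and the helper names (`theta_hat`, `legendre`, `d1`, `d2`) are hypothetical. Derivatives of the transform are approximated by central differences.

```python
import math

def log_c(theta):
    # Illustrative assumption: Poisson family, log C(theta) = exp(theta).
    return math.exp(theta)

def theta_hat(mu, iters=200):
    """Solve mu_t(theta) = exp(theta) = mu by Newton's method."""
    theta = 0.0
    for _ in range(iters):
        step = (mu - math.exp(theta)) / math.exp(theta)  # score / information
        theta += step
        if abs(step) < 1e-14:
            break
    return theta

def legendre(mu):
    """f*(mu) = theta * mu - log C(theta), evaluated at theta = theta_hat(mu)."""
    th = theta_hat(mu)
    return th * mu - log_c(th)

def d1(f, x, h=1e-3):
    """Central-difference approximation of the first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-3):
    """Central-difference approximation of the second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

if __name__ == "__main__":
    mu = 3.0
    th = theta_hat(mu)
    print("(a) f*(mu)       :", legendre(mu))
    print("(b) D f*, theta  :", d1(legendre, mu), th)
    print("(b) D^2 f*, 1/V  :", d2(legendre, mu), 1 / math.exp(th))
    print("(c) back, log C  :", th * mu - legendre(mu), log_c(th))
```

For the Poisson case the transform has the closed form $\mu \log \mu-\mu$, so each printed pair should agree up to the finite-difference error.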

## Alternative Parameterizations

The canonical parameterization yields some useful and elegant results, as shown in Section 3.2. However, sometimes other parameterizations are theoretically or intuitively more convenient, or of particular interest in an application. Before we introduce some such alternatives to the canonical parameterization, it is useful to have general formulas for how score vectors and information matrices are changed when we make a transformation of the parameters. After this ‘Reparameterization lemma’, we introduce two alternative types of parameterization of full exponential families: the mean value parameterization and the mixed parameterization. These parameters are often at least partially more ‘natural’ than the canonical parameters, but this is not the main reason for their importance.


