Statistics Writing Service | Statistical Inference Assignment and Exam Help | Theories of Estimation

ELEMENTS OF POINT ESTIMATION

Essentially there are three stages of sophistication with regard to estimation of a parameter:

  1. At the lowest level: a simple point estimate;
  2. At a higher level: a point estimate along with some indication of the error of that estimate;
  3. At the highest level: one conceives of estimating in terms of a “distribution” or probability of some sort over the potential values that can occur.

This entails specifying some set of values, presumably more restrictive than the entire set of values that the parameter can take on: for instance an interval or region, or the relative plausibilities of those values.

Consider estimating the I.Q. of University of Minnesota freshmen by taking a random sample of them. We could be satisfied with the sample average as reflective of that population. More insight, however, may be gained by considering the variability of scores, by estimating a variance. Finally, one might decide that a highly likely interval for the average of the entire freshman population would be more informative.
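The three levels can be sketched on made-up numbers. The following is a minimal illustration, not part of the original text: the scores are simulated (the mean of 110 and standard deviation of 15 are arbitrary assumptions), the helper name `summarize` is hypothetical, and the interval uses a normal approximation with `z = 1.96`, which is only one of several ways to form such an interval.

```python
import math
import random
import statistics

def summarize(scores, z=1.96):
    """Return (mean, standard error, approximate interval) for a sample.

    The interval is mean +/- z * s.e., a normal-approximation assumption.
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    # statistics.variance uses the n - 1 divisor (the sample variance s^2).
    se = math.sqrt(statistics.variance(scores) / n)
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical data: 200 simulated scores, not real measurements.
rng = random.Random(0)
scores = [rng.gauss(110, 15) for _ in range(200)]

mean, se, (lo, hi) = summarize(scores)
print(f"level 1, point estimate:      {mean:.1f}")
print(f"level 2, estimate with error: {mean:.1f} (s.e. {se:.2f})")
print(f"level 3, interval:            ({lo:.1f}, {hi:.1f})")
```

Each printed line adds information to the previous one: the second says how far the first might be off, and the third turns that into a range of plausible values for the population average.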

Sometimes a point estimate is about all you can do, for example when representing distances on a map. At present there is really no way of reliably illustrating a standard error on a map, so a point estimate will suffice.

An “estimate” is a more or less reasonable guess at the true value of a magnitude or parameter, or even of a potential observation, and we are not necessarily interested in the consequences of estimation. We may be concerned only with what we should believe a true value to be, rather than with what action to take or what the consequences of this belief are. At this point we separate estimation theory from decision theory, though in many instances this is not the case.

POINT ESTIMATION

  1. An estimate might be considered “good” if it is in fact close to the true value on average or in the long run (pre-trial).
  2. An estimate might be considered “good” if the data give good reason to believe the estimate will be close to the true value (post-trial).

A system of estimation will be called an estimator. Correspondingly:

  1. Choose estimators which on average or very often yield estimates which are close to the true value.
  2. Choose an estimator for which the data give good reason to believe it will be close to the true value, that is, a well-supported estimate (one that is suitable after the trials are made).

With regard to the first type of estimator, we do not (theoretically) reject one if it gives a poor result (differs greatly from the true value) in a particular case, though you would be foolish not to. We would reject an estimation procedure only if it gives bad results on average or in the long run. The merit of an estimator is judged, in general, by the distribution of estimates it gives rise to, that is, by the properties of its sampling distribution. One property sometimes stressed is unbiasedness. If $T(D)$ is the estimator of $\theta$, then unbiasedness requires
$$ E[T(D)]=\theta . $$
For example, an unbiased estimator of a population variance $\sigma^{2}$ is
$$ (n-1)^{-1} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}=s^{2}, $$
since $E\left(s^{2}\right)=\sigma^{2}$.
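The claim $E(s^{2})=\sigma^{2}$ can be checked exactly on a toy population. The sketch below is an illustration under stated assumptions, not something from the original text: the population $\{0,1,2\}$ with equal probabilities and the sample size $n=3$ are arbitrary choices, and `expected_variance_estimate` is a hypothetical helper. It enumerates every possible i.i.d. sample and averages the estimate, once with divisor $n-1$ and once with divisor $n$.

```python
from fractions import Fraction
from itertools import product

def expected_variance_estimate(values, n, divisor_offset):
    """Exact expectation of sum((x - xbar)^2) / (n - divisor_offset),
    taken over all equally likely i.i.d. samples of size n from `values`."""
    samples = list(product(values, repeat=n))
    total = Fraction(0)
    for sample in samples:
        xbar = Fraction(sum(sample), n)
        sum_sq = sum((Fraction(x) - xbar) ** 2 for x in sample)
        total += sum_sq / (n - divisor_offset)
    return total / len(samples)

# Assumed toy population: each value equally likely.
values = [0, 1, 2]
mu = Fraction(sum(values), len(values))
sigma2 = sum((Fraction(v) - mu) ** 2 for v in values) / len(values)

print("population variance:   ", sigma2)
print("E[estimate], divisor n-1:", expected_variance_estimate(values, 3, 1))
print("E[estimate], divisor n:  ", expected_variance_estimate(values, 3, 0))
```

With the $n-1$ divisor the expectation equals $\sigma^{2}$ exactly; with the $n$ divisor it equals $\frac{n-1}{n}\sigma^{2}$, which is why the latter is biased downward.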
Suppose $Y_{1}, Y_{2}, \ldots$ are i.i.d. Bernoulli random variables with $P\left(Y_{i}=1\right)=\theta$, and we sample until the first “one” comes up, so that the probability that the first one appears after $X=x$ zeroes is
$$ P(X=x \mid \theta)=\theta(1-\theta)^{x}, \quad x=0,1, \ldots, \quad 0<\theta<1 . $$
Seeking an unbiased estimator $T(Y)=t(X)$, we have
$$ \theta=E(T(Y))=\sum_{x=0}^{\infty} t(x) \theta(1-\theta)^{x}=t(0) \theta+t(1) \theta(1-\theta)+\cdots . $$
Since this must hold for every $\theta$ in $(0,1)$, equating the coefficients of the powers of $1-\theta$ yields the unique solution $t(0)=1$, $t(x)=0$ for $x \geq 1$. This is flawed because this unique estimator takes only the values 0 and 1, and so always lies outside the range $0<\theta<1$ of $\theta$. So unbiasedness alone can be a very poor guide.

Prior to unbiasedness we should have consistency (which is an asymptotic type of unbiasedness, but considerably more). Another desideratum that many prefer is invariance of the estimation procedure. Unbiasedness is not invariant: if $E(X)=\theta$, then for $g(X)$ a smooth function of $X$, in general $E(g(X)) \neq g(\theta)$ unless $g(\cdot)$ is linear in $X$.

Definitions of classical and Fisher consistency follow.

Consistency: An estimator $T_{n}$ computed from a sample of size $n$ is said to be a consistent estimator of $\theta$ if for any arbitrary $\epsilon>0$ and $\delta>0$ there is some value, $N$, such that
$$ P\left[\left|T_{n}-\theta\right|<\epsilon\right]>1-\delta \quad \text { for all } n>N, $$
that is, $T_{n}$ converges in probability to $\theta$.
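To make the consistency definition concrete, one can compute $P[|T_{n}-\theta|<\epsilon]$ for a specific estimator and watch it approach 1 as $n$ grows. The sketch below is an illustration, not part of the original text: it assumes $T_{n}$ is the sample proportion of $n$ i.i.d. Bernoulli($\theta$) trials, so that $nT_{n}$ is binomial, and the values $\theta=0.3$ and $\epsilon=0.05$ are arbitrary. The function name `prob_within` is hypothetical.

```python
import math

def prob_within(n, theta, eps):
    """Exact P(|K/n - theta| < eps) for K ~ Binomial(n, theta), 0 < theta < 1."""
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - theta) < eps:
            # Binomial log-probability via lgamma, to avoid overflow for large n.
            log_p = (math.lgamma(n + 1) - math.lgamma(k + 1)
                     - math.lgamma(n - k + 1)
                     + k * math.log(theta) + (n - k) * math.log(1 - theta))
            total += math.exp(log_p)
    return total

# Assumed illustrative values for theta and epsilon.
theta, eps = 0.3, 0.05
for n in (10, 100, 1000, 10000):
    print(n, round(prob_within(n, theta, eps), 4))
```

For any chosen $\delta$, the printed probabilities eventually exceed $1-\delta$, which is what the definition asks for. Because the binomial is discrete, the probability need not increase monotonically in $n$, so the table indicates a suitable $N$ rather than proving one.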

Fisher’s Definition of Consistency for i.i.d. Random Variables

“A function of the observed frequencies which takes on the exact parametric value when for those frequencies their expectations are substituted.”

For a discrete random variable with $P\left(X=x_{j} \mid \theta\right)=p_{j}(\theta)$, let $T_{n}$ be a function of the observed frequencies $n_{j}$, whose expectations are $E\left(n_{j}\right)=n p_{j}(\theta)$. Then the linear function of the frequencies $T_{n}=\frac{1}{n} \sum_{j} c_{j} n_{j}$ will assume the value
$$ \tau(\theta)=\sum_{j} c_{j} p_{j}(\theta) $$
when $n p_{j}(\theta)$ is substituted for $n_{j}$, and thus $T_{n}$ is a consistent estimator of $\tau(\theta)$ in Fisher's sense.
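The substitution can be carried out mechanically. The sketch below is an illustration only: the three-cell model with $p(\theta)=\left((1-\theta)^{2},\, 2\theta(1-\theta),\, \theta^{2}\right)$ and coefficients $c=(0, \tfrac{1}{2}, 1)$ is an assumed example (not taken from the original text), chosen so that $\tau(\theta)=\theta$. The helper names `cell_probs` and `linear_statistic` are hypothetical.

```python
from fractions import Fraction

def cell_probs(theta):
    """Assumed example model: three cells with these probabilities."""
    return [(1 - theta) ** 2, 2 * theta * (1 - theta), theta ** 2]

def linear_statistic(counts, coeffs):
    """T_n = (1/n) * sum_j c_j * n_j, with n = sum_j n_j."""
    n = sum(counts)
    return sum(c * k for c, k in zip(coeffs, counts)) / n

coeffs = [Fraction(0), Fraction(1, 2), Fraction(1)]
theta = Fraction(3, 10)
n = 50

# Fisher's check: replace each observed frequency n_j by its expectation n * p_j(theta).
expected_counts = [n * p for p in cell_probs(theta)]
print(linear_statistic(expected_counts, coeffs), "should equal", theta)
```

Exact rational arithmetic is used so that the equality $T_{n}=\tau(\theta)$ holds exactly rather than up to rounding. The check does not depend on $n$, since $n$ cancels once expectations are substituted.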
Another way of looking at this is as follows. Let
$$ F_{n}(x)=\frac{1}{n} \times(\text {number of observations } \leq x)=\frac{i}{n} \quad \text { for } x_{(i)} \leq x<x_{(i+1)}, $$
where $x_{(j)}$ is the $j$th smallest observation. If $T_{n}=g\left(F_{n}(x)\right)$ and $g(F(x \mid \theta))=\tau(\theta)$, then $T_{n}$ is Fisher consistent for $\tau(\theta)$. Note that if
$$ T_{n}=\int x \, d F_{n}(x)=\bar{x}_{n} $$
and if
$$ g(F)=\int x \, d F(x)=\mu , $$
then $\bar{x}_{n}$ is Fisher consistent for $\mu$.

On the other hand, if $T_{n}=\bar{x}_{n}+\frac{1}{n}$, then this is not Fisher consistent but is consistent in the ordinary sense. Fisher consistency is only defined for i.i.d. $X_{1}, \ldots, X_{n}$.
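The contrast between the two estimators can be seen by applying Fisher's substitution to each. The following is a minimal sketch under assumptions: the three-point distribution is an arbitrary example, and the function names `mean_stat` and `shifted_mean_stat` are hypothetical. Observed frequencies are replaced by their expectations $n p_{j}$, and the result is compared with $\mu$.

```python
from fractions import Fraction

def mean_stat(counts):
    """Sample mean written as a function of frequencies {value: count}."""
    n = sum(counts.values())
    return Fraction(sum(x * k for x, k in counts.items())) / n

def shifted_mean_stat(counts):
    """Sample mean plus 1/n; it depends on n beyond the relative frequencies."""
    n = sum(counts.values())
    return mean_stat(counts) + Fraction(1) / n

# Assumed example distribution on the values 0, 1, 2.
probs = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
mu = sum(x * p for x, p in probs.items())

for n in (6, 60, 600):
    expected = {x: n * p for x, p in probs.items()}  # n_j replaced by n * p_j
    print(n, mean_stat(expected) - mu, shifted_mean_stat(expected) - mu)
```

The first difference is zero for every $n$, so the sample mean returns the exact parametric value. The second difference is $1/n$: never zero for finite $n$, so the estimator fails Fisher's criterion, yet it vanishes as $n$ grows, in line with ordinary consistency.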
However, as noted by Barnard (1974), “Fisher consistency can only with difficulty be invoked to justify specific procedures with finite samples” and also “fails because not all reasonable estimates are functions of relative frequencies.” He also presents an estimating procedure, based on pivotal functions, that does meet his requirements that the estimate lie within the parameter space and be invariant.


