## Balancing for Polynomial Models

We return to the model $\mathcal{M}_{10}^{\prime}$ of section 4.1.2 and consider an extension $\mathcal{M}_{k}$ defined as follows:
$$\begin{aligned} Y_{i} &=\sum_{j=0}^{k} \beta_{j} X_{i}^{j}+\varepsilon_{i} \\ E_{m}\left(\varepsilon_{i}\right) &=0, \quad V_{m}\left(\varepsilon_{i}\right)=\sigma^{2}, \quad C_{m}\left(\varepsilon_{i}, \varepsilon_{j}\right)=0 \text { for } i \neq j \end{aligned}$$
where $i, j=1,2, \ldots, N$. Generalizing the developments of section 4.1.2, we derive the following result.

RESULT 4.2 Let $\mathcal{M}_{k}$ be given. Then the MSE of the BLU predictor $t_{o}$ for $Y$ is minimum for a sample $s$ of size $n$ if
$$\frac{1}{n} \sum_{s} X_{i}^{j}=\frac{1}{N} \sum_{1}^{N} X_{i}^{j} \text { for } j=0,1, \ldots, k \text {. }$$
If these equalities hold we have
$$t_{o}(s, Y)=N \bar{y} \text {. }$$
A sample satisfying the equalities in Result 4.2 is said to be balanced up to order $k$.
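The balancing condition can be checked numerically for a candidate sample. The following is a minimal sketch, assuming the auxiliary values $X_i$ are held in a NumPy array; the function name `is_balanced`, the tolerance `tol`, and the small population are hypothetical and chosen only for illustration.

```python
import numpy as np

def is_balanced(x_pop, sample_idx, k, tol=1e-9):
    """Return True if the sample moments of orders 0..k equal the population moments."""
    x_pop = np.asarray(x_pop, dtype=float)
    x_s = x_pop[np.asarray(sample_idx)]
    for j in range(k + 1):
        # Compare (1/n) * sum_s X_i^j with (1/N) * sum_1^N X_i^j.
        if abs(np.mean(x_s ** j) - np.mean(x_pop ** j)) > tol:
            return False
    return True

# Hypothetical population with X = 1, 2, ..., 9 and the sample of units with X = 1, 5, 9.
x = np.arange(1, 10)
print(is_balanced(x, [0, 4, 8], k=1))  # first-order moments agree (both means are 5)
print(is_balanced(x, [0, 4, 8], k=2))  # second-order moments differ (107/3 vs 285/9)
```

In this toy case the sample is balanced up to order 1 but not up to order 2, which shows how quickly the conditions become demanding as $k$ grows.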

Now assume that the true model $\mathcal{M}_{k^{\prime}}$ agrees with a statistician's working model $\mathcal{M}_{k}$ in all respects except that
$$E_{m}\left(Y_{i}\right)=\sum_{j=0}^{k^{\prime}} \beta_{j} X_{i}^{j}$$
with $k^{\prime}>k$. The statistician will use $t_{o}$ instead of $t_{o}^{\prime}$, the BLU predictor for $Y$ on the basis of $\mathcal{M}_{k^{\prime}}$. However, if he selects a sample that is balanced up to order $k^{\prime}$, then
$$t_{o}^{\prime}(s, Y)=t_{o}(s, Y)=N \bar{y}$$
and his error does not cause losses.
It is, of course, too ambitious to expect the balancing conditions to hold exactly, even if $k^{\prime}$ is of moderate size, for example $k^{\prime}=4$ or 5. But if $n$ is large, the considerations outlined in Result 4.1 apply again for SRSWOR, or for SRSWOR carried out independently within strata once internally homogeneous strata have been constructed.
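A small simulation can illustrate how closely large SRSWOR samples tend to approach balance. This is only a sketch under stated assumptions: the population of uniform auxiliary values, the sample sizes, the number of replicates, and the helper `mean_moment_gap` are all hypothetical, and the code does not reproduce Result 4.1 itself.

```python
import numpy as np

def mean_moment_gap(x_pop, n, k, reps, rng):
    """Average, over `reps` SRSWOR samples of size n, of the largest absolute
    gap between sample and population moments of orders 1..k."""
    pop_moments = np.array([np.mean(x_pop ** j) for j in range(1, k + 1)])
    gaps = []
    for _ in range(reps):
        s = rng.choice(x_pop.size, size=n, replace=False)  # one SRSWOR draw
        samp_moments = np.array([np.mean(x_pop[s] ** j) for j in range(1, k + 1)])
        gaps.append(np.max(np.abs(samp_moments - pop_moments)))
    return float(np.mean(gaps))

rng = np.random.default_rng(0)
x_pop = rng.uniform(0.0, 1.0, size=10_000)  # hypothetical auxiliary values
gap_small = mean_moment_gap(x_pop, n=20, k=4, reps=200, rng=rng)
gap_large = mean_moment_gap(x_pop, n=2_000, k=4, reps=200, rng=rng)
print(gap_small, gap_large)
```

The average gap for the larger sample size should be markedly smaller, which is the sense in which balance up to order $k^{\prime}$ holds approximately when $n$ is large.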

## Linear Models in Matrix Notation

Suppose $x_{1}, x_{2}, \ldots, x_{k}$ are real variables, called auxiliary or explanatory variables, each closely related to the variable of interest $y$. Let
$$x_{i}=\left(X_{i 1}, X_{i 2}, \ldots, X_{i k}\right)^{\prime}$$
be the vector of explanatory variables for unit $i$ and assume the linear model
$$Y_{i}=x_{i}^{\prime} \beta+\varepsilon_{i}$$
for $i=1,2, \ldots, N$. Here
$$\beta=\left(\beta_{1}, \beta_{2}, \ldots, \beta_{k}\right)^{\prime}$$
is the vector of (unknown) regression parameters; $\varepsilon_{1}, \varepsilon_{2}, \ldots, \varepsilon_{N}$ are random variables satisfying
$$\begin{aligned} E_{m}\left(\varepsilon_{i}\right) &=0 \\ V_{m}\left(\varepsilon_{i}\right) &=v_{i i} \\ C_{m}\left(\varepsilon_{i}, \varepsilon_{j}\right) &=v_{i j}, \quad i \neq j \end{aligned}$$
where $E_{m}, V_{m}, C_{m}$ are operators for expectation, variance, and covariance with respect to the model distribution; and the matrix $V=\left(v_{i j}\right)$ is assumed to be known up to a constant $\sigma^{2}$.
To have a more compact notation define
$$\begin{aligned} Y &=\left(Y_{1}, Y_{2}, \ldots, Y_{N}\right)^{\prime} \\ X &=\left(x_{1}, x_{2}, \ldots, x_{N}\right)^{\prime}=\left(X_{i j}\right) \\ \varepsilon &=\left(\varepsilon_{1}, \varepsilon_{2}, \ldots, \varepsilon_{N}\right)^{\prime} \end{aligned}$$

and write the linear model as
$$Y=X \beta+\varepsilon$$
where
$$\begin{aligned} E_{m}(\varepsilon) &=0 \\ V_{m}(\varepsilon) &=V \end{aligned}$$
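As a check on the notation, the polynomial model with uncorrelated, equal-variance errors can be read as a special case of this matrix form. The sketch below assumes the components of $x_{i}$ and $\beta$ are indexed from 0 to $k$, so that there are $k+1$ regression parameters rather than $k$; this relabeling is introduced here only for illustration.

$$x_{i}=\left(1, X_{i}, X_{i}^{2}, \ldots, X_{i}^{k}\right)^{\prime}, \quad \beta=\left(\beta_{0}, \beta_{1}, \ldots, \beta_{k}\right)^{\prime}, \quad V=\sigma^{2} I_{N}$$

With these choices, row $i$ of $Y=X \beta+\varepsilon$ reads $Y_{i}=\sum_{j=0}^{k} \beta_{j} X_{i}^{j}+\varepsilon_{i}$, where $I_{N}$ denotes the $N \times N$ identity matrix.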

## Robustness Against Model Failures

Consider the general linear model described in section 4.1.4. TAM (1986) has shown that a necessary and sufficient condition for
$$T^{\prime} Y_{s}=\sum_{s} T_{i} Y_{i}$$
to be $\mathrm{BLU}$ for $Y=1^{\prime} Y$ is that
(a) $T^{\prime} X_{s}=1^{\prime} X$

(b) $V_{s s} T-K 1 \in M\left(X_{s}\right)$

where
$$K=\left(V_{s s}, V_{s r}\right),$$
and $M\left(X_{s}\right)$ is the column space of $X_{s}$.

In case $V_{r s}=0$ these conditions reduce to (a) and
$(b)^{\prime} \quad V_{s s}\left(T-1_{s}\right) \in M\left(X_{s}\right)$
as given earlier by PEREIRA and RODRIGUES (1983).
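The two conditions can be verified numerically for a given weight vector $T$. The following is a minimal sketch, assuming the conditions read (a) $T^{\prime} X_{s}=1^{\prime} X$ and (b) $V_{s s} T-K 1 \in M\left(X_{s}\right)$ with $K=\left(V_{s s}, V_{s r}\right)$; the function names, the tolerance, and the intercept-only example are hypothetical and not taken from the cited papers.

```python
import numpy as np

def in_column_space(A, b, tol=1e-8):
    """True if b lies in the column space of A (least-squares residual is ~0)."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return bool(np.linalg.norm(A @ coef - b) <= tol * max(1.0, np.linalg.norm(b)))

def tam_conditions(T, X, V, s, tol=1e-8):
    """Check (a) T'X_s = 1'X and (b) V_ss T - K 1 in M(X_s), with K = (V_ss, V_sr)."""
    N = X.shape[0]
    s = np.asarray(s)
    r = np.setdiff1d(np.arange(N), s)            # non-sampled units
    X_s = X[s]
    V_ss = V[np.ix_(s, s)]
    K = np.hstack([V_ss, V[np.ix_(s, r)]])       # columns ordered as (s, r)
    cond_a = bool(np.allclose(T @ X_s, np.ones(N) @ X, atol=tol))
    cond_b = in_column_space(X_s, V_ss @ T - K @ np.ones(N), tol)
    return cond_a, cond_b

# Hypothetical example: N = 6, intercept-only model, V = I, sample of n = 3 units.
N, s = 6, [0, 2, 5]
X = np.ones((N, 1))
V = np.eye(N)
T = np.full(len(s), N / len(s))   # expansion weights N/n, i.e. the predictor N * ybar
print(tam_conditions(T, X, V, s))
```

In this example $V_{r s}=0$, so condition (b) coincides with $(b)^{\prime}$: $V_{s s}\left(T-1_{s}\right)$ is a multiple of the vector of ones, which spans $M\left(X_{s}\right)$.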
From TAM's (1986) results one may deduce the following.

If the true model is $\mathcal{M}$ as above, but one employs the best predictor postulating a wrong model, say $\mathcal{M}^{*}$, using $X^{*}$ instead of $X$ throughout, where
$$X=\left(X^{*}, \tilde{X}\right),$$
then the best predictor under $\mathcal{M}^{*}$ is still best under $\mathcal{M}$ if and only if
$$T^{\prime} \tilde{X}_{s}=1^{\prime} \tilde{X}$$
using obvious notations. This is evidently a condition that the predictor should remain model-unbiased under the correct model $\mathcal{M}$. Thus, by choosing a sample that meets this stipulation, one may achieve robustness. In practice, however, $X$ will be unknown and one cannot realize this robustness condition at will, although for large samples the condition may hold approximately. In this situation, it is advisable to adopt suitable unequal probability sampling designs that assign higher selection probabilities to samples for which the condition should hold approximately, provided one can guess effectively the nature of the variables that are omitted but influential in explaining the variability in the $y$ values. If a sample is chosen well in this sense, one may preserve optimality even under a model that is deficient as above.

On the other hand, if one employs the best predictor using $W^{*}$ instead of $X$ when $W^{*}=(X, W)$, then this predictor remains best if and only if condition (b) above still holds. But this condition is too restrictive, since it demands correct specification of the nature of $V$, which is likely to be elusive in practice.

ROYALL and HERSON (1973), TALLIS (1978), SCOTT, BREWER and HO (1978), PEREIRA and RODRIGUES (1983), RODRIGUES (1984), ROYALL and PFEFFERMANN (1982), and PFEFFERMANN (1984) have derived results relevant to this context of robust prediction.


