## ELEMENTARY DEFINITIONS

Let $N$ be a known number of units, e.g., godowns, hospitals, or income earners, each assigned an identifying label $1,2, \ldots, N$ and bearing, respectively, the values $Y_{1}, Y_{2}, \ldots, Y_{N}$ of a real-valued variable $y$. These values are initially unknown to an investigator who intends to estimate the total
$$Y=\sum_{i=1}^{N} Y_{i}$$
or the mean $\bar{Y}=Y / N$.
We call the sequence $U=(1, \ldots, N)$ of labels a population. Selecting units leads to a sequence $s=\left(i_{1}, \ldots, i_{n}\right)$, which is called a sample. Here $i_{1}, \ldots, i_{n}$ are elements of $U$, not necessarily distinct from one another, and the order in which they appear is maintained. We refer to $n=n(s)$ as the size of $s$, while the effective sample size $v(s)=|s|$ is the cardinality of $s$, i.e., the number of distinct units in $s$. Once a specific sample $s$ is chosen, we suppose it is possible to ascertain the values $Y_{i_{1}}, \ldots, Y_{i_{n}}$ of $y$ associated with the respective units of $s$. Then
$$d=\left[\left(i_{1}, Y_{i_{1}}\right), \ldots,\left(i_{n}, Y_{i_{n}}\right)\right] \quad \text{or briefly} \quad d=\left[\left(i, Y_{i}\right) \mid i \in s\right]$$
constitutes the survey data.
An estimator $t$ is a real-valued function $t(d)$, which is free of $Y_{i}$ for $i \notin s$ but may involve $Y_{i}$ for $i \in s$. Sometimes we will express $t(d)$ alternatively as $t(s, Y)$, where $Y=\left(Y_{1}, \ldots, Y_{N}\right)^{\prime}$ denotes the vector of population values.
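The definitions above can be made concrete with a small sketch. The population values, the particular sample, and the estimator `expansion_of_distinct_mean` are all hypothetical choices made for illustration; the text does not prescribe any specific estimator.

```python
# Hypothetical population values Y_1..Y_N, keyed by label (unknown in practice).
Y_values = {1: 12.0, 2: 7.5, 3: 9.0, 4: 15.5, 5: 6.0}
N = len(Y_values)

# A sample is an ordered sequence of labels; repeated labels are allowed.
s = (3, 1, 3, 5)

n = len(s)        # sample size n(s), counting repetitions
v = len(set(s))   # effective sample size v(s), distinct units only

# Survey data d = [(i, Y_i) | i in s]; only values of sampled units are observed.
d = [(i, Y_values[i]) for i in s]


def expansion_of_distinct_mean(d, N):
    """Illustrative estimator t(d): N times the mean of y over distinct units.

    It depends on Y_i only for i in s, as the definition of an estimator requires.
    """
    distinct = dict(d)  # collapses repeated labels to a single entry
    return N * sum(distinct.values()) / len(distinct)


t = expansion_of_distinct_mean(d, N)
print(n, v, t)
```

Here the label 3 is drawn twice, so the sample size is 4 while the effective sample size is 3.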

## DESIGN-BASED INFERENCE

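Let $p(s)$ denote the probability with which the sampling design $p$ selects the sample $s$, and let $M_{p}(t)=\sum_{s} p(s)(t(s, Y)-Y)^{2}$ be the mean square error of $t$ about the total $Y$. For a fixed $Y$, let $\Sigma_{1}$ be the sum over samples for which $|t(s, Y)-Y| \geq k>0$ and let $\Sigma_{2}$ be the sum over samples for which $|t(s, Y)-Y|<k$. Then from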
$$\begin{aligned} M_{p}(t) &=\Sigma_{1}\, p(s)(t-Y)^{2}+\Sigma_{2}\, p(s)(t-Y)^{2} \\ & \geq k^{2} \operatorname{Prob}[|t(s, Y)-Y| \geq k] \end{aligned}$$
one derives the Chebyshev inequality:
$$\operatorname{Prob}[|t(s, Y)-Y| \geq k] \leq \frac{M_{p}(t)}{k^{2}} .$$
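The inequality can be checked numerically by enumerating every sample of a small design. The sketch below assumes simple random sampling without replacement of two units, treated as unordered samples with equal probabilities, together with the expansion estimator $N$ times the sample mean. The population values, the design, and the estimator are illustrative assumptions, not taken from the text.

```python
from itertools import combinations

# Hypothetical population values Y_1..Y_N and their total Y.
Y_values = [3.0, 8.0, 5.0, 12.0]
N = len(Y_values)
Y_total = sum(Y_values)

# Assumed design: every unordered pair of units is a sample with equal p(s).
samples = list(combinations(range(N), 2))
p = {s: 1.0 / len(samples) for s in samples}


def t(s):
    """Expansion estimator of the total: N times the sample mean."""
    return N * sum(Y_values[i] for i in s) / len(s)


# Mean square error M_p(t) = sum over samples of p(s) * (t - Y)^2.
mse = sum(p[s] * (t(s) - Y_total) ** 2 for s in samples)


def tail_prob(k):
    """Prob[|t - Y| >= k] under the design p."""
    return sum(p[s] for s in samples if abs(t(s) - Y_total) >= k)


for k in (4.0, 8.0, 12.0):
    bound = mse / k**2
    # The bound may exceed 1 for small k, in which case it is uninformative.
    assert tail_prob(k) <= bound + 1e-12
    print(f"k={k}: tail={tail_prob(k):.3f}, bound={bound:.3f}")
```

Because the design is fully enumerated, both sides of the inequality are exact rather than simulated.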
Hence
$$\operatorname{Prob}[t-k \leq Y \leq t+k] \geq 1-\frac{M_{p}(t)}{k^{2}}=1-\frac{1}{k^{2}}\left[V_{p}(t)+B_{p}^{2}(t)\right],$$
where $B_{p}(t)=E_{p}(t)-Y$ is the bias of $t$. Writing $\sigma_{p}(t)=\sqrt{V_{p}(t)}$ for the standard error of $t$ and taking $k=3 \sigma_{p}(t)$, it follows that, whatever $Y$ may be, the random interval $t \pm 3 \sigma_{p}(t)$ covers the unknown $Y$ with a probability not less than
$$\frac{8}{9}-\frac{1}{9} \frac{B_{p}^{2}(t)}{V_{p}(t)} .$$
So, to keep this probability high and the length of this covering interval small, it is desirable that both $\left|B_{p}(t)\right|$ and $\sigma_{p}(t)$ be small, leading to a small $M_{p}(t)$ as well.
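As a rough illustration of how bias erodes the guaranteed coverage, the lower bound can be evaluated directly. The function name `coverage_lower_bound`, its `multiplier` parameter, and the numerical inputs are hypothetical and introduced only for this sketch.

```python
def coverage_lower_bound(variance, bias, multiplier=3.0):
    """Chebyshev lower bound on Prob[t - k <= Y <= t + k] for k = multiplier * sigma_p(t).

    Evaluates 1 - (V_p(t) + B_p(t)^2) / k^2 with k^2 = multiplier^2 * V_p(t).
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    k_squared = multiplier**2 * variance
    return 1.0 - (variance + bias**2) / k_squared


# Unbiased estimator: the bound reduces to 8/9.
unbiased = coverage_lower_bound(variance=4.0, bias=0.0)

# Bias equal to one standard error: the bound drops to 8/9 - 1/9 = 7/9.
biased = coverage_lower_bound(variance=4.0, bias=2.0)

print(unbiased, biased)
```

With `multiplier=3.0` this reproduces the expression $\frac{8}{9}-\frac{1}{9} B_{p}^{2}(t) / V_{p}(t)$; other multipliers give the corresponding bound for intervals of a different width.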

