统计代写|抽样理论作业代写sampling theory 代考|STAT7124

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## Circumstantial Property

A property is said to be circumstantial when it depends on conditions that we are not necessarily able to control; it is the conjuncture. A few examples are:

• A change in moisture content in a coal shipment
• A change in stream flow rate feeding a SAG mill
• A change in the particle size distribution of an iron ore shipment
• A change in density between fragments
• A material balance going from acceptable to unacceptable
• A reconciliation between the mine and the plant going from good to bad
• Your child losing ground in his understanding of mathematics
• Parents always fighting one another, and so on.

There are millions of examples in our lives. There is not much you can do about these undesirable effects. The only thing you can do is change the structural property that is creating all these problems in the first place.

The conclusion is clear: a structural property will remain true unless you change it. For example, you can:
• Replace a poor sampling protocol
• Change a faulty sampling system
• Change an inappropriate analytical procedure
• Install a weightometer on a conveyor belt that is no longer than 100 meters
• Eliminate a wishful cutoff grade at the mine
• Give a reliable textbook to your child losing ground in mathematics and explain to him how to use it, and so on.

Remember, a structural property can be relied upon. A circumstantial property, however, cannot be relied upon and strictly depends on chance. Therefore, too much emphasis on solving the effects of a cause is often a waste of time and money; it is an exercise in futility that invariably leads to conflict and a feeling of uselessness. It is important to know that accuracy is a secondary, circumstantial property. For instance, it could happen that an incorrect sampling operation is also accurate; nevertheless, it is dangerous to rely or speculate on such an incorrect sampling operation.

It is of the utmost importance to place emphasis on identifying structural properties. Invest resources in finding the cause of a problem instead of reacting to its effects; that way you will be more likely to live happy.

## The Primary Structural Property of Sampling Correctness

Now a new problem arises, a problem of judgment: we can either measure the precision and accuracy of a given sampling protocol, or decide upon the precision and correctness of a given sampling protocol. The difference between these two alternatives is of prime importance, as only the second provides a safe solution. The reader should clearly appreciate this difference because the choice of the right or wrong sampling strategy depends on it. In sampling there are many technical errors, and we are already familiar with most of them (e.g., Fundamental Sampling Error [FSE], Grouping and Segregation Error [GSE], Heterogeneity Fluctuation Error [HFE], Increment Delimitation Error [IDE], Increment Extraction Error [IEE], Increment Preparation Error [IPE], etc.). We also know how to deal with them. Judgment errors are of equal importance and often have disastrous consequences, yet we really do not know how to deal with them. Examples of such judgment mistakes are many, especially among people trying to quantify FSE using routine replicate samples: they call this exercise “the calibration of Pierre Gy’s formula,” a very debatable exercise that mixes apples and oranges, is typical of empirical approaches, and is a favorite of many statisticians.

If correctness is a structural property of the sampling process, it means that it is an intrinsic property of this process, as long as the integrity of the equipment is not damaged. Sampling correctness does not depend on circumstances external to the sampling process over which we have no control, such as the properties of the material to be sampled. Thus, we can state that sampling correctness is a primary quality of a sampling process over which we may have control.

On the contrary, accuracy is a circumstantial property of a sampling process, and depends on various factors such as:

• Respect for the conditions of sampling correctness
• Properties of the material to be sampled over which we have no control.

Consequently, we can state that accuracy is a secondary property if compared to correctness. We cannot directly control accuracy, but we can directly control correctness, which makes us wonder about the effectiveness of performing bias tests to control the validity of a sampling protocol or a sampling device.

A correct sampling process is always accurate; however, an incorrect sampling process can be circumstantially accurate today, unacceptably biased in one direction tomorrow, and unacceptably biased in the other direction the day after tomorrow. Gambling on the accuracy of an incorrect sampling process is not recommended, unless your objective is to lose money. People have often told me, “I know this sampling device is incorrect; however, it is accurate enough for my applications.” Such a statement is very dangerous because a sampling bias is never constant, due to the transient nature of segregation. Basically, no one can know whether the incorrect sampling device is accurate enough, because no one knows what circumstances are going to show up.

Several standards committees have made recommendations to control accuracy. We strongly disagree with them and recommend instead to control correctness, which is the primary objective; then, and only then, acceptable accuracy may be the reward.


## Sampling Correctness: The Mandatory Path to Predictable Uncertainty

The cardinal rules of sampling correctness are simple:

1. All fragments or groups of fragments must have the same chance to become part of the sample (IDE).
2. The sampling tool must not become selective on what it is taking (IEE).
3. The mass of collected increments must be proportional to the amount of material presented to the sampling tool (IWE).
4. The integrity of increments must under no circumstance be altered (IPE).

With any deviation from one of these rules, the sampling protocol becomes invalid, wishful thinking, and misleading, and constitutes a direct, unacceptable departure from the expected due diligence; in other words, it is a mistake leading to unpredictable errors in precision and accuracy. Basically, no statistical analysis is possible, and if you had to defend the incorrect sampling protocol in a court of law, a good sampling expert would make that protocol indefensible in five minutes. In other words, the rules of sampling correctness should under no circumstances be negotiable.

If all these rules are strictly respected, with no compromise possible, you give yourself a chance to enter the domain of acceptable residual uncertainty, and the database offers the possibility of performing a valid, predictable statistical analysis. Anything less can be called administrative sampling, leading to statistical ceremonies, a total loss of time and money, and misleading information for top management.

After it is clear that the sampling process is correct, we still have to address Constitution Heterogeneity CH and Distribution Heterogeneity DH. This is done by minimizing the variances of the Fundamental Sampling Error FSE and the Grouping and Segregation Error GSE, which is the objective of Part IV. Then, and only then, we may dare to talk about residual uncertainty. Basically, the use of the word uncertainty should be the ultimate reward of having eliminated what is obviously structurally wrong in sampling and analytical protocols. Anything less is an intolerable mistake, therefore an error, which naturally leads to Chapter 2.

## Structural Property

A property is said to be structural when it necessarily results from a certain number of conditions that we can control or quantify, and that are assumed to be fulfilled. A few examples are:

• The heterogeneity of a constituent of interest in a material to sample (e.g., gold in a given geological unit, alumina in a given iron ore, GMO in a wheat shipment, mercury in a fish, etc.)
• A sampling protocol authoritatively selected by a management team
• The characteristics of a sampling device built by a manufacturer
• A process control procedure implemented by an operator
• The characteristics of a process unit
• A sampling interval selected for convenience
• A cutoff grade selected at the mine for a given constituent of interest
• A standard enforced by regulators
• A wrong temperature set to warm your home
• A child trying to study mathematics without a reliable textbook, etc.

There are millions of examples in our lives. A structural property is what it is: it is the structure with which someone operates. A structural property is always true as long as we operate with it. The effect depends on chance; it is the circumstance we endure. It is important to know that heterogeneity and sampling correctness are primary structural properties.


## Basic Terms

Accuracy – The definition of accuracy is often controversial. It is incorrect to include the notion of precision and reproducibility with the notion of accuracy; consequently, definitions given by most current dictionaries are misleading, confusing, and incorrect. Accuracy has been given several definitions and statisticians and engineers disagree completely on those definitions.

A sampling bias is defined as the mean $m(\mathrm{ISE})$ of the sampling error (also referred to as the Increment Selection Error, ISE). As far as sampling is concerned, a sample is said to be accurate when the absolute value of the bias $|m(\mathrm{ISE})|$ is smaller than a certain standard of accuracy, $m_0(\mathrm{ISE})$, which is regarded as acceptable for a given purpose and clearly defined in preselected Data Quality Objectives (DQO). Basically, what is acceptable to someone may turn out to be unacceptable to somebody else, and it is an issue to clarify before a sampling campaign is implemented.

Accuracy is a property of the mean of a given error exclusively. It is important to make a distinction between an accurate sample and an unbiased sample. An unbiased sample implies that the mean of the Increment Selection Error $m(\mathrm{ISE})$ is equal to zero. Even when carried out in an ideal way, sampling is always biased due to the particulate structure of most materials. The bias is never strictly zero. It may be negligible but is always different from zero. We should speak of an accurate sample but not of an unbiased sample. In fact, we can forget about an unbiased sample because it is a limit case never encountered in sampling practice.
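The distinction between an accurate and an unbiased sample can be illustrated with a minimal sketch. It assumes a set of relative selection errors is available (in practice the true lot content is unknown, so the numbers below are purely hypothetical), and the function name `is_accurate` and the threshold `m0` are illustrative choices, not part of the Theory of Sampling itself.

```python
from statistics import mean

def is_accurate(selection_errors, m0):
    """Return (bias, accurate) for a list of relative selection errors.

    bias is the mean m(ISE); the sample is called accurate when
    |m(ISE)| is smaller than the chosen standard m0(ISE).
    """
    bias = mean(selection_errors)
    return bias, abs(bias) < m0

# Hypothetical relative errors (a_S - a_L) / a_L from repeated sampling
errors = [0.012, -0.008, 0.004, -0.010, 0.007]
bias, accurate = is_accurate(errors, m0=0.005)
print(bias, accurate)
```

Here the bias is small but not zero, so under the definition above the sampling would be called accurate for this `m0` without being unbiased. A stricter `m0` taken from different Data Quality Objectives could reverse the verdict.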

Analysis – Often, when we speak of analysis, we speak of estimation or determination of the content of a constituent of interest; this is an incorrect practice. Analysis is the opposite of synthesis. It is the separation, partial or complete, of a material into its constituents. How these constituents are determined after their separation is an entirely different matter. When we talk of analysis, analytical procedure, or analytical error, we should talk of determination, procedure for a determination, or determination error. We choose to go along with the word analysis, although it is not correct, because it is so widely misused in various industries. We may speak of chemical analysis, moisture analysis, particle size distribution analysis, percent solid analysis, or variance analysis in statistics.

Analytical Error AE – Error generated by nonoptimized operations such as assaying, moisture determination, particle size analysis, estimation of the percent solids of a slurry, and so on. This error does not include the last selection process from which the final analytical subsample is obtained; this stage shall be a part of the Total Sampling Error, TSE.

Analytical Uncertainty AU – This refers to a certain amount of uncertainty left in the analytical process after all sources of potential analytical bias have been eliminated in a preventive way, and after necessary tests have been performed to bring the uncertainty below or equal to an acceptable level dictated by Data Quality Objectives. It should be clearly understood that the use of the word uncertainty would be inappropriate when biases have not been minimized in a preventive way and when the amount of uncertainty is not compatible with the DQO; that situation is essentially an error by operators.

## The Word Error versus the Word Uncertainty Controversy

Observation from attending many conferences around the world reveals that many people are constantly shifting from the word error to the word uncertainty without a clear vision of what the subtle difference is. The same applies to a large swath of the scientific literature, in which these two concepts unfortunately are often used synonymously, which is a scientific flaw of the first order. In practice, as demonstrated in the Theory of Sampling, there are both sampling errors, and sampling uncertainties. Sampling errors can easily be preventatively minimized, while sampling uncertainty for a preselected sampling protocol is inevitable. Gy’s work, supported by Matheron’s critical reviews, demonstrates beyond any possible doubt that sampling uncertainty cannot be achieved if sampling errors are not preventively minimized by applying very stringent rules that are too often ignored by sampling practitioners. A clarification, in line with the Theory of Sampling, is presented to mitigate the confusion created by circles favoring the use of uncertainty versus the use of the word error. Both words are very different in nature and scope, and they are both necessary in their due place and time.

It is the word “mistake” that bothers statisticians. Indeed, it is a mistake not to optimize sampling protocols according to acceptable Data Quality Objectives (DQO), and a grave mistake not to make sure sampling is correct, and that is the way it is.
Uncertainty: lack of sureness about someone or something; something that is not known beyond doubt; something not constant.

Historically, statisticians prefer the word uncertainty. The following words should be carefully remembered in all discussions involving the creation of sampling protocols and their practical implementations.
$\mathrm{Gy}^{49}$ stated:
With the exception of homogeneous materials, which only exist in theory, the sampling of particulate materials is always an aleatory operation. There is always an uncertainty, regardless of how small it is, between the true, unknown content $a_L$ of the lot $L$ and the true, unknown content $a_S$ of the sample $S$. A vocabulary difficulty needs to be mentioned: tradition has established the word error as common practice, though it implies a mistake that could have been prevented, while statisticians prefer the word uncertainty, which implies no responsibility.

However, in practice, as demonstrated in the Theory of Sampling, there are both sampling errors, and sampling uncertainties. Sampling errors can easily be preventatively minimized, while sampling uncertainty for a preselected sampling protocol is inevitable. For the sake of simplicity, because the word uncertainty is not strong enough, the word error has been selected as current usage in the Theory of Sampling, making it very clear it does not necessarily imply a sense of culpability.


## EXAMPLES OF REPRESENTATIVE STRATEGIES

The ratio estimator
$$t_{1}=X \frac{\sum_{i \in s} Y_{i}}{\sum_{i \in s} X_{i}}$$
is of special importance because of its traditional use in practice. Here, $\left(p, t_{1}\right)$ is obviously representative with respect to a size measure $x$, more precisely to $\left(X_{1}, \ldots, X_{N}\right)$, whatever the sampling design $p$.

Note, however, that $t_{1}$ is usually combined with SRSWOR or SRSWR. The sampling scheme of LAHIRI-MIDZUNO-SEN (LAHIRI, 1951; MIDZUNO, 1952; SEN, 1953) (LMS) yields a design of interest to be employed in conjunction with $t_{1}$ by rendering it design unbiased.
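A minimal sketch of the ratio estimator follows. The function name `ratio_estimator` and the toy data are assumptions made for illustration. The last line checks the representativeness property as it is used here: applying $t_1$ to the size values themselves returns their known total $X$, whichever sample is drawn.

```python
def ratio_estimator(sample, y, x, x_total):
    """t1 = X * (sum of Y_i over the sample) / (sum of X_i over the sample)."""
    sum_y = sum(y[i] for i in sample)
    sum_x = sum(x[i] for i in sample)
    return x_total * sum_y / sum_x

# Toy population: y is exactly proportional to the size measure x
x = [10.0, 20.0, 30.0, 40.0]
y = [2.0 * xi for xi in x]
X = sum(x)

t1 = ratio_estimator([0, 2], y, x, X)         # estimate of the total of y
t1_on_x = ratio_estimator([1, 3], x, x, X)    # estimator applied to x itself
print(t1, t1_on_x)
```

In this toy case `t1` equals the true total of `y` only because `y` is exactly proportional to `x`; with real data the estimate generally differs from $Y$.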
The Hansen-Hurwitz (HH, 1943) estimator (HHE)
$$t_{2}=\frac{1}{n} \sum_{i=1}^{N} f_{s i} \frac{Y_{i}}{P_{i}}$$
with $f_{s i}$ as the frequency of $i$ in $s$, $i \in \mathcal{U}$, combined with any design $p$, gives rise to a strategy representative with respect to $\left(P_{1}, \ldots, P_{N}\right)^{\prime}$. For the sake of design unbiasedness, $t_{2}$ is usually based on probability proportional to size (PPS) with replacement (PPSWR) sampling, that is, a scheme that consists of $n$ independent draws, each draw selecting unit $i$ with probability $P_{i}$.
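The design unbiasedness of $t_2$ under PPSWR can be checked exactly on a small population by enumerating every ordered sample. The sketch below uses exact fractions; the helper names and the toy values of `y` and `p` are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def hansen_hurwitz(sample, y, p):
    """t2 = (1/n) * sum over the n draws of Y_i / P_i.

    Summing over draws counts each unit as often as it was drawn,
    which is the role of the frequency f_si in the formula.
    """
    n = len(sample)
    return sum(Fraction(y[i]) / p[i] for i in sample) / n

def expected_value_ppswr(y, p, n):
    """Exact expectation of t2 over all N**n ordered PPSWR samples."""
    total = Fraction(0)
    for s in product(range(len(y)), repeat=n):
        prob = Fraction(1)
        for i in s:
            prob *= p[i]          # independent draws, unit i with probability P_i
        total += prob * hansen_hurwitz(s, y, p)
    return total

y = [3, 5, 10]
p = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
expected = expected_value_ppswr(y, p, n=2)
print(expected, sum(y))
```

The enumeration grows as $N^n$, so this is only a verification device for tiny examples, not a way to compute with real surveys.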

Another representative strategy is due to RAO, HARTLEY and COCHRAN (RHC, 1962). We first describe the sampling scheme as follows: On choosing a sample size $n$, the population $\mathcal{U}$ is split at random into $n$ mutually exclusive groups of sizes suitably chosen $N_{i}\left(i=1, \ldots, n ; \sum_{1}^{n} N_{i}=N\right)$ coextensive with $\mathcal{U}$, the units bearing values $P_{i}$, the normed sizes $\left(0<P_{i}<1, \sum P_{i}=1\right)$. From each of the $n$ groups so formed independently one unit is selected with a probability proportional to its size given the units falling in the respective groups.
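The selection step of this scheme can be sketched as follows. The function name `rhc_sample`, the group sizes, and the normed sizes are illustrative assumptions, and only the sampling scheme described above is shown, not an estimator.

```python
import random

def rhc_sample(p, group_sizes, rng):
    """Split the population at random into groups, then draw one unit per
    group with probability proportional to its normed size within the group."""
    units = list(range(len(p)))
    if sum(group_sizes) != len(units):
        raise ValueError("group sizes must add up to the population size N")
    rng.shuffle(units)                        # random split of the population
    sample, start = [], 0
    for size in group_sizes:
        group = units[start:start + size]
        start += size
        weights = [p[i] for i in group]       # proportional to size within the group
        sample.append(rng.choices(group, weights=weights, k=1)[0])
    return sample

p = [0.1, 0.2, 0.3, 0.15, 0.25]               # hypothetical normed sizes, sum to 1
sample = rhc_sample(p, group_sizes=[2, 3], rng=random.Random(0))
print(sample)
```

Because the groups are mutually exclusive, the $n$ selected units are always distinct.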

## Raj’s Estimator t5

Another popular strategy is due to RAJ (1956, 1968). The sampling scheme is called probability proportional to size without replacement (PPSWOR) with $P_{i}$'s $\left(0<P_{i}<1, \sum P_{i}=1\right)$ as the normed size measures. On the first draw a unit $i_{1}$ is chosen with probability $P_{i_{1}}$; on the second draw a unit $i_{2}\left(\neq i_{1}\right)$ is chosen with probability $P_{i_{2}} /\left(1-P_{i_{1}}\right)$ out of the units of $U$ minus $i_{1}$, and so on. On the $n$th $(n>2)$ draw a unit $i_{n}\left(\neq i_{1}, \ldots, i_{n-1}\right)$ is chosen with probability
$$\frac{P_{i_{n}}}{1-P_{i_{1}}-P_{i_{2}}-\cdots-P_{i_{n-1}}}$$
out of the units of $U$ minus $i_{1}, i_{2}, \ldots, i_{n-1}$. Then,
$$\begin{aligned} e_{1} &=\frac{Y_{i_{1}}}{P_{i_{1}}} \\ e_{2} &=Y_{i_{1}}+\frac{Y_{i_{2}}}{P_{i_{2}}}\left(1-P_{i_{1}}\right) \\ e_{j} &=Y_{i_{1}}+\cdots+Y_{i_{j-1}}+\frac{Y_{i_{j}}}{P_{i_{j}}}\left(1-P_{i_{1}}-\cdots-P_{i_{j-1}}\right), \end{aligned}$$
$j=3, \ldots, n$, are all unbiased for $Y$ because the conditional expectation
$$\begin{aligned} E_{c} & \left[e_{j} \mid\left(i_{1}, Y_{i_{1}}\right), \ldots,\left(i_{j-1}, Y_{i_{j-1}}\right)\right] \\ &=\left(Y_{i_{1}}+\cdots+Y_{i_{j-1}}\right)+\sum_{\substack{k=1 \\ k \neq i_{1}, \ldots, i_{j-1}}}^{N} Y_{k}=Y . \end{aligned}$$
So, unconditionally, $E_{p}\left(e_{j}\right)=Y$ for every $j=1, \ldots, n$, and
$$t_{5}=\frac{1}{n} \sum_{j=1}^{n} e_{j},$$
called Raj’s (1956) estimator, is unbiased for $Y$.
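The unbiasedness argument can be verified exactly on a toy population by enumerating all ordered PPSWOR samples. This is only a sketch: the function names and the values of `y` and `p` are assumptions, and exact fractions are used so the equality can be checked without rounding.

```python
from fractions import Fraction
from itertools import permutations

def raj_estimator(sample, y, p):
    """t5 = mean of e_1, ..., e_n for an ordered sample of distinct units."""
    es, y_seen, p_seen = [], Fraction(0), Fraction(0)
    for i in sample:
        # e_j = (Y of earlier draws) + Y_i / P_i * (1 - P of earlier draws)
        es.append(y_seen + Fraction(y[i]) / p[i] * (1 - p_seen))
        y_seen += y[i]
        p_seen += p[i]
    return sum(es) / len(es)

def ppswor_prob(sample, p):
    """Probability of drawing this ordered sample under PPSWOR."""
    prob, p_seen = Fraction(1), Fraction(0)
    for i in sample:
        prob *= p[i] / (1 - p_seen)
        p_seen += p[i]
    return prob

y = [3, 5, 10, 7]
p = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
samples = list(permutations(range(len(y)), 3))
expectation = sum(ppswor_prob(s, p) * raj_estimator(s, y, p) for s in samples)
print(expectation, sum(y))
```

Note that the estimator depends on the order in which the units were drawn, which is why ordered samples are enumerated.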


## SAMPLING SCHEMES

A unified theory is developed by noting that it is enough to establish results concerning $(p, t)$ without heeding how one may actually succeed in choosing samples with preassigned probabilities. A method of choosing a sample draw by draw, assigning selection probabilities with each draw, is called a sampling scheme. Following HANURAV (1966), we show below that starting with an arbitrary design we may construct a sampling scheme.

Suppose for each possible sample $s$ from $U$ the selection probability $p(s)$ is fixed. Let
$$\begin{array}{lll} \beta_{i_{1}}=p\left(i_{1}\right), & \beta_{i_{1}, i_{2}}=p\left(i_{1}, i_{2}\right), \ldots, & \beta_{i_{1}, \ldots, i_{n}}=p\left(i_{1}, \ldots, i_{n}\right) \\ \alpha_{i_{1}}=\Sigma_{1} p(s), & \alpha_{i_{1}, i_{2}}=\Sigma_{2} p(s), \ldots, & \alpha_{i_{1}, \ldots, i_{n}}=\Sigma_{n} p(s) \end{array}$$
where $\Sigma_{1}$ is the sum over all samples $s$ with $i_{1}$ as the first entry; $\Sigma_{2}$ is the sum over all samples with $i_{1}, i_{2}$, respectively, as the first and second entries in $s, \ldots$, and $\Sigma_{n}$ is the sum over all samples of which the first, second, $\ldots, n$th entries are, respectively, $i_{1}, i_{2}, \ldots, i_{n}$.

Then, let us consider the scheme of selection such that on the first draw from $U$, $i_{1}$ is chosen with probability $\alpha_{i_{1}}$, and a second draw from $U$ is made with probability
$$\left(1-\frac{\beta_{i_{1}}}{\alpha_{i_{1}}}\right).$$
On the second draw from $U$ the unit $i_{2}$ is chosen with probability
$$\frac{\alpha_{i_{1}, i_{2}}}{\alpha_{i_{1}}-\beta_{i_{1}}}.$$
A third draw is made from $U$ with probability
$$\left(1-\frac{\beta_{i_{1}, i_{2}}}{\alpha_{i_{1}, i_{2}}}\right)$$
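The draw-by-draw construction can be checked numerically. The sketch below assumes that later draws follow the same pattern as the first ones (select with probability $\alpha_{\text{prefix}} /(\alpha_{\text{parent}}-\beta_{\text{parent}})$, continue with probability $1-\beta / \alpha$, hence stop with probability $\beta / \alpha$), and it uses a small hypothetical design stored as a dictionary from ordered samples to probabilities.

```python
from fractions import Fraction

def alpha(design, prefix):
    """Sum of p(s) over all samples whose first entries equal `prefix`."""
    k = len(prefix)
    return sum(pr for s, pr in design.items() if s[:k] == prefix)

def beta(design, prefix):
    """p(prefix): probability that the sample is exactly `prefix`."""
    return design.get(prefix, Fraction(0))

def scheme_probability(design, sample):
    """Probability that the draw-by-draw scheme stops with exactly `sample`."""
    prob = Fraction(1)
    for k in range(1, len(sample) + 1):
        prefix, parent = sample[:k], sample[:k - 1]
        if k == 1:
            prob *= alpha(design, prefix)                  # first draw
        else:
            # decide to continue after the previous draw ...
            prob *= 1 - beta(design, parent) / alpha(design, parent)
            # ... then select the k-th unit
            prob *= alpha(design, prefix) / (alpha(design, parent) - beta(design, parent))
    # stop after the last draw
    return prob * beta(design, sample) / alpha(design, sample)

# Hypothetical design on U = (0, 1, 2) with samples of varying size
design = {
    (0,): Fraction(1, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10),
    (1, 2): Fraction(3, 10),
    (2, 0, 1): Fraction(3, 10),
}
reproduced = {s: scheme_probability(design, s) for s in design}
print(reproduced == design)
```

The product telescopes to $\beta_{i_1, \ldots, i_k}=p(s)$, which is why the scheme reproduces the design it was built from.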

## CONTROLLED SAMPLING

Now, consider an arbitrary design $p$ of fixed size $n$ and a linear estimator $t$; suppose a subset $S_{0}$ of all samples is less desirable from practical considerations like geographical location, inaccessibility, or, more generally, costliness. Then, it is advantageous to replace design $p$ by a modified one, for example, $q$, which attaches minimal values $q(s)$ to the samples $s$ in $S_{0}$ keeping
$$\begin{gathered} E_{p}(t)=E_{q}(t) \\ E_{p}(t-Y)^{2}=E_{q}(t-Y)^{2} \end{gathered}$$
and even maintaining other desirable properties of $p$, if any. A resulting $q$ is called a controlled design and a corresponding scheme of selection is called a controlled sampling scheme. Quite a sizeable literature has grown around this problem of finding appropriate controlled designs. The methods of implementing such a scheme utilize theories of incomplete block designs and predominantly involve ingenious devices for reducing the size of the support of possible samples, demanding trial and error. But RAO and NIGAM (1990) have recently presented a simple solution by posing it as a linear programming problem and applying the well-known simplex algorithm to demonstrate their ability to work out suitable controlled schemes.
Taking $t$ as the HORVITZ-THOMPSON estimator $\bar{t}=\sum_{i \in s} Y_{i} / \pi_{i}$, they minimize the objective function $F=\sum_{s \in S_{0}} q(s)$ subject to the linear constraints
$$\begin{aligned} \sum_{s \ni i, j} q(s) &=\sum_{s \ni i, j} p(s)=\pi_{i j} \\ q(s) & \geq 0 \text { for all } s \end{aligned}$$
where the $\pi_{i j}$'s are known quantities in terms of the original uncontrolled design $p$.


## ELEMENTARY DEFINITIONS

Let $N$ be a known number of units, e.g., godowns, hospitals, or income earners, each assignable identifying labels $1,2, \ldots, N$ and bearing values, respectively, $Y_{1}, Y_{2}, \ldots, Y_{N}$ of a real-valued variable $y$, which are initially unknown to an investigator who intends to estimate the total
$$Y=\sum_{1}^{N} Y_{i}$$
or the mean $\bar{Y}=Y / N$.
We call the sequence $U=(1, \ldots, N)$ of labels a population. Selecting units leads to a sequence $s=\left(i_{1}, \ldots, i_{n}\right)$, which is called a sample. Here $i_{1}, \ldots, i_{n}$ are elements of $U$, not necessarily distinct from one another, but the order of their appearance is maintained. We refer to $n=n(s)$ as the size of $s$, while the effective sample size $v(s)=|s|$ is the cardinality of $s$, i.e., the number of distinct units in $s$. Once a specific sample $s$ is chosen we suppose it is possible to ascertain the values $Y_{i_{1}}, \ldots, Y_{i_{n}}$ of $y$ associated with the respective units of $s$. Then
$$d=\left[\left(i_{1}, Y_{i_{1}}\right), \ldots,\left(i_{n}, Y_{i_{n}}\right)\right] \quad \text{or briefly} \quad d=\left[\left(i, Y_{i}\right) \mid i \in s\right]$$
constitutes the survey data.
An estimator $t$ is a real-valued function $t(d)$, which is free of $Y_{i}$ for $i \notin s$ but may involve $Y_{i}$ for $i \in s$. Sometimes we will express $t(d)$ alternatively by $t(s, Y)$, where $Y=\left(Y_{1}, \ldots, Y_{N}\right)^{\prime}$.
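These definitions map directly onto simple data structures. In the sketch below the labels, the observed values, and the particular estimator (the expansion of the mean over distinct sampled units) are hypothetical; they only illustrate the difference between the sample size $n(s)$, the effective sample size, and the survey data $d$.

```python
N = 10                                   # known population size
sample = (3, 1, 3, 5)                    # ordered sample s; unit 3 appears twice
observed = {1: 12.0, 3: 7.5, 5: 9.0}     # hypothetical Y_i for the sampled units

n = len(sample)                          # n(s): size of s, repeats included
effective_size = len(set(sample))        # number of distinct units in s
d = [(i, observed[i]) for i in sample]   # survey data d = [(i, Y_i) | i in s]

def expansion_estimator(d, N):
    """An example t(d): N times the mean of Y over the distinct sampled units.

    It uses only values Y_i with i in s, as required of an estimator.
    """
    distinct = dict(d)                   # drops repeated labels
    return N * sum(distinct.values()) / len(distinct)

t = expansion_estimator(d, N)
print(n, effective_size, t)
```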

## DESIGN-BASED INFERENCE

Let $\Sigma_{1}$ be the sum over samples for which $|t(s, Y)-Y| \geq k>0$ and let $\Sigma_{2}$ be the sum over samples for which $|t(s, Y)-Y|<k$ for a fixed $Y$. Then from
$$\begin{aligned} M_{p}(t) &=\Sigma_{1} p(s)(t-Y)^{2}+\Sigma_{2} p(s)(t-Y)^{2} \\ & \geq k^{2} \operatorname{Prob}[|t(s, Y)-Y| \geq k] \end{aligned}$$
one derives the Chebyshev inequality:
$$\operatorname{Prob}[|t(s, Y)-Y| \geq k] \leq \frac{M_{p}(t)}{k^{2}} .$$
Hence
$\operatorname{Prob}[t-k \leq Y \leq t+k] \geq 1-\frac{M_{p}(t)}{k^{2}}=1-\frac{1}{k^{2}}\left[V_{p}(t)+B_{p}^{2}(t)\right]$ where $B_{p}(t)=E_{p}(t)-Y$ is the bias of $t$. Writing $\sigma_{p}(t)=$ $\sqrt{V_{p}(t)}$ for the standard error of $t$ and taking $k=3 \sigma_{p}(t)$, it follows that, whatever $Y$ may be, the random interval $t \pm 3 \sigma_{p}(t)$ covers the unknown $Y$ with a probability not less than
$$\frac{8}{9}-\frac{1}{9} \frac{B_{p}^{2}(t)}{V_{p}(t)} .$$
So, to keep this probability high and the length of this covering interval small it is desirable that both $\left|B_{p}(t)\right|$ and $\sigma_{p}(t)$ be small, leading to a small $M_{p}(t)$ as well.
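The effect of bias on the guaranteed coverage can be made concrete with a small numerical sketch of the bound $1-M_p(t)/k^2$ with $k=3\sigma_p(t)$. The function name and the numbers are illustrative assumptions.

```python
def coverage_lower_bound(variance, bias, k_sigmas=3.0):
    """Chebyshev lower bound on Prob[t - k <= Y <= t + k], k = k_sigmas * sigma_p(t)."""
    mse = variance + bias ** 2            # M_p(t) = V_p(t) + B_p(t)^2
    k = k_sigmas * variance ** 0.5
    return 1 - mse / k ** 2

unbiased = coverage_lower_bound(variance=4.0, bias=0.0)
biased = coverage_lower_bound(variance=4.0, bias=2.0)
print(unbiased, biased)
```

With zero bias the bound is $8/9$; a bias as large as one standard error lowers it to $7/9$, in line with the expression $\frac{8}{9}-\frac{1}{9} B_p^2(t)/V_p(t)$. The bound is conservative: it holds for any $Y$ and any design, so actual coverage is often higher.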


## Probability Proportional to Size Without Replacement Sampling

In the probability proportional to size WOR (PPSWOR) sampling scheme, the probability of selecting $i_{1}$ at the first draw is $p_{i_{1}}(1)=p_{i_{1}}$. The probability of selecting $i_{2}$ at the second draw is $p_{i_{2}}(2)=\frac{p_{i_{2}}}{1-p_{i_{1}}}$ if a different unit $i_{1}\left(i_{2} \neq i_{1}\right)$ was selected at the first draw, and $p_{i_{2}}(2)=0$ if the unit $i_{2}$ itself was selected at the first draw, i.e., $i_{2}=i_{1}$. In general, the probability of selecting $i_{k}$ at the $k$th draw is $p_{i_{k}}(k)=\frac{p_{i_{k}}}{1-p_{i_{1}}-p_{i_{2}}-\cdots-p_{i_{k-1}}}$ if the units $i_{1}, i_{2}, \ldots, i_{k-1}$, all different from $i_{k}$, were selected in the first $k-1$ draws, and $p_{i_{k}}(k)=0$ if the unit $i_{k}$ was selected in any of the first $k-1$ draws, for $k=2, \ldots, n$ and $i_{k}=1, \ldots, N$. So, for a PPSWOR sampling scheme, the probability of selecting $i_{1}$ at the first draw, $i_{2}$ at the second draw, and so on up to $i_{n}$ at the $n$th draw is
$$p\left(i_{1}, \ldots, i_{n}\right)=p_{i_{1}} \frac{p_{i_{2}}}{1-p_{i_{1}}} \cdots \frac{p_{i_{k}}}{1-p_{i_{1}}-\cdots-p_{i_{k-1}}} \cdots \frac{p_{i_{n}}}{1-p_{i_{1}}-\cdots-p_{i_{n-1}}} \quad \text { for } 1 \leq i_{1} \neq i_{2} \neq \cdots \neq i_{n} \leq N .$$
It should be noted that PPSWOR reduces to SRSWOR sampling scheme if $p_{i}=1 / N$ for $i=1, \ldots, N$.
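The draw-by-draw rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `ppswor_sample` and `ppswor_probability` are hypothetical, units are indexed from 0, and the normed size measures `p` are assumed to be positive and to sum to 1.

```python
import random


def ppswor_sample(p, n, rng=None):
    """Draw an ordered PPSWOR sample of n distinct unit indices (0-based).

    p is a list of normed size measures, assumed positive and summing to 1.
    """
    rng = rng or random.Random()
    if not 0 < n <= len(p):
        raise ValueError("n must be between 1 and the population size")
    remaining = list(range(len(p)))
    sample = []
    for _ in range(n):
        # random.choices normalises the weights, which is equivalent to
        # dividing each p_i by 1 minus the sum of the p's already selected.
        weights = [p[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        sample.append(pick)
        remaining.remove(pick)
    return sample


def ppswor_probability(p, ordered_sample):
    """Probability of drawing the given units in the given order."""
    prob = 1.0
    used = 0.0  # running sum p_{i_1} + ... + p_{i_{k-1}}
    for i in ordered_sample:
        prob *= p[i] / (1.0 - used)
        used += p[i]
    return prob


if __name__ == "__main__":
    p = [0.1, 0.2, 0.3, 0.4]
    print(ppswor_sample(p, 2, random.Random(2024)))
    print(ppswor_probability(p, [0, 1]))
```

With equal size measures $p_{i}=1/N$, `ppswor_probability` returns $1/\{N(N-1) \cdots(N-n+1)\}$ for every ordered sample, which is the SRSWOR special case noted above.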

## Hanurav’s Algorithm

Hanurav (1966) established a correspondence between a sampling design and a sampling scheme. He proved that any sampling scheme results in a sampling design. Similarly, for a given sampling design, one can construct at least one sampling scheme, which can implement the sampling design. In fact, Hanurav proposed the most general sampling scheme, known as Hanurav’s algorithm, using which one can derive various types of sampling schemes or sampling designs. Henceforth, we will not differentiate between the terms “sampling design” and “sampling scheme”.

Let $n_{0}$ denote the maximum sample size that might be required from a sampling scheme. Then, Hanurav’s (1966) algorithm is defined as follows:
$$\mathscr{A}=\mathscr{A}\left\{q_{1}(i) ; q_{2}(s) ; q_{3}(s, i)\right\}$$
where
(i) $0 \leq q_{1}(i) \leq 1$ for $i=1, \ldots, N$, and $\sum_{i=1}^{N} q_{1}(i)=1$;
(ii) $0 \leq q_{2}(s) \leq 1$ for any sample $s \in \mathscr{S}$, where $\mathscr{S}$ is the set of all possible samples;
(iii) $q_{3}(s, i)$ is defined when $q_{2}(s)>0$, subject to $0 \leq q_{3}(s, i) \leq 1$ for $i=1, \ldots, N$ and
$$\sum_{i=1}^{N} q_{3}(s, i)=1.$$
Samples are selected using the following steps:
Step 1: At the first draw a unit $i_{1}$ is selected with probability $q_{1}\left(i_{1}\right)$; $i_{1}=1, \ldots, N$

Step 2: In this step, we decide whether the sampling procedure will be terminated or continued. Let $s_{(1)}=i_{1}$ be the unit selected in the first draw. A Bernoulli trial is performed with success probability $q_{2}\left(s_{(1)}\right)$. If the trial results in a failure, the sampling procedure is terminated and the selected sample is $s_{(1)}=i_{1}$. On the other hand, if the trial results in a success, we go to Step 3.
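A rough Python sketch of the procedure may help fix ideas. Only Steps 1 and 2 are given in the text above, so the continuation is an assumption made here for illustration: after a success, the next unit is drawn with probabilities $q_{3}(s, i)$, the Bernoulli trial is repeated with $q_{2}$ evaluated at the enlarged sample, and the loop also stops once $n_{0}$ units have been drawn. The function name `hanurav_sample` and the representation of a sample as a tuple are hypothetical.

```python
import random


def hanurav_sample(N, q1, q2, q3, n0, rng=None):
    """Sketch of a Hanurav-type scheme on units 1..N.

    q1(i)    : probability of selecting unit i at the first draw
    q2(s)    : probability of continuing after the current sample s (a tuple)
    q3(s, i) : probability of selecting unit i next, given the current sample s
    n0       : maximum sample size; used here as a hard cap (an assumption)
    """
    rng = rng or random.Random()
    units = list(range(1, N + 1))

    # Step 1: first draw with probabilities q1(i).
    first = rng.choices(units, weights=[q1(i) for i in units], k=1)[0]
    s = (first,)

    # Step 2: Bernoulli trial with success probability q2(s).
    # Assumed continuation: on success, draw the next unit using q3(s, i),
    # then repeat the trial with the enlarged sample.
    while len(s) < n0 and rng.random() < q2(s):
        nxt = rng.choices(units, weights=[q3(s, i) for i in units], k=1)[0]
        s = s + (nxt,)
    return s


if __name__ == "__main__":
    # Illustrative choice of q1, q2, q3 that yields SRSWOR of size n.
    N, n = 6, 3
    q1 = lambda i: 1.0 / N
    q2 = lambda s: 1.0 if len(s) < n else 0.0
    q3 = lambda s, i: 0.0 if i in s else 1.0 / (N - len(s))
    print(hanurav_sample(N, q1, q2, q3, n0=n, rng=random.Random(7)))
```

Other choices of the three functions give other schemes; for example, letting `q3` ignore `s` allows repeated units, as in with-replacement sampling.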


## Sampling and Nonsampling Errors

Obviously, using the complete enumeration method, we get the correct value of the parameter, provided all the $y$-values obtained for the population are correct. This would mean that there is no nonresponse, i.e., a response is obtained from each unit, and there is no measurement error in measuring the $y$-values. However, in practice, at least for a large-scale survey, nonresponse is unavoidable, and the $y$-values are also subject to error because respondents report untrue values, especially when the $y$-values relate to confidential characteristics such as income and age. The error in a survey that originates from nonresponse or incorrect measurement of $y$-values is termed the nonsampling error. Nonsampling errors increase with the sample size.
From a sample survey, we cannot get the true value of the parameter because we survey only a sample, which is just a part of the population. The error committed by making inference from only a part of the population is known as the sampling error. In complete enumeration, sampling error is absent, but complete enumeration is more subject to nonsampling error than a sample survey. When the population is large, complete enumeration is often not feasible, as it is very expensive, time-consuming, and requires many trained investigators. The advantages of sample surveys over complete enumeration were advocated by Mahalanobis (1946), Cochran (1977), and Murthy (1977), to name a few.

## Cumulative Total Method

Here we label all possible samples in $\mathscr{S}$ as $s_{1}, \ldots, s_{i}, \ldots, s_{M}$, where $M$ is the total number of samples in $\mathscr{S}$. Then we calculate the cumulative totals $T_{i}=p\left(s_{1}\right)+\cdots+p\left(s_{i}\right)$ for $i=1, \ldots, M$ and select a random number $R$ (say) from a uniform distribution on $(0,1)$. This can be done by choosing a five-digit random number and placing a decimal point before it. The sample $s_{k}$ is selected if $T_{k-1}<R \leq T_{k}$, for $k=1, \ldots, M$ with $T_{0}=0$.

Example 1.4.1
Let $U=(1,2,3,4) ; s_{1}=(1,1,2), s_{2}=(1,2,2), s_{3}=(3,2), s_{4}=(4)$; $p\left(s_{1}\right)=0.25, p\left(s_{2}\right)=0.30, p\left(s_{3}\right)=0.20$, and $p\left(s_{4}\right)=0.25$.
$$\begin{array}{lllll}s & s_{1} & s_{2} & s_{3} & s_{4} \\ p(s) & 0.25 & 0.30 & 0.20 & 0.25 \\ T_{k} & 0.25 & 0.55 & 0.75 & 1\end{array}$$
Let a random number $R=0.34802$ be drawn from a uniform distribution on $(0,1)$. The sample $s_{2}$ is selected, as $T_{1}=0.25<R=0.34802 \leq T_{2}=0.55$.

The cumulative total method described above, however, can rarely be used in practice because it requires listing all the possible samples that have positive probabilities. For example, suppose we need to select a sample of size 15 from a population of size $N=30$ following a sampling design in which all possible samples of size $n=15$ have positive probabilities. We would then need to list $M=\binom{30}{15}$ possible samples, which is obviously a huge number.
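The selection rule and the size of the listing problem can both be checked with a short Python sketch. The function name `cumulative_total_select` is hypothetical; the data are those of Example 1.4.1, and `math.comb` (Python 3.8 or later) is used for the binomial coefficient.

```python
import bisect
import math
from itertools import accumulate


def cumulative_total_select(probs, r):
    """Return the 1-based index k such that T_{k-1} < r <= T_k."""
    if not 0.0 < r <= 1.0:
        raise ValueError("r must lie in (0, 1]")
    totals = list(accumulate(probs))  # T_1, ..., T_M
    # bisect_left returns the position of the first T_k with T_k >= r.
    k = bisect.bisect_left(totals, r)
    # Guard against rounding that leaves the last total slightly below 1.
    return min(k, len(probs) - 1) + 1


if __name__ == "__main__":
    samples = [(1, 1, 2), (1, 2, 2), (3, 2), (4,)]
    probs = [0.25, 0.30, 0.20, 0.25]
    k = cumulative_total_select(probs, 0.34802)
    print("selected sample: s_%d = %s" % (k, samples[k - 1]))
    # Number of samples that would have to be listed for N = 30, n = 15.
    print(math.comb(30, 15))
```

For $R=0.34802$ the function returns $k=2$, matching the example, and $\binom{30}{15}=155{,}117{,}520$ shows why listing every sample is impractical.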


## Preliminaries and Basics of Probability Sampling

Government organizations, researchers, sociologists, and businesses often conduct surveys to answer specific questions that cannot be answered through laboratory experiments or through economic, mathematical, or statistical formulation alone. For example, knowledge of the proportion of unemployed people, the proportion below the poverty line, and the extent of child labor in a certain locality is very important for formulating sound economic plans. To answer such questions, we very often conduct surveys on sections of the people of the locality. Drawing inference about an aggregate (population) on the basis of a sample, a part of the population, is a natural instinct of human beings. Surveys should be conducted in such a way that their results can be interpreted objectively in terms of probability and the inference about the population has a valid statistical basis. To achieve valid statistical inference, one needs to select samples using a suitable sampling procedure, and the collected data should be analyzed appropriately. In this book, we discuss various sample selection procedures, methods of data collection and data analysis, and their applications under various circumstances. The statistical theory behind such procedures is also studied in detail.

In this chapter we introduce some of the basic definitions and terminologies in survey sampling such as population, unit, sample, sampling designs, and sampling schemes. Various methods of sample selection as well as Hanurav’s algorithm which gives the correspondence between a sampling design and a sampling scheme have also been discussed.

## Parameter and Parameter Space

For a given population $U$, we may be interested in studying certain characteristics of it. Such characteristics are known as study variables. When considering a population of students in a certain class, we may be interested in their age, height, racial group, economic condition, marks in different subjects, and so forth. Each of the variables under study is called a study variable, and it will be denoted by $y$. Let $y_{i}$ be the value of a study variable $y$ for the $i$th unit of the population $U$, which is generally not known before the survey. The $N$-dimensional vector $\mathbf{y}=\left(y_{1}, \ldots, y_{i}, \ldots, y_{N}\right)$ is known as a parameter of the population $U$ with respect to the characteristic $y$. The set of all possible values of the vector $\mathbf{y}$ is the $N$-dimensional Euclidean space $R^{N}=\left\{-\infty<y_{1}<\infty, \ldots,-\infty<y_{i}<\infty, \ldots,-\infty<y_{N}<\infty\right\}$, and it is known as the parameter space. In most cases we are not interested in the parameter $\mathbf{y}$ itself but in a certain parametric function of $\mathbf{y}$, such as $Y=\sum_{i=1}^{N} y_{i}=$ population total, $\bar{Y}=\frac{1}{N} \sum_{i=1}^{N} y_{i}=$ population mean, $S_{Y}^{2}=\frac{1}{N-1} \sum_{i=1}^{N}\left(y_{i}-\bar{Y}\right)^{2}=$ population variance, $C_{y}=S_{Y} / \bar{Y}=$ population coefficient of variation, and so forth.
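As a small numerical illustration, the four parametric functions listed above can be computed as follows. The function name `population_summary` and the toy values of $\mathbf{y}$ are made up for this sketch; note the divisor $N-1$ in the population variance, as in the definition above.

```python
import math


def population_summary(y):
    """Return total Y, mean Ybar, variance S_Y^2 (divisor N - 1) and CV C_y."""
    N = len(y)
    if N < 2:
        raise ValueError("need at least two units to compute the variance")
    total = sum(y)
    mean = total / N
    variance = sum((v - mean) ** 2 for v in y) / (N - 1)
    cv = math.sqrt(variance) / mean  # undefined when the mean is zero
    return total, mean, variance, cv


if __name__ == "__main__":
    y = [2, 4, 4, 6]  # hypothetical y-values for a population of N = 4 units
    print(population_summary(y))
```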


## Balancing for Polynomial Models

We return to the model $\mathcal{M}_{10}^{\prime}$ of Section 4.1.2 and consider an extension $\mathcal{M}_{k}$ defined as follows:
$$\begin{aligned} Y_{i} &=\sum_{j=0}^{k} \beta_{j} X_{i}^{j}+\varepsilon_{i}, \\ E_{m}\left(\varepsilon_{i}\right) &=0, \quad V_{m}\left(\varepsilon_{i}\right)=\sigma^{2}, \quad C_{m}\left(\varepsilon_{i}, \varepsilon_{j}\right)=0 \text { for } i \neq j, \end{aligned}$$
where $i, j=1,2, \ldots, N$. By generalizing the developments of Section 4.1.2, we derive the following.

RESULT 4.2 Let $\mathcal{M}_{k}$ be given. Then, the MSE of the BLU predictor $t_{o}$ for $Y$ is minimum for a sample $s$ of size $n$ if
$$\frac{1}{n} \sum_{s} X_{i}^{j}=\frac{1}{N} \sum_{1}^{N} X_{i}^{j} \text { for } j=0,1, \ldots, k \text {. }$$
If these equalities hold we have
$$t_{o}(s, Y)=N \bar{y} \text {. }$$
A sample satisfying the equalities in Result 4.2 is said to be balanced up to order $k$.

Now, assume the true model $\mathcal{M}_{k^{\prime}}$ agrees with a statistician’s working model $\mathcal{M}_{k}$ in all respects except that
$$E_{m}\left(Y_{i}\right)=\sum_{j=0}^{k^{\prime}} \beta_{j} X_{i}^{j}$$
with $k^{\prime}>k$. The statistician will use $t_{o}$ instead of $t_{o}^{\prime}$, the BLU predictor for $Y$ on the basis of $\mathcal{M}_{k^{\prime}}$. However, if he selects a sample that is balanced up to order $k^{\prime}$, then
$$t_{o}^{\prime}(s, Y)=t_{o}(s, Y)=N \bar{y}$$
and his error does not cause losses.
It is, of course, too ambitious to realize the balancing conditions exactly, even if $k^{\prime}$ is of moderate size, for example, $k^{\prime}=4$ or 5. But if $n$ is large, the considerations outlined in Result 4.1 apply again for SRSWOR, or for SRSWOR carried out independently within strata after internally homogeneous strata have been constructed.
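The balancing conditions of Result 4.2 are easy to check numerically for a given sample. The sketch below compares sample and population moments of $X$ up to a chosen order; the function name `is_balanced`, the relative tolerance, and the toy population are assumptions made for illustration only.

```python
def is_balanced(x_population, sample_idx, order, tol=1e-9):
    """Check (1/n) sum_s X_i^j == (1/N) sum_1^N X_i^j for j = 0, ..., order.

    sample_idx holds 0-based indices into x_population.
    """
    N = len(x_population)
    n = len(sample_idx)
    for j in range(order + 1):
        pop_moment = sum(x ** j for x in x_population) / N
        sample_moment = sum(x_population[i] ** j for i in sample_idx) / n
        # Relative tolerance, because exact equality is rarely attainable.
        if abs(sample_moment - pop_moment) > tol * max(1.0, abs(pop_moment)):
            return False
    return True


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]   # hypothetical auxiliary values
    s = [0, 2, 4]         # units with x = 1, 3, 5
    print(is_balanced(x, s, order=1))
    print(is_balanced(x, s, order=2))
```

In this toy case the sample mean of $X$ equals the population mean (both are 3), but the second moments differ (35/3 against 11), so the sample is balanced up to order 1 only. In practice one would loosen `tol` to accept approximately balanced samples.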

## Linear Models in Matrix Notation

Suppose $x_{1}, x_{2}, \ldots, x_{k}$ are real variables, called auxiliary or explanatory variables, each closely related to the variable of interest $y$. Let
$$x_{i}=\left(X_{i 1}, X_{i 2}, \ldots, X_{i k}\right)^{\prime}$$
be the vector of explanatory variables for unit $i$ and assume the linear model
$$Y_{i}=x_{i}^{\prime} \beta+\varepsilon_{i}$$
for $i=1,2, \ldots, N$. Here
$$\beta=\left(\beta_{1}, \beta_{2}, \ldots, \beta_{k}\right)^{\prime}$$
is the vector of (unknown) regression parameters; $\varepsilon_{1}, \varepsilon_{2}, \ldots, \varepsilon_{N}$ are random variables satisfying
$$\begin{aligned} E_{m}\left(\varepsilon_{i}\right) &=0, \\ V_{m}\left(\varepsilon_{i}\right) &=v_{i i}, \\ C_{m}\left(\varepsilon_{i}, \varepsilon_{j}\right) &=v_{i j}, \quad i \neq j, \end{aligned}$$
where $E_{m}, V_{m}, C_{m}$ are operators for expectation, variance, and covariance with respect to the model distribution; and the matrix $V=\left(v_{i j}\right)$ is assumed to be known up to a constant $\sigma^{2}$.
To have a more compact notation, define
$$\begin{aligned} Y &=\left(Y_{1}, Y_{2}, \ldots, Y_{N}\right)^{\prime}, \\ X &=\left(x_{1}, x_{2}, \ldots, x_{N}\right)^{\prime}=\left(X_{i j}\right), \\ \varepsilon &=\left(\varepsilon_{1}, \varepsilon_{2}, \ldots, \varepsilon_{N}\right)^{\prime} \end{aligned}$$
and write the linear model as
$$Y=X \beta+\varepsilon$$
where
$$E_{m}(\varepsilon)=0, \quad V_{m}(\varepsilon)=V.$$
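To make the notation concrete, the following Python sketch generates one realization of $Y=X \beta+\varepsilon$. It is an illustration under assumptions that go beyond the model stated above: the errors are taken to be independent and normally distributed, so that $V$ is diagonal with entries $\sigma^{2} v_{i i}$, and the function name `simulate_linear_model` and all numerical values are hypothetical.

```python
import random


def simulate_linear_model(X, beta, v_diag, sigma2=1.0, rng=None):
    """Return Y with Y_i = x_i' beta + eps_i for each row x_i of X.

    Assumption for this sketch only: the eps_i are independent
    N(0, sigma2 * v_ii), i.e. V is diagonal. The model in the text
    only specifies first and second moments.
    """
    rng = rng or random.Random()
    Y = []
    for x_i, v_ii in zip(X, v_diag):
        mean_i = sum(x * b for x, b in zip(x_i, beta))  # x_i' beta
        eps_i = rng.gauss(0.0, (sigma2 * v_ii) ** 0.5)
        Y.append(mean_i + eps_i)
    return Y


if __name__ == "__main__":
    # N = 3 units, k = 2 explanatory variables (first column is a constant).
    X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]
    beta = [0.5, 2.0]
    print(simulate_linear_model(X, beta, v_diag=[1.0, 1.0, 4.0],
                                sigma2=0.25, rng=random.Random(3)))
```

Setting every $v_{i i}$ to zero switches the errors off and returns $X \beta$ exactly, which is a convenient sanity check.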

## Robustness Against Model Failures

Consider the general linear model described in Section 4.1.4. TAM (1986) has shown that a necessary and sufficient condition for
$$T^{\prime} Y_{s}=\sum_{s} T_{i} Y_{i}$$
to be BLU for $Y=1^{\prime} Y$ is that
(a) $T^{\prime} X_{s}=1^{\prime} X$
(b) $V_{s s} T-K 1 \in M\left(X_{s}\right)$, where $K=\left(V_{s s}, V_{s r}\right)$
and $M\left(X_{s}\right)$ is the column space of $X_{s}$.

In case $V_{s r}=0$ these conditions reduce to (a) and
(b)$^{\prime}$ $V_{s s}\left(T-1_{s}\right) \in M\left(X_{s}\right)$
as given earlier by PEREIRA and RODRIGUES (1983).
By TAM’s (1986) results one may deduce the following.
If the true model is as above, $\mathcal{M}$, but one employs the best predictor postulating a wrong model, say $\mathcal{M}^{*}$, using $X^{*}$ instead of $X$ throughout, where
$$X=\left(X^{*}, \tilde{X}\right),$$
then the best predictor under $\mathcal{M}^{*}$ is still best under $\mathcal{M}$ if and only if
$$T^{\prime} \tilde{X}_{s}=1^{\prime} \tilde{X}$$
using obvious notations. This evidently is a condition that the predictor should remain model-unbiased under the correct model $\mathcal{M}$. Thus, by choosing a sample that meets this stipulation, one may achieve robustness. But, in practice, $\tilde{X}$ will be unknown and one cannot realize this robustness condition at will, although for large samples the condition may hold approximately. In this situation, it is advisable to adopt suitable unequal probability sampling designs that assign higher selection probabilities to samples for which this condition should hold approximately, provided one can guess effectively the nature of the variables that are omitted but influential in explaining the variability in the $y$ values. If a sample is thus rightly chosen, one may preserve optimality even under a deficient model as above.

On the other hand, if one employs the best predictor using $W^{*}$ instead of $X$ when $W^{*}=(X, W)$, then this predictor continues to remain best if and only if condition (b) above still holds. But this condition is too restrictive, demanding correct specification of the nature of $V$, which is likely to be too elusive in practice. ROYALL and HERSON (1973), TALLIS (1978), SCOTT, BREWER and HO (1978), PEREIRA and RODRIGUES (1983), RODRIGUES (1984), ROYALL and PFEFFERMANN (1982), and PFEFFERMANN (1984) have derived results relevant to this context of robust prediction.


