PYTHON101

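Python is a high-level, interpreted, general-purpose programming language. Its design philosophy emphasizes code readability through the use of significant indentation.

Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented, and functional programming. Because of its comprehensive standard library, it is often described as a "batteries included" language.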

• Statistical Inference
• Statistical Computing
• (Generalized) Linear Models
• Statistical Machine Learning
• Longitudinal Data Analysis
• Foundations of Data Science

HTTP objects filter

As we can see, filters give us good traceability of communications and are an ideal complement for analyzing a multitude of attacks. An example of this is the http.content_type filter, which lets us extract the different data flows that take place in an HTTP connection (text/html, application/zip, audio/mpeg, image/gif). This is very useful for locating malware, exploits, or other types of attacks that are embedded in this protocol.

Wireshark supports two types of filters, capture filters and display filters:

• Capture filters are set before capturing starts and restrict which packets are recorded; only packets that meet the requirements indicated in the filter are captured
• Display filters apply a filter criterion to the packets that have already been captured and control which of them are shown in the main screen of Wireshark

When you apply a display filter on the Wireshark main screen, only the traffic that matches the filter appears. We can also use a display filter to filter the contents of a capture stored in a pcap file.

We can use the pyshark library to analyze network traffic in Python, since everything Wireshark decodes in each packet is made available as a variable. The source code of the tool is in its GitHub repository: https://github.com/KimiNewt/pyshark.

The latest version of the library is available in the PyPI repository at https://pypi.org/project/pyshark, and we can install it with the pip install pyshark command.
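As a minimal sketch of reading a saved capture with pyshark and a display filter, the following tallies HTTP content types. It assumes pyshark and TShark are installed; the helper names and any file name passed in are hypothetical, and the attribute access packet.http.content_type relies on pyshark exposing decoded fields as attributes.

```python
from collections import Counter


def count_content_types(packets):
    """Tally the HTTP Content-Type values seen in an iterable of packets.

    Each packet is expected to behave like a pyshark packet: an `http`
    attribute whose `content_type` field is present on some HTTP messages.
    """
    counts = Counter()
    for packet in packets:
        http_layer = getattr(packet, "http", None)
        content_type = getattr(http_layer, "content_type", None)
        if content_type is not None:
            counts[str(content_type)] += 1
    return counts


def content_types_in_pcap(pcap_path):
    """Open a saved capture with pyshark and tally its content types."""
    # Imported lazily so the helper above works without pyshark installed.
    import pyshark

    capture = pyshark.FileCapture(pcap_path, display_filter="http.content_type")
    try:
        return count_content_types(capture)
    finally:
        capture.close()
```

Calling content_types_in_pcap with the path of a pcap file would return a Counter keyed by content type, which is one way to spot unexpected downloads such as application/zip in otherwise ordinary web traffic.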

FileCapture and LiveCapture in pyshark

As we saw previously, you can use the FileCapture class to open a previously saved trace file. You can also use pyshark to sniff from an interface in real time with the LiveCapture class.
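A live capture might look like the sketch below. It assumes pyshark and TShark are installed and that the process has permission to capture; the interface name is supplied by the caller, and the helper names are hypothetical. The attributes sniff_time, highest_layer, and length are ones pyshark sets on each packet.

```python
def describe_packet(packet):
    """Build a one-line summary from attributes pyshark sets on a packet."""
    return f"{packet.sniff_time} {packet.highest_layer} {packet.length} bytes"


def sniff_live(interface, packet_count=5):
    """Capture `packet_count` packets from `interface` and summarize them."""
    # Imported lazily: needs pyshark, TShark, and capture privileges.
    import pyshark

    capture = pyshark.LiveCapture(interface=interface)
    try:
        # sniff_continuously() yields packets as they arrive on the wire.
        return [
            describe_packet(packet)
            for packet in capture.sniff_continuously(packet_count=packet_count)
        ]
    finally:
        capture.close()
```

The interface name depends on the operating system (for example, a name such as eth0 on many Linux systems), so it is left as a parameter rather than hard-coded.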

Once a capture object is created, either with LiveCapture or FileCapture, several methods and attributes are available at both the capture and packet level. The power of pyshark is that it has access to all of the packet decoders that are built into TShark.
Now, let's see which methods the returned capture object provides.
To check this, we can call the built-in dir() function on the capture object.

Both classes offer similar parameters that affect the packets returned in the capture object. For example, we can iterate through the packets and apply a function to each. The most useful method here is apply_on_packets(), which is the main way to iterate through the packets, passing in a function to apply to each packet.
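The callback style described above can be sketched as follows, assuming pyshark and TShark are installed. The helper names are hypothetical; apply_on_packets() is called with a single function that receives each packet in turn.

```python
def make_protocol_counter():
    """Return a dict plus a callback that counts packets per top protocol."""
    counts = {}

    def on_packet(packet):
        # highest_layer is the name of the topmost decoded protocol.
        name = packet.highest_layer
        counts[name] = counts.get(name, 0) + 1

    return counts, on_packet


def count_protocols(pcap_path):
    """Run the counting callback over every packet in a saved capture."""
    # Imported lazily so the callback can be exercised without pyshark.
    import pyshark

    counts, on_packet = make_protocol_counter()
    capture = pyshark.FileCapture(pcap_path)
    try:
        capture.apply_on_packets(on_packet)
    finally:
        capture.close()
    return counts
```

Keeping the state in a closure means the callback needs no global variables, and the same function could be handed to a live capture object as well.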

This option makes capture file reading much faster, and with the built-in dir() function, we can check the attributes that are available on a packet object to obtain information about a specific packet.

In this chapter, we completed an introduction to TCP/IP and how machines communicate in a network. We learned about the main protocols of the network stack and the different types of addresses used for communicating in a network. We started with Python libraries for network programming, looked at the socket, urllib, and requests modules, and provided an example of how to interact with and obtain information from RFC documents. We also acquired the basic knowledge needed to perform network traffic analysis with Wireshark.


Capturing packets with Wireshark

To start capturing packets, you can click on the name of an interface in the list of interfaces. For example, if you want to capture traffic on your Ethernet network, double-click on the Ethernet connection interface.

As soon as you click on the name of the interface, you will see packets start to appear in real time. Wireshark captures every packet that is sent to or from your system. You will see a flood of data in the Wireshark dashboard. There are many ways to filter traffic:

• To filter traffic from a specific IP address, type ip.addr == xxx.xx.xx.xx in the Apply a display filter field
• To filter traffic for a specific protocol, such as TCP, UDP, SMTP, ARP, or DNS, type the protocol name in lowercase (for example, tcp) into the Apply a display filter field

We can use the Apply a display filter box to filter traffic by any IP address or protocol.
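When filters are generated from a script, for instance to pass to pyshark as a display filter, a small helper can compose them. This is only a sketch: the function name is hypothetical, and it covers just the two cases mentioned above, an address match and a protocol name.

```python
import ipaddress


def build_display_filter(ip=None, protocol=None):
    """Compose a Wireshark display filter from an address and/or a protocol."""
    parts = []
    if ip is not None:
        # ip_address() raises ValueError on malformed input.
        address = ipaddress.ip_address(ip)
        field = "ip.addr" if address.version == 4 else "ipv6.addr"
        parts.append(f"{field} == {address}")
    if protocol is not None:
        # Display filters refer to protocols by lowercase name, e.g. "tcp".
        parts.append(protocol.lower())
    return " and ".join(parts)
```

Validating the address first means a typo fails in Python with a clear error instead of producing a filter that Wireshark rejects.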

The graphical interface of Wireshark is mainly divided into the following sections:

• The menu bar, which gives access to all the actions available before and after a capture
• The main toolbar, which holds the most frequently used options in Wireshark
• The filter bar, where you can quickly apply filters to the current capture
• The packet list, which shows a summary of each packet captured by Wireshark
• The packet details panel, which shows detailed information about the packet selected in the packet list
• The packet bytes panel, which shows the bytes of the selected packet and highlights the bytes corresponding to the field selected in the packet details panel
• The status bar, which shows some information about the current state of Wireshark and the capture

## Network traffic in Wireshark

Network traffic, or network data, is the amount of data, carried as packets, that is moving across a network at any given point in time. The following is a classical formula for obtaining the traffic volume of a network: Traffic volume = traffic intensity (rate) × time.
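The formula can be applied directly to numbers taken from a capture. The sketch below is illustrative only; the function names and the choice of bits and seconds as units are assumptions, not something fixed by the formula.

```python
def traffic_volume(rate_bits_per_second, seconds):
    """Traffic volume = traffic rate * time, here in bits."""
    return rate_bits_per_second * seconds


def average_rate(packet_sizes_bytes, seconds):
    """Average rate in bits per second for packets seen over `seconds`."""
    return sum(packet_sizes_bytes) * 8 / seconds
```

For example, a steady 2 Mbit/s flow observed for one minute gives traffic_volume(2_000_000, 60), which is 120,000,000 bits.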

The Wireshark packet list shows the packets sent over the network along with information about each one, including the following:

• Time: The time at which the packet was captured
• Source: The address from which the packet originated
• Destination: The address to which the packet is sent
• Protocol: The protocol (set of rules) the packet followed during its journey, such as TCP, UDP, SMTP, or ARP
• Info: A summary of the information that the packet contains

The Wireshark website contains sample capture files that you can import into Wireshark and inspect packet by packet: https://wiki.wireshark.org/samplecaptures.

When you start capturing packets, Wireshark uses colors to identify the types of traffic that can occur, among which we can highlight green for TCP traffic, blue for DNS traffic, and black for traffic that has errors at the packet level.

To see exactly what the color codes mean, click View | Coloring rules. You can also customize and modify the coloring rules in this screen.

If you need to change the color of one of the options, just double-click it and choose the color you want.


Now, we are going to create the same script but, instead of using urllib or requests, we are going to use the socket module to work at a low level. For this, create a text file called RFC_download_socket.py.

The main difference here is that we are using the socket module instead of urllib or requests. Socket is Python's interface to the operating system's TCP and UDP implementation. We have to tell socket which transport layer protocol we want to use. We do this by using the socket.create_connection() convenience function, which always creates a TCP connection. To establish the connection, we use port 80, which is the standard port number for web services over HTTP.
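The original listing for RFC_download_socket.py is not reproduced here, so the following is a minimal sketch of such a script rather than the author's code. The function names are hypothetical, and the host and path are left as parameters because the text does not state which server the script contacts.

```python
import socket


def build_request(host, path):
    """Return a minimal HTTP/1.1 GET request encoded as ASCII bytes."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return request.encode("ascii")


def read_all(sock, chunk_size=4096):
    """Read from a connected socket until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # recv() returns b"" once the peer has closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


def download(host, path, port=80):
    """Fetch `path` from `host` over plain HTTP using a raw TCP socket."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_request(host, path))
        return read_all(sock)
```

The Connection: close header asks the server to close the socket after responding, which is what lets the read loop terminate. The returned bytes include the HTTP status line and headers as well as the body, and a server that only serves HTTPS will typically answer on port 80 with a redirect.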

Next, we deal with the network communication over the TCP connection. We send the entire request string to the server by using the sendall() call. The data that's sent through TCP must be in raw bytes, so we have to encode the request text as ASCII before sending it.

Then, we piece together the server's response as it arrives in the while loop. Bytes that are sent to us through a TCP socket are presented to our application as a continuous stream. So, like any stream of unknown length, we have to read it iteratively. The recv() call returns an empty bytes object after the server has sent all of its data and closed the connection. We use this as the condition for breaking out of the loop and printing the response.

This section will help you refresh the basics of Wireshark so that you can capture packets, filter them, and inspect them. You can use Wireshark to analyze the network traffic of a suspicious program, analyze the traffic flow in your network, or solve network problems. We will also review the pyshark module for capturing packets in Python.

## Introduction to Wireshark

Wireshark is a network packet analysis tool that captures packets in real time and displays them in a graphic interface. Wireshark includes filters, color coding, and other features that allow you to analyze network traffic and inspect packets individually.

Wireshark implements a wide range of filters that make it easier to define search criteria for the more than 1,000 protocols it currently supports. All of this happens through a simple and intuitive interface that allows each captured packet to be broken down into layers.
Because Wireshark understands the structure of these protocols, we can view the fields of each header and layer that make up a packet, which gives the network administrator a wide range of possibilities when analyzing traffic.
One of the advantages of Wireshark is that we can leave it capturing data on a network for as long as we want and then store the capture so that we can analyze it later. It works on several platforms, such as Windows, OS X, Linux, and Unix.
Wireshark is also considered a protocol analyzer or packet sniffer, which allows us to observe the messages that are exchanged between applications. For example, if we capture an HTTP message, the packet analyzer must know that this message is encapsulated in a TCP segment, which, in turn, is encapsulated in an IP packet, which, in turn, is encapsulated in an Ethernet frame.
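The encapsulation chain can be made concrete by peeling the layers off a frame by hand. The sketch below builds a toy Ethernet/IPv4/TCP frame and unpacks it with the struct module; the addresses and ports are made-up documentation values, checksums are left at zero, and header options are not handled, so it is far simpler than a real dissector.

```python
import struct


def build_demo_frame(payload):
    """Hand-assemble Ethernet + IPv4 + TCP headers around `payload`."""
    # TCP: ports, seq, ack, data offset (5 words), flags, window, checksum, urgent
    tcp = struct.pack("!HHIIBBHHH", 49152, 80, 0, 0, 5 << 4, 0x18, 65535, 0, 0)
    total_length = 20 + len(tcp) + len(payload)
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,          # version 4, header length 5 words
        0, total_length, 0, 0,
        64, 6, 0,              # TTL, protocol 6 = TCP, checksum (unset)
        bytes([192, 0, 2, 1]),
        bytes([198, 51, 100, 7]),
    )
    ethernet = struct.pack(
        "!6s6sH", b"\x02\x00\x00\x00\x00\x02", b"\x02\x00\x00\x00\x00\x01", 0x0800
    )
    return ethernet + ip + tcp + payload


def decapsulate(frame):
    """Peel the layers off a frame: Ethernet -> IPv4 -> TCP -> payload."""
    _dst_mac, _src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 frame")
    ip_packet = frame[14:]
    ip_header_len = (ip_packet[0] & 0x0F) * 4
    total_length = struct.unpack("!H", ip_packet[2:4])[0]
    if ip_packet[9] != 6:
        raise ValueError("not a TCP segment")
    src_ip = ".".join(str(b) for b in ip_packet[12:16])
    dst_ip = ".".join(str(b) for b in ip_packet[16:20])
    segment = ip_packet[ip_header_len:total_length]
    src_port, dst_port = struct.unpack("!HH", segment[:4])
    tcp_header_len = (segment[12] >> 4) * 4
    return {
        "src": f"{src_ip}:{src_port}",
        "dst": f"{dst_ip}:{dst_port}",
        "payload": segment[tcp_header_len:],
    }
```

Each step uses a field from the outer header (the EtherType, then the IP protocol number) to decide how to interpret what follows, which is the knowledge the text says a packet analyzer must have.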

Wireshark is composed mainly of two elements: a packet capture library, which receives a copy of each data link frame that is either sent or received, and a packet analyzer, which shows the fields corresponding to each of the captured packets. To do this, the packet analyzer must know about the protocols that it is analyzing so that the information that’s shown is consistent.


The Predictive Rank Distribution

Ranks lie at the heart of JPS, and indeed all of RSS. Focusing on a single set, we can describe the ranks of the $H$ units in terms of a matrix $P$. Each row of the matrix corresponds to a unit in the set, each column to the rank of the unit in the set. A perfectly ranked sample corresponds to a permutation matrix where the row for unit $h$, if having rank $j$, is the $H$-vector with a 1 in position $j$ and 0 in all other positions. We use the notation $p_h$ to represent row $h$ of the matrix $P$.
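As an illustration of the matrix $P$ for a single set, the sketch below builds the permutation matrix from a list of ranking values (for example, covariate values). The function name is hypothetical, and rank 1 is taken to be the smallest value, which is an assumption about the ordering convention.

```python
def rank_matrix(values):
    """Return the H x H permutation matrix P for one set of H units.

    Row h has a 1 in the column of unit h's rank (rank 1 = smallest value)
    and 0 elsewhere. Ties are rejected because they do not give a
    permutation matrix.
    """
    if len(set(values)) != len(values):
        raise ValueError("ties present: ranks are not unique")
    size = len(values)
    # order[j] is the index of the unit holding rank j + 1.
    order = sorted(range(size), key=lambda h: values[h])
    matrix = [[0] * size for _ in range(size)]
    for rank_index, unit in enumerate(order):
        matrix[unit][rank_index] = 1
    return matrix
```

With values [2.5, 0.1, 1.7], the first unit is the largest and so has rank 3, giving the row [0, 0, 1] as $p_1$.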

JPS and RSS rely on the rank matrix $P$ but do not rely on an assumption of perfect ranking. Whether the ranks come from subjective judgement or from measured covariates, they yield a permutation matrix $P$, provided there are no ties in the ranking. In the event that there are ties, perhaps due to a pair of rankers (or measured covariates) providing different ranking matrices, $P_1$ and $P_2$, MacEachern et al. (2004) suggested use of the average $\bar{P}=0.5 P_1+0.5 P_2$. This is appropriate when there is no reason to prefer one ranking over the other. Replacement of the permutation matrix $P$ with the average necessitates replacement of the estimator (3) with one that allows non-indicator vectors $p_h$. Relying on the extensive body of work on ratio estimation in survey sampling, MacEachern et al. (2004) suggested the estimator in (4). This estimator effectively prorates the response across the strata to which it may belong.

The replacement of an $H \times H$ permutation matrix $P$ with a convex combination over permutation matrices has been used productively in RSS by a number of authors, primarily when concerned with creating models for imperfect rankings (e.g. Bohn and Wolfe, 1994; Frey, 2007, while Dell and Clutter, 1972 and Fligner and MacEachern, 2006 developed models for imperfect ranking of differing form). The permutation matrices represent the extreme points of the set of doubly stochastic matrices, that is, matrices with non-negative entries whose row sums and column sums all equal one. As a consequence, all other doubly stochastic matrices may be represented as a weighted average (convex combination) of permutation matrices.
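A short numerical check of these statements, written as a sketch with hypothetical function names: averaging two permutation matrices with weights 0.5 and 0.5, as in $\bar{P}=0.5 P_1+0.5 P_2$, gives a matrix that is no longer a permutation matrix but is still doubly stochastic.

```python
def average_rank_matrices(p1, p2, weight=0.5):
    """Return weight * p1 + (1 - weight) * p2, entry by entry."""
    return [
        [weight * a + (1 - weight) * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(p1, p2)
    ]


def is_doubly_stochastic(matrix, tol=1e-12):
    """Check non-negative entries with every row and column summing to one."""
    nonneg = all(x >= 0 for row in matrix for x in row)
    rows_ok = all(abs(sum(row) - 1) <= tol for row in matrix)
    cols_ok = all(abs(sum(col) - 1) <= tol for col in zip(*matrix))
    return nonneg and rows_ok and cols_ok
```

If two rankers disagree only about which of the first two units is smaller, the averaged rows for those units become (0.5, 0.5, 0), so each of them is split evenly between ranks 1 and 2.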

The use of measured covariates for JPS allows one to build a model for the response $Y$ as a function of the measured covariates, $\mathbf{X}$. The model may be constructed from the data at hand, or it may have been developed in previous studies. With more than one covariate, a regression model for $Y$ on $\mathbf{X}$ effectively transforms the vector of covariates into a single covariate while capturing much of the information connecting covariate to response. If the units in a set are ranked on the fitted value from the model when the covariate distribution is continuous, there will be no ties among the covariate values, ranking will be unambiguous, and the ranking matrix $P$ will be a permutation matrix. Chen et al. (2005) took this approach to form a logistic regression model for a binary response.

## Simulation Study

This section presents the results of simulation studies comparing the performance of the various estimators of the mean based on a JPS sample. The findings for existing estimators are in line with the results in Wang et al. (2006). They also highlight the value that the predictive rank probabilities bring to estimation, particularly for the new estimator in (15).

The first study investigates the performance of eight estimators when the model that generates the data is fully known and is exactly right. This allows us to look at the potential performance of the estimators, exclusive of uncertainty about the model. Large sample sizes let us compare the asymptotic performance of the estimators.

The eight estimators are JPS1 from (3), a plug-in estimator based on the rank of $E\left[Y \mid X_1, X_2\right]$ (LS), OLS and WLS from Wang et al. (2006), TRs from (4), JPS2 and JPS3 from (14) and (15) and REG from (10). JPS2 and JPS3 make use of the predictive rank distribution. The estimator TRs has the same form as JPS2 but, as in MacEachern et al. (2004), uses the two ranks from the concomitants instead of the model-based predictive rank distribution. The REG estimator makes direct use of the covariates.

The model is the following. There are $n$ sets, each consisting of $H$ units. There are two covariates and a single response of interest. The covariates are measured on all $n H$ units, while the response is measured for a single unit in each set. The vector $\left(X_1, X_2, Y\right)$ follows a multivariate normal distribution with standard normal marginal distributions and covariances (correlations) specified in Tables 1, 2, and 3. The varied correlations range from a strong relationship between the concomitants and $Y$ to a relatively weak relationship between them. Sample sizes $n=20, 50$, and 100 are investigated for set size $H=2$. For larger set sizes, results are presented only for $n=50$ and 100. For these set sizes, some of the estimators did not exist for some replicates. For the simulation, 10,000 replicates were used.
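The data-generating step of such a simulation can be sketched with the standard library alone. This is not the authors' code: the function names are hypothetical, the correlation matrix is supplied by the caller because the values in Tables 1, 2, and 3 are not reproduced here, and the fully measured unit is taken to be the first unit drawn in each set.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular L with L * L^T = matrix (positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def draw_set(lower, set_size, rng):
    """Draw `set_size` correlated normal vectors (X1, X2, Y)."""
    dim = len(lower)
    units = []
    for _ in range(set_size):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        units.append(
            tuple(sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(dim))
        )
    return units


def jps_sample(corr, n_sets, set_size, seed=0):
    """Return n_sets pairs (Y, (r1, r2)) for the measured unit of each set."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    sample = []
    for _ in range(n_sets):
        units = draw_set(lower, set_size, rng)
        x1, x2, y = units[0]  # the one fully measured unit in this set
        r1 = 1 + sum(u[0] < x1 for u in units)  # rank on the first covariate
        r2 = 1 + sum(u[1] < x2 for u in units)  # rank on the second covariate
        sample.append((y, (r1, r2)))
    return sample
```

Repeating jps_sample over many seeds and applying each estimator to the result would give the replicate estimates from which mean squared errors are computed.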

The tables present the relative accuracy of the various estimators to the sample mean based on a SRS. The entries are the ratio of MSEs for the SRS relative to the estimator in question. A number greater than 1 indicates smaller MSE for the estimator than for SRS.
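The table entries described here are simple ratios, which the following sketch makes explicit. The function names are hypothetical, and the inputs are lists of replicate estimates together with the true mean, which is known in a simulation.

```python
def mse(estimates, truth):
    """Mean squared error of replicate estimates around the true value."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)


def relative_accuracy(srs_estimates, other_estimates, truth):
    """MSE of the SRS mean divided by the MSE of the competing estimator."""
    return mse(srs_estimates, truth) / mse(other_estimates, truth)
```

A value of 4, for instance, would mean the competing estimator's MSE is one quarter of that of the simple random sample mean.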


Consistency of JPS Estimators

The literature on RSS and JPS demonstrates the consistency of the estimators $\bar{Y}_{rss}$ and $\hat{\mu}_{jps1}$ in (2) and (3), respectively, under minimal conditions. These traditional estimators borrow heavily from the design-based perspective of survey sampling, where (approximate) unbiasedness is prized. Small variance is the secondary consideration. Modern work with surveys adjusts the balance, relying more heavily on models, especially where missing data is a concern (Lohr, 2010). With this perspective, a bit more bias is allowed, provided it is accompanied by a substantial reduction in variance. Simulations are used to evaluate the estimators' performance when the model does not hold. Wang et al. (2006) pursued this path.

We work in the infinite population setting where we collect IID sets, observing a single member of each set. As such, we envision that the data come from some distribution which we refer to as the “true model”. In addition, there is a model used to construct the estimator. We assume that $\mu$ exists under both models. Consistency concerns arise when the true model and that used for analysis differ.

To set the framework for our consideration of robustness, we split the models into two parts. The first is the conditional distribution of $Y \mid \mathbf{R}$. The second is the distribution of $\mathbf{R}$ for the unit that is to be fully measured. The true and analysis models may differ in one or both of these aspects. A given estimator may be robust to differences in one portion of the model but not to differences in the other portion of the model. We consider each of the estimators in turn, presenting a heuristic argument for or against consistency. Our statements are to be taken loosely; simulations appearing in a later section support our claims. Formal statements and proofs of these results await another venue.

We briefly note that the estimators $\hat{\mu}_{jps1}$ and $\hat{\mu}_{jps2}$ are consistent for $\mu$. These estimators do not rely on a model, and so we need not consider the gap between the true and analysis models. Consistency was established in MacEachern et al. (2004).
The estimators based on parametric models, $\hat{\mu}_{OLS}$ and $\hat{\mu}_{WLS}$, may or may not be consistent. We begin with $\hat{\mu}_{OLS}$. For a given stratum $\mathbf{r}$, an offset observation, $Y-\delta_{[\mathbf{r}]}=Y-\mu_{[\mathbf{r}]}+\mu$, has mean $\mu$, provided the true and analysis models agree for the distribution of $Y \mid(\mathbf{R}=\mathbf{r})$ so that $\mu_{[\mathbf{r}]}$ has the same value under the two models and the offset has been correctly specified (or will be estimated consistently). Averaging across the strata, we see that the estimator targets the quantity $\mu-\sum_{\mathbf{r}} \pi_{\mathbf{r}} \delta_{[\mathbf{r}]}$. The estimator will be consistent for $\mu$ if (7) holds so that the average offset is zero. It is clear that this will be the case when the distribution on $\mathbf{R}$ and the conditional mean of $Y \mid(\mathbf{R}=\mathbf{r})$ are correctly specified for each of the $H^2$ strata. The first ensures accuracy of the $\pi_{\mathbf{r}}$, while the second ensures accuracy of the $\delta_{[\mathbf{r}]}$. Together, these imply (7). While these conditions stop short of full agreement between the true and analysis models, they are nearly there.
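The offset argument can be checked numerically. The sketch below assumes the estimator has the form $\hat{\mu}_{OLS}=n^{-1} \sum_{i}\left(Y_i-\delta_{[\mathbf{r}_i]}\right)$ with $\delta_{[\mathbf{r}]}=\mu_{[\mathbf{r}]}-\mu$; the function names and the dictionary representation of strata are hypothetical.

```python
def stratum_offsets(stratum_means, stratum_probs):
    """Return delta[r] = mu[r] - mu, where mu = sum over r of pi[r] * mu[r]."""
    mu = sum(stratum_probs[r] * stratum_means[r] for r in stratum_means)
    return {r: mean - mu for r, mean in stratum_means.items()}


def mu_ols(sample, delta):
    """Average of offset-corrected responses: mean of Y_i - delta[r_i]."""
    return sum(y - delta[r] for y, r in sample) / len(sample)


def average_offset(delta, stratum_probs):
    """Sum over r of pi[r] * delta[r]; zero when probabilities and means agree."""
    return sum(stratum_probs[r] * delta[r] for r in delta)
```

When the offsets are computed from the same stratum probabilities and means that generate the data, average_offset returns zero (up to rounding), which is the condition under which the estimator targets $\mu$.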

## Covariates or Ranks?

The use of the vector of measured covariates, $\mathbf{X}_i$, to induce the ranks opens up many possibilities. One might ask whether ranking on $X_1$ and $X_2$ is optimal, or whether there is a mapping to another set of variates that leads to a better estimator. One possibility stands out, especially when relying on a multivariate normal model for $(Y, \mathbf{X})$. The vector $\mathbf{X}$ can be mapped to the regression of $Y$ on $\mathbf{X}$ and its orthogonal complement. Under the multivariate normal model, this corresponds to an affine transformation of the covariates, $\mathbf{X}$, to a new set of covariates, say $\mathbf{W}=A \mathbf{X}$. The first coordinate of $\mathbf{W}$ is $E[Y \mid \mathbf{X}]$. The second coordinate is independent of both the first coordinate and the response and can be dropped.

In practice, we do not expect to know the relationship between covariates and response. With this in mind, we might estimate the relationship by fitting a model for $Y \mid \mathbf{X}$ to our $n$ fully observed cases. Having done so, the fitted values become the first coordinate of $\mathbf{W}$. Often, the fitted values are estimates of $E[Y \mid \mathbf{W}]=E[Y \mid \mathbf{X}]$. From here, a natural estimate of $\mu$ can be obtained by averaging the fitted values (estimated means) for all $n H$ observations. Following this path, the ranks have disappeared, and we are no longer in the setting of RSS or JPS.

The “covariate” approach leads to a natural estimator in the regression setting. The model for $Y \mid \mathbf{X}$ is a constant variance linear regression model. The chain of algebra below yields the estimator when the covariance matrix for $\mathbf{X}$ and $Y$ is known.

Define $\bar{Y}_{srs}$ and $\overline{\mathbf{X}}_{srs}$ to be the mean of the response and the covariates for the $n$ fully measured units, respectively. Take $\overline{\mathbf{X}}$ (a vector) to be the mean of the covariates for all $n H$ units. For the covariance matrix, with $Y$ in position 1 followed by the vector $\mathbf{X}$ in the trailing positions, the matrix can be written in partitioned form. This leads to $\Sigma_{12}$ and $\Sigma_{22}$ for the covariance of $Y$ and the vector $\mathbf{X}$ and the variance matrix for the vector $\mathbf{X}$, respectively. Then
$$
\begin{aligned}
\hat{\mu}_{reg} & =\frac{1}{n H} \sum_{i=1}^n \sum_{h=1}^H \hat{E}\left[Y_{i h} \mid \mathbf{X}_{i h}\right] \\
& =\frac{1}{n H} \sum_{i=1}^n \sum_{h=1}^H\left[\hat{\mu}_Y+\Sigma_{12} \Sigma_{22}^{-1}\left(\mathbf{X}_{i h}-\hat{\mu}_X\right)\right] \\
& =\frac{1}{n H} \sum_{i=1}^n \sum_{h=1}^H\left[\bar{Y}_{srs}+\Sigma_{12} \Sigma_{22}^{-1}\left(\mathbf{X}_{i h}-\overline{\mathbf{X}}_{srs}\right)\right] \\
& =\bar{Y}_{srs}+\Sigma_{12} \Sigma_{22}^{-1}\left(\overline{\mathbf{X}}-\overline{\mathbf{X}}_{srs}\right) .
\end{aligned}
$$
This estimator is constructed by replacing the unknown parameters with estimates from the $n$ fully measured units. In the event that the covariance matrix is not known, it would be replaced by the estimated covariance from the fully measured units.
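For a single covariate the final expression reduces to scalar arithmetic, which the sketch below implements. The restriction to one covariate, the function name, and the argument names are simplifying assumptions made for illustration; with several covariates, $\Sigma_{12} \Sigma_{22}^{-1}$ would require a matrix solve.

```python
def mu_reg(y_measured, x_measured, x_all, cov_xy, var_x):
    """Regression estimator with one covariate and known covariances.

    y_measured, x_measured: response and covariate of the n measured units.
    x_all: covariate values of all n * H units.
    cov_xy, var_x: the scalar versions of Sigma_12 and Sigma_22.
    """
    y_bar_srs = sum(y_measured) / len(y_measured)
    x_bar_srs = sum(x_measured) / len(x_measured)
    x_bar = sum(x_all) / len(x_all)
    slope = cov_xy / var_x
    # Shift the measured mean by how far the measured units' covariate
    # mean sits from the covariate mean of the whole sample.
    return y_bar_srs + slope * (x_bar - x_bar_srs)
```

If the measured units happen to have a below-average covariate mean and the slope is positive, the estimator adjusts the response mean upward, and it leaves the mean unchanged when the two covariate means coincide.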

Why would one choose to pass from the covariate $\mathbf{X}$ to the coarser summary of its rank? The advantage of working with the rank-based estimators is their ability to handle deficiencies in the assumed model for $(\mathbf{X}, Y)$. A well-chosen estimator either will be consistent or will be Fisher consistent for a value very near the truth. (Parenthetically, estimators based directly on $(\mathbf{X}, Y)$ may also be consistent.) The rank-based estimators also seem to be better able to handle poorer quality covariates, including those whose distribution is not fully stable from one set to another. They also lead to methods with enhanced robustness for data sets with missing covariate values and imperfect models for the missing covariates given the observed covariates.

是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。 ## 统计代写|抽样调查作业代写sampling theory of survey代考|STAT506 如果你也在 怎样代写抽样调查sampling theory of survey这个学科遇到相关的难题，请随时右上角联系我们的24/7代写客服。 抽样调查是一种非全面调查，根据随机的原则从总体中抽取部分实际数据进行调查，并运用概率估计方法，根据样本数据推算总体相应的数量指标的一种统计分析方法。 statistics-lab™ 为您的留学生涯保驾护航 在代写抽样调查sampling theory of survey方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写抽样调查sampling theory of survey方面经验极为丰富，各种代写抽样调查sampling theory of survey相关的作业也就用不着说。 我们提供的抽样调查sampling theory of survey及其相关学科的代写，服务范围广, 其中包括但不限于: • Statistical Inference 统计推断 • Statistical Computing 统计计算 • Advanced Probability Theory 高等楖率论 • Advanced Mathematical Statistics 高等数理统计学 • (Generalized) Linear Models 广义线性模型 • Statistical Machine Learning 统计机器学习 • Longitudinal Data Analysis 纵向数据分析 • Foundations of Data Science 数据科学基础 ## 统计代写|抽样调查作业代写sampling theory of survey代考|Ranked Set Sampling and Judgement Post-stratification Stokes’ pioneering work (Stokes, 1977) brought measured covariates to ranked set sampling (RSS). Briefly restating her work and establishing notation, consider a set of n H units that are partitioned at random into n sets, each of size H. The units are presumed to form a random sample from some distribution. Within a given set, we begin with \left(X_h, Y_h\right), h=1, \ldots, H. These units are ranked on the X_h, so that X_{(r: H)} is the r th order statistic in the set. The measured response, Y_{[r: H]}, associated with this unit is its concomitant. 
To draw a RSS of size n from such a population, sample sizes n_h, h=1, \ldots, H, are specified, with \sum_{h=1}^H n_h=n. One unit is drawn from each of the n sets; in n_h sets, the unit ranked h is selected. The resulting sample is a RSS. The earliest description of RSS appears in McIntyre (1952) (republished as McIntyre, 2005). In McIntyre’s description of the technique, ranking is based on the subjective judgement of an experimenter who examines each set of H units, specifying the ranks of the units in the set. Once the units in each set have been ranked, the sample is drawn as described above and the response of interest, Y, is measured on the n sampled units. Extending our notation to capture both set and rank within set, the mean of the n H units is$$
\bar{Y}=(n H)^{-1} \sum_{i=1}^n \sum_{h=1}^H Y_{i h},
$$where Y_{i h} is the response of the unit with rank h in set i. Suppressing the notation for the rank, define Y_i to the be i th of the n sampled units. Provided n_h>1 for all h,$$
\bar{Y}{r s s}=H^{-1} \sum{h=1}^H \bar{Y}h, $$where \bar{Y}_h is the sample mean of the n_h sampled units with rank h. The RSS estimator is unbiased: E\left[\bar{Y}{r s s} \mid \bar{Y}\right]=\bar{Y} for any collection of n H units. Furthermore, when the units are a random sample from a distribution with mean \mu=E[Y], E\left[\bar{Y}_{r s s}\right]=E[\bar{Y}]=\mu. The goal of RSS is to estimate \mu. Stokes and Sager (1988) cast estimation of a cumulative distribution function as estimation of a proportion (mean) for all cut points on the real line. RSS with estimation following (2) is robust to variation in the specifics of how the ranks are created. When created subjectively, better ranking leads to greater separation of the means of the rank classes (or strata), in turn leading to greater reduction in variance relative to estimators based on a random sample from the population. When ranks arise from a measured covariate, the same holds. Sound ## 统计代写|抽样调查作业代写sampling theory of survey代考|Multivariate Order Statistics and JPS In Wang et al. (2006), Stokes and coauthors posed the intriguing question of how to use multiple covariates to convey information about the ranks of units for use in JPS. Their solution is to rank on each of the distinct covariates. In the case of a continuous bivariate covariate, \left(X_1, X_2\right), each of the units in the set would be assigned a pair of ranks – one for X_1 and the other for X_2. This pair of ranks defines the post-stratum (or rank class) of the unit. For a set of size H, there are H^2 poststrata. We denote these post-strata with \mathbf{r}=\left(r_1, r_2\right), where r_1, r_2 \in{1, \ldots, H}. We focus on a bivariate covariate but note that the technique extends to covariates of greater dimension. Figure 1 illustrates the situation for a bivariate order statistic for set size H=5. The increase in the number of post-strata from H to H^2 necessitates reconsideration of the basic post-stratification estimator (3). 
Marginally, each covariate for the measured unit will have rank $r_i=h$ with probability $1/H$ for $i=1,2$ and $h=1, \ldots, H$. The joint distribution of $\mathbf{R}$ leads to the stratum probability $\pi_{\mathbf{r}}=P(\mathbf{R}=\mathbf{r})$. In general, these probabilities can be found via numerical integration if the model for $\left(X_1, X_2\right)$ is fully specified. Some of the $\pi_{\mathbf{r}}$ may be much smaller than $H^{-2}$, leading to a large probability that the estimator is undefined. Wang et al. (2006) handled this issue by appealing to a parametric model as an aid to estimation. The authors defined $\mu_{[\mathbf{r}]}=E[Y \mid \mathbf{R}=\mathbf{r}]$. The value of $\mu_{[\mathbf{r}]}$ can be found by numerical integration over the conditional distribution of $Y \mid \mathbf{R}$. Once the stratum means are in place, they are connected to the mean of $Y$ via the expression $\mu=\sum_{\mathbf{r}} \pi_{\mathbf{r}} \mu_{[\mathbf{r}]}$. It is helpful to introduce the difference between the stratum mean and the overall mean, $\delta_{[\mathbf{r}]}=\mu_{[\mathbf{r}]}-\mu$. The authors suggested estimation by ordinary least squares applied to a model for $\mu$, with observations in stratum $\mathbf{r}$ offset by $\delta_{[\mathbf{r}]}$. The data are $\left(Y_i, \mathbf{r}_i\right), i=1, \ldots, n$, and the estimator is
$$
\hat{\mu}_{OLS}=n^{-1} \sum_{i=1}^n\left(Y_i-\delta_{\left[\mathbf{r}_i\right]}\right) .
$$
The estimator $\hat{\mu}_{OLS}$ can be viewed in two stages. In the first, each observation is bias-corrected by subtracting its $\delta_{[\mathbf{r}]}$; in the second, the sample mean of the bias-corrected observations is computed. Partitioning the sample into strata reduces the within-stratum variances. Removing bias and then using the sample mean ensures that each observation receives equal weight in the estimator. Together, these two stages lead to substantial variance reduction, especially for relatively large set sizes.
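The two-stage view of the bias-corrected estimator can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of Wang et al. (2006): it swaps numerical integration for plain Monte Carlo simulation, and the model (standard normal $X_1, X_2$ with correlation `rho`, and $Y = X_1 + X_2 + \text{noise}$, so that $\mu = 0$) is an assumption chosen only to make the example self-contained. The function names `simulate_strata` and `mu_ols` are hypothetical.

```python
import random
from collections import defaultdict

def simulate_strata(H, rho, reps=20000, seed=0):
    """Monte Carlo stand-in for the numerical integration described in the text.

    Assumed (hypothetical) model: X1, X2 standard normal with correlation rho,
    Y = X1 + X2 + noise, hence mu = E[Y] = 0.
    Returns (pi, delta): stratum probabilities and offsets mu_[r] - mu.
    """
    rng = random.Random(seed)
    counts = defaultdict(int)
    y_sums = defaultdict(float)
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(H)]
        x2 = [rho * a + (1 - rho**2) ** 0.5 * rng.gauss(0, 1) for a in x1]
        # Unit 0 is the measured unit; its rank is 1 + (number of smaller values).
        r = (1 + sum(v < x1[0] for v in x1), 1 + sum(v < x2[0] for v in x2))
        counts[r] += 1
        y_sums[r] += x1[0] + x2[0] + rng.gauss(0, 1)
    pi = {r: c / reps for r, c in counts.items()}
    delta = {r: y_sums[r] / counts[r] for r in counts}  # mu = 0 under this model
    return pi, delta

def mu_ols(data, delta):
    """Stage 1: subtract delta[r_i] from each Y_i. Stage 2: take the sample mean."""
    return sum(y - delta[r] for y, r in data) / len(data)

pi, delta = simulate_strata(H=3, rho=0.5)
print(sorted(pi.items()))
```

Every observation enters `mu_ols` with weight $1/n$, regardless of how many observations fall in its post-stratum, which is the equal-weighting property noted above.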
## Bayesian approach

Another inferential approach in survey sampling is the Bayesian one, which we now discuss. For $\mathrm{Y}=\left(y_1, \ldots, y_i, \ldots, y_N\right)$, let $\Omega=\left\{\mathrm{Y} \mid-\infty<a_i \leq y_i \leq b_i<+\infty\right.$, with $a_i, b_i$ known or unknown$\}$, called the universal parametric space. For a sample $s=\left(i_1, \ldots, i_n\right)$ and survey data $d=(s, y)=\left(\left(i_1, y_{i_1}\right), \ldots,\left(i_n, y_{i_n}\right)\right)$, with $y=\left(y_{i_1}, \ldots, y_{i_n}\right)$, let us write
$$
P_{\mathrm{Y}}(d)=\operatorname{Prob}(d)=p(s) I_{\mathrm{Y}}(d),
$$
where
$$
I_{\mathrm{Y}}(d)=\left\{\begin{array}{ll}
1 & \text { if } \mathrm{Y} \in \Omega_d \\
0 & \text { if } \mathrm{Y} \notin \Omega_d
\end{array}\right.
$$
writing $\Omega_d=\left\{\mathrm{Y} \mid-\infty<a_j \leq y_j \leq b_j<\infty\right.$ for $j \neq i_1, \ldots, i_n$ but $y$ is as observed$\}$. We then call $P_{\mathrm{Y}}(d)$ the probability of observing the survey data $d$ when $\mathrm{Y}$ is the underlying vector in the parametric space.

A survey design $p$ is called 'informative' if $p(s)$ involves any element of $\mathrm{Y}$, and 'non-informative' if $p(s)$ involves no element of $\mathrm{Y}$. An informative design may be contemplated if, for example, sampling proceeds by choosing an element $i_1$, observing the $y_{i_1}$-value, allowing the value of $p\left(i_2 \mid\left(i_1, y_{i_1}\right)\right)$ to involve $y_{i_1}$, and likewise choosing successive elements of $s$ using the $y$-values of the units already drawn. Generally, however, a design $p$ is non-informative. When $p$ is non-informative, $\operatorname{Prob}(d)=p(s)$, which is a constant free of $\mathrm{Y}$ so long as the underlying $\mathrm{Y}$ belongs to $\Omega_d$, i.e. is consistent with the observed survey data at hand. We also take $P_{\mathrm{Y}}(d)$ as the 'likelihood' of $\mathrm{Y}$ given the data $d$ and write it as
$$
L_d(\mathrm{Y})=P_{\mathrm{Y}}(d)=p(s) I_{\mathrm{Y}}(d) .
$$
Thus, for a non-informative design, the likelihood is a constant, free of $\mathrm{Y}$ so long as $\mathrm{Y}$ lies in $\Omega_d$, i.e. $\mathrm{Y}$ is consistent with the data observed. This flat likelihood is 'sterile' for inference about the variate-values that remain unobserved once the sample at hand has been surveyed. This discussion is based on the classical works of Godambe (1966, 1969) and Basu (1969).

## Minimal sufficiency

Just as $\zeta=\{s\}$, the totality of all possible samples $s$, is the 'sample space', we shall call $D=\{d\}$, the totality of all possible survey data points $d$, the 'data space'. If a statistic $t=t(d)$ is defined, it induces a 'partitioning' on $D$. A partitioning creates a number of 'partition sets' of data points $d$ which are mutually disjoint and which together coincide with $D$. Two different statistics $t_1=t_1(d)$ and $t_2=t_2(d)$ generally induce two different partitionings on $D$. If every partition set induced by $t_2$ is contained in a partition set induced by $t_1$, then $t_1$ is said to induce a 'thicker' partitioning than $t_2$, which in turn induces a 'thinner' partitioning. If both $t_1$ and $t_2$ are sufficient, then neither sacrifices any relevant information, and $t_1$ achieves more summarization than $t_2$. So one should prefer a statistic which is sufficient and induces the thickest partitioning; such a statistic is called the 'Minimal Sufficient' statistic and is the most desirable among all sufficient statistics.

Let $d_1$ and $d_2$ be two data points of the form $d$, and let $d_1^*, d_2^*$ be the two data points corresponding to them as $d^*$ corresponds to $d$. Let $t=t(\cdot)$ be a sufficient statistic such that $t\left(d_1\right)=t\left(d_2\right)$. If we can show that this implies $d_1^*=d_2^*$, then it will follow that $d^*$ induces a thicker partitioning than $t$, implying that $d^*$ is the 'Minimal Sufficient' statistic.
We prove this below. Since $t$ is a sufficient statistic,
$$
\begin{aligned}
P_{\mathrm{Y}}\left(d_1\right) & =P_{\mathrm{Y}}\left(d_1 \cap t\left(d_1\right)\right) \\
& =P_{\mathrm{Y}}\left(t\left(d_1\right)\right) C_1, \text { with } C_1 \text { a constant, }
\end{aligned}
$$
and likewise $P_{\mathrm{Y}}\left(d_2\right)=P_{\mathrm{Y}}\left(t\left(d_2\right)\right) C_2$, with $C_2$ a constant. Since $t\left(d_1\right)=t\left(d_2\right)$, it follows that
$$
\begin{aligned}
P_{\mathrm{Y}}\left(d_1\right) & =P_{\mathrm{Y}}\left(t\left(d_1\right)\right) C_1=P_{\mathrm{Y}}\left(t\left(d_2\right)\right) C_1 \\
& =P_{\mathrm{Y}}\left(d_2\right) \frac{C_1}{C_2} .
\end{aligned}
$$
Since $d^*$ is a sufficient statistic, $P_{\mathrm{Y}}\left(d_1^*\right)=P_{\mathrm{Y}}\left(d_1\right) C_3$, where $C_3$ is a constant.
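The idea of one statistic inducing a 'thicker' partitioning than another can be checked by brute force on a toy data space. The sketch below is only an illustration under stated assumptions: the population values are hypothetical, sampling is taken to be three ordered draws with replacement, and $d^*$ is assumed to be the set of distinct (unit, value) pairs in $d$, a definition this section does not restate. It compares partitionings only; it does not verify sufficiency.

```python
from itertools import product

def partition(data_space, statistic):
    """Group the data points d by the value of statistic(d)."""
    blocks = {}
    for d in data_space:
        blocks.setdefault(statistic(d), set()).add(d)
    return list(blocks.values())

def is_at_least_as_thick(t1, t2, data_space):
    """True if every partition set induced by t2 lies inside one induced by t1."""
    blocks1 = partition(data_space, t1)
    return all(any(b2 <= b1 for b1 in blocks1)
               for b2 in partition(data_space, t2))

y = {1: 10, 2: 20, 3: 10}  # hypothetical population values y_i
# Data space D: ordered draws of 3 units with replacement, recorded as (i, y_i).
D = [tuple((i, y[i]) for i in s) for s in product(y, repeat=3)]

identity = lambda d: d                    # keeps order and multiplicity
order_free = lambda d: tuple(sorted(d))   # drops order, keeps multiplicity
d_star = lambda d: frozenset(d)           # drops order and multiplicity (assumed d*)

for name, stat in [("identity", identity), ("order_free", order_free), ("d_star", d_star)]:
    print(name, len(partition(D, stat)), "partition sets")
print(is_at_least_as_thick(d_star, order_free, D))
```

Fewer, larger partition sets mean more summarization; here `d_star` merges data points that `order_free` keeps apart, such as draws (1, 1, 2) and (1, 2, 2).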


## Predictive approach

Rather than this design-based approach, the following 'Predictive Approach' addresses this issue. Here the speculation is not about what would have happened on average had other possible samples been observed instead of the one actually encountered. Let us elaborate.

Suppose a sample $s$ has been taken and the $y$-values $y_i$ for $i \in s$ are observed.
Then, we may write
$$Y=\sum_1^N y_i=\sum_{i \in s} y_i+\sum_{i \notin s} y_i .$$
If a statistic $t=t(s, \mathrm{Y})$ is constructed then, observing
$$t=t(s, \mathrm{Y})=\sum_{i \in s} y_i+\left(t(s, \mathrm{Y})-\sum_{i \in s} y_i\right)$$
we would like to be satisfied with the chosen statistic $t$ if $\left(t(s, \mathrm{Y})-\sum_{i \in s} y_i\right)$ came close to $\sum_{i \notin s} y_i$. But a statistic $t$ involves no $y_i$ for $i \notin s$. So, unless the $y_i$'s for $i \in s$ and those for $i \notin s$ are inter-related, one cannot argue closeness of $\left(t(s, \mathrm{Y})-\sum_{i \in s} y_i\right)$ to $\sum_{i \notin s} y_i$. So, at the very outset it is plausible to regard the vector $\mathrm{Y}=\left(y_1, \ldots, y_i, \ldots, y_N\right)$ of unknown real numbers as one whose co-ordinates are suitably inter-related. A plausible way to achieve this is to regard $\mathrm{Y}$ as a random vector. Being a random vector, $\mathrm{Y}$ must have a probability distribution. Let us not insist that this probability distribution be of a very specific form. Let us be satisfied with postulating that it is a member of a wide class of probability distributions, with the simple restriction that the low-order moments, such as the mean, variance, covariances, and measures of skewness and kurtosis, exist and are all finite. We refer to such a class of probability distributions as a 'model'. Postulating such a class of probability distributions is called 'modeling'.

With this approach, the total $Y=\sum_1^N y_i$ ceases to be a constant and becomes a random variable. A random variable cannot be estimated; it may, however, be predicted. An estimator for the model-based expectation of the random variable $Y$ may be treated as a 'predictor' for $Y$. We shall describe how to find appropriate predictors for $Y$ and the resulting consequences. This plan is known as the 'Prediction Approach' in the context of survey sampling. In this approach we do not need a probability-based selection of a sample. A slightly different approach results if we regard $\mathrm{Y}$ as a random vector and $Y$ as a random variable, which we discuss briefly next.
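The split of the total into an observed part and a part to be predicted can be made concrete with a small Python sketch. The modeling choice here is an assumption made purely for illustration, not one prescribed by the text: every $y_i$ is taken to share a common model mean, so each unobserved $y_i$ is predicted by the sample mean. The function name `expansion_predictor` is hypothetical.

```python
def expansion_predictor(sample_values, population_size):
    """Predict the total Y as (observed sum) + (predicted sum of unobserved y_i).

    Illustrative assumption: all y_i share one model mean, so each
    unobserved y_i is predicted by the mean of the observed values.
    """
    n = len(sample_values)
    if n == 0 or n > population_size:
        raise ValueError("need 0 < n <= N")
    observed_sum = sum(sample_values)
    predicted_unobserved_sum = (population_size - n) * observed_sum / n
    return observed_sum + predicted_unobserved_sum

print(expansion_predictor([2.0, 4.0, 6.0], population_size=10))
```

Note that nothing in this function uses selection probabilities, in line with the remark that the prediction approach does not require a probability-based sample.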

## Super-population modeling approach

Postulating $\mathrm{Y}$ as a random vector, we recognize that $\mathrm{Y}$ has a probability distribution. This probability distribution of course defines a 'population', now called a super-population in contrast with the population $U=(1, \ldots, i, \ldots, N)$ we have already treated. The following approach is a third alternative. Choose from $U$ a random, i.e. probability-based, sample $s$; let a statistic $t=t(s, \mathrm{Y})$ be chosen as an estimator for $Y$, and let it be unbiased as well, i.e. $E_p(t)=\sum_s p(s)\, t(s, \mathrm{Y})=Y$ for every vector $\mathrm{Y}$. Since we cannot control the magnitude of the variance $V_p(t)$ uniformly for every possible $\mathrm{Y}$, let us instead try to choose a $t$ such that the model-based expectation of the design variance $V_p(t)$ is suitably controlled.

We shall find it convenient to define $E_m, V_m$ and $C_m$ as the operators for the expectation, variance and covariance, respectively, with respect to the probability distribution of $\mathrm{Y}$, a distribution which is only broadly modeled. For a given design $p$, we may proceed to minimize the value of $E_m V_p(t)$ so as to derive an appropriate $t_0$ such that $E_p\left(t_0\right)=Y=E_p(t)$ and $E_m V_p\left(t_0\right) \leq E_m V_p(t)$ for all $t \neq t_0$. Such a $t_0$ is the optimal 'super-population modeling based estimator', or rather a predictor, for $Y$: it is an estimator because we start by requiring $E_p(t)=Y=E_p\left(t_0\right)$ for all $\mathrm{Y}$, and it is a predictor because we refer to the probability distribution of $\mathrm{Y}$, treating $Y$ as a random variable. This is called the 'Super-population modeling' approach.
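To see what comparing $E_m V_p(t)$ across design-unbiased estimators looks like in practice, the sketch below uses simple random sampling without replacement (SRSWOR) and a hypothetical model $y_i=\beta x_i+\varepsilon_i$; neither is prescribed by the text. It compares the expansion estimator $N\bar{y}_s$ (the case `b0 = 0`) with a difference estimator $N\,\overline{(y-b_0 x)}_s+b_0 X$ for a fixed constant $b_0$, both of which are design-unbiased under SRSWOR. The design variance uses the standard closed form $N^2(1-n/N)S^2/n$, and the model expectation is approximated by Monte Carlo.

```python
import random
import statistics

def srswor_variance_of_expansion(values, n):
    """Exact design variance of N * (sample mean) under SRSWOR of size n."""
    N = len(values)
    s2 = statistics.variance(values)  # finite-population S^2, divisor N - 1
    return N**2 * (1 - n / N) * s2 / n

def model_expected_design_variance(x, n, beta, sigma, b0, reps=2000, seed=0):
    """Monte Carlo estimate of E_m V_p(t) for the difference estimator with slope b0.

    Hypothetical model: y_i = beta * x_i + Normal(0, sigma^2) noise.
    The design variance of the difference estimator equals that of the
    expansion estimator applied to the residuals y_i - b0 * x_i.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y = [beta * xi + rng.gauss(0.0, sigma) for xi in x]
        residuals = [yi - b0 * xi for yi, xi in zip(y, x)]
        total += srswor_variance_of_expansion(residuals, n)
    return total / reps

x = list(range(1, 21))  # hypothetical auxiliary values
for b0 in (0.0, 1.0, 2.0):
    print(b0, round(model_expected_design_variance(x, n=5, beta=2.0, sigma=1.0, b0=b0), 1))
```

Under this assumed model the model-expected design variance shrinks as `b0` approaches `beta`, which is the sense in which one design-unbiased estimator can be preferred to another.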

Following Brewer (1979), another inferential approach in survey sampling is called 'The Model-Assisted Approach'. As we shall show later, (1) the optimal predictor for $Y$ and (2) the optimal design-model-based predictor for $Y$ cannot generally be used, as they generally involve model parameters which are unknowable. The prediction theory does not need any probability-based sampling. Because the optimal predictor generally cannot be applied, Brewer replaces the model parameter in it by a simple multiplicative weight. He introduces an asymptotic argument and the concept of 'Asymptotic Design Unbiasedness' in place of exact design-unbiasedness. He recommends choosing the weight function as a function of design parameters so that his revised optimal predictor is asymptotically design-unbiased for $Y$. This requirement yields a unique choice of the weight, which gives Brewer's predictor: 'model-induced' as well as 'design-unbiased', but only in an asymptotic sense. An important by-product of this model-assisted approach is that (2), the impracticable design-model-based optimal predictor for $Y$, may be amended so that its asymptotic design expectation matches the population total. For a particular choice of the weight function involved, the amended predictor matches Brewer's predictor, an additional merit to note.

## Certain rudimentaries for sampling

To select a probability sample, a convenient method is to use a so-called table of random numbers. Drawing a random sample by computer is no longer a problem, but we give the details as background.

A random number table is a long sequence of the single digits $0,1,2, \ldots, 9$ arranged one after another on a page, with many such pages forming a book. The digits occur in such a way that (i) wherever one starts reading in the book, each digit occurs with relative frequency $\frac{1}{10}$ if sufficiently many digits are covered, and (ii) if groups of $K$ consecutive digits are read, then over a large number of such groups each possible group occurs with relative frequency $\frac{1}{10^K}$, for $K=2,3, \ldots, 8$ say. How close these relative frequencies are to $\frac{1}{10}, \frac{1}{10^2}, \ldots, \frac{1}{10^8}$ for $K=1,2, \ldots, 8$ can be checked by chi-square tests or other probabilistic tests. A series of digits tested in this way is called 'random numbers', and the pages of such a book are called 'random number tables'.

Sample surveys are in practice useful to people who are not sampling experts, and we aim to answer the questions they are likely to raise.

Let us illustrate. Suppose a finite population consists of 67 members, labelled $1,2, \ldots, 67$. If we can select each of them with probability $\frac{1}{67}$, we say that we have selected a member of the population 'at random'. Since 67 is a 2-digit number, we consider the 2-digit numbers $(i, j)$, $i=0,1, \ldots, 9$ and $j=0,1, \ldots, 9$, from the random number table. There are 100 such numbers: $(0,0),(0,1), \ldots,(0,9),(1,0),(1,1), \ldots,(1,9), \ldots,(9,0), \ldots,(9,9)$. It is convenient to label the 67 members of the population as $(01),(02), \ldots,(66),(67)$. From the random number table we plan to read only these 67 two-digit numbers, ignoring the 33 numbers $(00),(68),(69), \ldots,(99)$. The first two-digit number among $(01),(02), \ldots,(67)$ that is read gives the required random sample.

## Design-based approach

Given a design $p$, an estimator $t=t(s, \mathrm{Y})$ based on a sample $s$ chosen according to design $p$ has its expectation as $E_p(t)=\sum_{s \in \zeta} p(s)\, t(s, \mathrm{Y})$ and its Mean Square Error (MSE) as $M_p(t)=E_p(t-Y)^2=\sum_{s \in \zeta} p(s)(t(s, \mathrm{Y})-Y)^2$, which provides a 'measure' of the error of $t$ as an estimator of $Y$. Writing $B_p(t)=E_p(t)-Y$ for the bias, $V_p(t)=E_p\left(t-E_p(t)\right)^2$ for the variance and $\sigma_p(t)=\sqrt{V_p(t)}$ for the standard error, we have $M_p(t)=V_p(t)+B_p^2(t)$. Now
$$
\begin{aligned}
M_p(t) & =\sum_s p(s)(t(s, \mathrm{Y})-Y)^2 \\
& =\sum_1 p(s)(t(s, \mathrm{Y})-Y)^2+\sum_2 p(s)(t(s, \mathrm{Y})-Y)^2,
\end{aligned}
$$
writing $\sum_1$ as the sum over samples for which $|t(s, \mathrm{Y})-Y| \geq K$ for a certain $K>0$ and $\sum_2$ as the sum over samples for which $|t(s, \mathrm{Y})-Y|<K$. Then $M_p(t) \geq K^2 \sum_1 p(s)=K^2 \operatorname{Prob}[|t(s, \mathrm{Y})-Y| \geq K]$, so $\operatorname{Prob}[|t(s, \mathrm{Y})-Y| \geq K] \leq \frac{V_p(t)+B_p^2(t)}{K^2}$. Choosing $K=\lambda \sigma_p(t)$ with $\lambda>0$, one gets
$$
\operatorname{Prob}\left[|t-Y| \geq \lambda \sigma_p(t)\right] \leq \frac{1}{\lambda^2}+\frac{1}{\lambda^2}\left(\frac{\left|B_p(t)\right|}{\sigma_p(t)}\right)^2
$$
or, equivalently,
$$
\operatorname{Prob}\left[|t-Y| \leq \lambda \sigma_p(t)\right] \geq\left(1-\frac{1}{\lambda^2}\right)-\frac{1}{\lambda^2}\left(\frac{\left|B_p(t)\right|}{\sigma_p(t)}\right)^2 .
$$
Thus, in order that the error in estimating $Y$ by $t$ may be kept under control, (i) $\left|B_p(t)\right|$ should be small and (ii) $\sigma_p(t)$ should be small. So a good estimator for $Y$ should have (i) small numerical bias and (ii) small standard error. This is rather a truism if we decide to rest content with an estimator for which these two design-based performance characteristics are our main concerns. This is the essence of the 'Design-based' approach to estimation in survey sampling. Most crucially, we cannot say how close the calculated value of the statistic $t$ is to $Y$, the estimand, for the given data at hand. Thus, according to this approach, our concern is how well controlled the performance characteristics of the strategy we employ are, without questioning the magnitude of the actual realized error $|t(s, \mathrm{Y})-Y|$.
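The design-based quantities above can be computed exactly for a small population by listing every possible sample. The following sketch is an illustration under assumptions not taken from the text: the population values are hypothetical, the design is simple random sampling without replacement (so every sample has the same $p(s)$), and the two estimators (one based on the sample mean, one on the sample median) are chosen only to contrast an unbiased and a possibly biased case. It reports the bias and MSE, and compares the tail probability with the bound $M_p(t)/K^2$ for $K=2\sigma_p(t)$.

```python
from itertools import combinations
from statistics import mean, median

def design_summary(y, n, estimator):
    """Enumerate every SRSWOR sample of size n and summarise the estimator t.

    estimator(sample_values, N) must return an estimate of the total Y.
    """
    N = len(y)
    samples = list(combinations(range(N), n))
    p = 1.0 / len(samples)  # equal p(s) under SRSWOR
    total = sum(y)
    t_values = [estimator([y[i] for i in s], N) for s in samples]
    expectation = sum(p * t for t in t_values)
    bias = expectation - total
    variance = sum(p * (t - expectation) ** 2 for t in t_values)
    mse = sum(p * (t - total) ** 2 for t in t_values)
    return {"bias": bias, "variance": variance, "mse": mse,
            "errors": [abs(t - total) for t in t_values], "p": p}

def tail_probability(summary, K):
    """Prob[|t - Y| >= K] under the design."""
    return sum(summary["p"] for e in summary["errors"] if e >= K)

y = [3, 7, 8, 12, 20, 41]  # hypothetical population values
expansion = lambda values, N: N * mean(values)
median_based = lambda values, N: N * median(values)

for name, est in [("expansion", expansion), ("median-based", median_based)]:
    s = design_summary(y, 3, est)
    K = 2 * s["variance"] ** 0.5  # K = lambda * sigma_p(t) with lambda = 2
    print(name, round(s["bias"], 6), round(s["mse"], 3),
          round(tail_probability(s, K), 3), "<=", round(s["mse"] / K**2, 3))
```

The enumeration describes the strategy's behaviour over all samples; it says nothing about the error of the one sample actually drawn, which is exactly the limitation noted above.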


## Checking Whether Python 3 Is Installed

If the output shows you have Python 3.9 or a later version installed, you can skip the next section and go to “Running Python in a Terminal Session.” If you see any version earlier than Python 3.9, follow the instructions in the next section to install the latest version.

Note that on macOS, whenever you see the python command in this book, you need to use the python3 command instead to make sure you’re using Python 3. On most macOS systems, the python command either points to an outdated version of Python that should only be used by internal system tools, or it points to nothing and generates an error message.

After the installer runs, a Finder window should appear. Double-click the Install Certificates.command file. Running this file will allow you to more easily install additional libraries that you’ll need for real-world projects, including the projects in the second half of this book.

## Checking Your Version of Python

Open a terminal window by running the Terminal application on your system (in Ubuntu, you can press CTRL-ALT-T). To find out which version of Python is installed, enter python3 with a lowercase p. When Python is installed, this command starts the Python interpreter. You should see output indicating which version of Python is installed. You should also see a Python prompt (>>>) where you can start entering Python commands.

This output indicates that Python 3.10.4 is currently the default version of Python installed on this computer. When you’ve seen this output, press CTRL-D or enter exit() to leave the Python prompt and return to a terminal prompt. Whenever you see the python command in this book, enter python3 instead.

You’ll need Python 3.9 or later to run the code in this book. If the Python version installed on your system is earlier than Python 3.9, or if you want to update to the latest version currently available, refer to the instructions in Appendix A.
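The same version check can also be done from inside Python, which is handy at the top of a script. This is a small sketch rather than anything the book prescribes; the helper name `meets_minimum` is hypothetical, while `sys.version_info` and `sys.version` are part of the standard library.

```python
import sys

def meets_minimum(version=None, minimum=(3, 9)):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only the (major, minor) parts, e.g. (3, 10) >= (3, 9).
    return tuple(version[:2]) >= tuple(minimum)

print(sys.version.split()[0], "OK" if meets_minimum() else "older than 3.9")
```

Running the file with python3 prints the interpreter's version and whether it satisfies the 3.9 requirement.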

VS Code works with many different programming languages; to get the most out of it as a Python programmer, you’ll need to install the Python extension. This extension adds support for writing, editing, and running Python programs.

To install the Python extension, click the Manage icon, which looks like a gear in the lower-left corner of the VS Code app. In the menu that appears, click Extensions. Enter python in the search box and click the Python extension. (If you see more than one extension named Python, choose the one supplied by Microsoft.) Click Install and install any additional tools that your system needs to complete the installation. If you see a message that you need to install Python, and you’ve already done so, you can ignore this message.


