## EXAMPLES OF REPRESENTATIVE STRATEGIES

The ratio estimator
$$t_{1}=X \frac{\sum_{i \in s} Y_{i}}{\sum_{i \in s} X_{i}}$$
is of special importance because of its traditional use in practice. Here, $\left(p, t_{1}\right)$ is obviously representative with respect to a size measure $x$, more precisely to $\left(X_{1}, \ldots, X_{N}\right)$, whatever the sampling design $p$.

Note, however, that $t_{1}$ is usually combined with SRSWOR or SRSWR. The sampling scheme of LAHIRI-MIDZUNO-SEN (LAHIRI, 1951; MIDZUNO, 1952; SEN, 1953) (LMS) yields a design of interest to be employed in conjunction with $t_{1}$ by rendering it design unbiased.
The Hansen-Hurwitz (HH, 1943) estimator (HHE)
$$t_{2}=\frac{1}{n} \sum_{i=1}^{N} f_{s i} \frac{Y_{i}}{P_{i}}$$ with $f_{s i}$ as the frequency of $i$ in $s, i \in \mathcal{U}$, combined with any design $p$, gives rise to a strategy representative with respect to $\left(P_{1}, \ldots, P_{N}\right)^{\prime}$. For the sake of design unbiasedness, $t_{2}$ is usually based on probability proportional to size (PPS) with replacement (PPSWR) sampling, that is, a scheme that consists of $n$ independent draws, each draw selecting unit $i$ with probability $P_{i}$.
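
The PPSWR scheme and the estimator $t_{2}$ can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the toy values of `Y` and `P` are assumptions, not part of the text. The helper `exact_expectation` enumerates every ordered sample, so the design unbiasedness of $t_{2}$ under PPSWR can be checked exactly on a small population.

```python
import random
from fractions import Fraction
from itertools import product


def ppswr_sample(P, n, rng):
    """Make n independent draws; each draw selects unit i with probability P[i]."""
    return rng.choices(range(len(P)), weights=[float(p) for p in P], k=n)


def hansen_hurwitz(sample, Y, P):
    """t2 = (1/n) * sum of Y_i / P_i over the draws (a repeated unit counts f_si times)."""
    return sum(Y[i] / P[i] for i in sample) / len(sample)


def exact_expectation(Y, P, n):
    """Design expectation of t2 under PPSWR, enumerating all N**n ordered samples."""
    total = Fraction(0)
    for s in product(range(len(P)), repeat=n):
        prob = Fraction(1)
        for i in s:
            prob *= P[i]
        total += prob * hansen_hurwitz(s, Y, P)
    return total


# Made-up toy population (illustrative values only).
Y = [3, 5, 10]
P = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]

print(exact_expectation(Y, P, n=2))   # design expectation of t2
print(sum(Y))                         # population total, for comparison
print(hansen_hurwitz(ppswr_sample(P, 2, random.Random(1)), Y, P))  # one realized estimate
```

Replacing `Y` by `P` in `hansen_hurwitz` returns $\sum P_{i}=1$ for every sample, which is the representativeness property mentioned above.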

Another representative strategy is due to RAO, HARTLEY and COCHRAN (RHC, 1962). We first describe the sampling scheme: on choosing a sample size $n$, the population $\mathcal{U}$ is split at random into $n$ mutually exclusive groups of suitably chosen sizes $N_{i}\left(i=1, \ldots, n ; \sum_{1}^{n} N_{i}=N\right)$ coextensive with $\mathcal{U}$, the units bearing values $P_{i}$, the normed sizes $\left(0<P_{i}<1, \sum P_{i}=1\right)$. Then, independently from each of the $n$ groups so formed, one unit is selected with probability proportional to its size among the units falling in that group.
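
The selection scheme just described can be sketched as follows. The function name `rhc_sample`, the group sizes, and the normed sizes are hypothetical choices made for illustration; the sketch covers only the sampling scheme, not any estimator based on it.

```python
import random


def rhc_sample(P, group_sizes, rng):
    """Return (units, groups): one PPS-selected unit from each random group."""
    if sum(group_sizes) != len(P):
        raise ValueError("group sizes must add up to the population size N")
    labels = list(range(len(P)))
    rng.shuffle(labels)                      # random split of the population
    units, groups, start = [], [], 0
    for size in group_sizes:
        group = labels[start:start + size]
        start += size
        # Within a group, unit i is selected with probability
        # P[i] / (sum of P over the units of that group).
        chosen = rng.choices(group, weights=[P[i] for i in group], k=1)[0]
        units.append(chosen)
        groups.append(group)
    return units, groups


P = [0.05, 0.10, 0.15, 0.20, 0.10, 0.25, 0.15]   # made-up normed sizes, summing to 1
units, groups = rhc_sample(P, group_sizes=[3, 2, 2], rng=random.Random(7))
print(units)
print(groups)
```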

## Raj’s Estimator $t_{5}$

Another popular strategy is due to RAJ $(1956, 1968)$. The sampling scheme is called probability proportional to size without replacement (PPSWOR) with $P_{i}$'s $\left(0<P_{i}<1, \sum P_{i}=1\right)$ as the normed size measures. On the first draw a unit $i_{1}$ is chosen with probability $P_{i_{1}}$; on the second draw a unit $i_{2}\left(\neq i_{1}\right)$ is chosen with probability $P_{i_{2}} /\left(1-P_{i_{1}}\right)$ out of the units of $U$ minus $i_{1}$; and so on, until on the final $n$th $(n>2)$ draw a unit $i_{n}\left(\neq i_{1}, \ldots, i_{n-1}\right)$ is chosen with probability
$$\frac{P_{i_{n}}}{1-P_{i_{1}}-P_{i_{2}}-\ldots-P_{i_{n-1}}}$$
out of the units of $U$ minus $i_{1}, i_{2}, \ldots, i_{n-1}$. Then,
$$\begin{aligned} e_{1} &=\frac{Y_{i_{1}}}{P_{i_{1}}} \\ e_{2} &=Y_{i_{1}}+\frac{Y_{i_{2}}}{P_{i_{2}}}\left(1-P_{i_{1}}\right) \\ e_{j} &=Y_{i_{1}}+\ldots+Y_{i_{j-1}}+\frac{Y_{i_{j}}}{P_{i_{j}}}\left(1-P_{i_{1}}-\ldots-P_{i_{j-1}}\right), \end{aligned}$$
with $j=3, \ldots, n$, are all unbiased for $Y$ because the conditional expectation is
$$\begin{aligned} E_{c} & \left[e_{j} \mid\left(i_{1}, Y_{i_{1}}\right), \ldots,\left(i_{j-1}, Y_{i_{j-1}}\right)\right] \\ &=\left(Y_{i_{1}}+\ldots+Y_{i_{j-1}}\right)+\sum_{\substack{k=1 \\ k \neq i_{1}, \ldots, i_{j-1}}}^{N} Y_{k}=Y . \end{aligned}$$
So, unconditionally, $E_{p}\left(e_{j}\right)=Y$ for every $j=1, \ldots, n$, and
$$t_{5}=\frac{1}{n} \sum_{j=1}^{n} e_{j},$$
called Raj’s (1956) estimator, is unbiased for $Y$.
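
As a sanity check on the argument above, the following sketch computes $e_{1}, \ldots, e_{n}$ and $t_{5}$ for an ordered PPSWOR sample and verifies unbiasedness by exact enumeration. The function names and the toy values of `Y` and `P` are assumptions introduced for illustration only.

```python
from fractions import Fraction
from itertools import permutations


def raj_estimator(sample, Y, P):
    """t5 = mean of e_1, ..., e_n for an ordered sample of distinct units."""
    e_values = []
    y_seen = 0   # Y_{i_1} + ... + Y_{i_{j-1}}
    p_seen = 0   # P_{i_1} + ... + P_{i_{j-1}}
    for i in sample:
        e_values.append(y_seen + Y[i] / P[i] * (1 - p_seen))
        y_seen += Y[i]
        p_seen += P[i]
    return sum(e_values) / len(e_values)


def ppswor_probability(sample, P):
    """Probability of drawing the ordered sample under PPSWOR."""
    prob, p_seen = Fraction(1), 0
    for i in sample:
        prob *= P[i] / (1 - p_seen)
        p_seen += P[i]
    return prob


def exact_expectation(Y, P, n):
    """E_p(t5), summing over every ordered sample of n distinct units."""
    return sum(ppswor_probability(s, P) * raj_estimator(s, Y, P)
               for s in permutations(range(len(P)), n))


# Made-up toy population (illustrative values only).
Y = [3, 5, 10, 2]
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]

print(exact_expectation(Y, P, n=3), sum(Y))   # the two numbers should agree
```

Exact rational arithmetic (`Fraction`) is used so that the comparison with the population total is not blurred by rounding.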

## SAMPLING SCHEMES

A unified theory is developed by noting that it is enough to establish results concerning $(p, t)$ without heeding how one may actually succeed in choosing samples with preassigned probabilities. A method of choosing a sample draw by draw, assigning selection probabilities with each draw, is called a sampling scheme. Following HANURAV (1966), we show below that starting with an arbitrary design we may construct a sampling scheme.

Suppose for each possible sample $s$ from $U$ the selection probability $p(s)$ is fixed. Let
$$\begin{array}{lll} \beta_{i_{1}}=p\left(i_{1}\right), & \beta_{i_{1}, i_{2}}=p\left(i_{1}, i_{2}\right), \ldots, & \beta_{i_{1}, \ldots, i_{n}}=p\left(i_{1}, \ldots, i_{n}\right) \\ \alpha_{i_{1}}=\Sigma_{1} p(s), & \alpha_{i_{1}, i_{2}}=\Sigma_{2} p(s), \ldots, & \alpha_{i_{1}, \ldots, i_{n}}=\Sigma_{n} p(s) \end{array}$$
where $\Sigma_{1}$ is the sum over all samples $s$ with $i_{1}$ as the first entry; $\Sigma_{2}$ is the sum over all samples with $i_{1}, i_{2}$, respectively, as the first and second entries in $s, \ldots$, and $\Sigma_{n}$ is the sum over all samples of which the first, second, $\ldots, n$th entries are, respectively, $i_{1}, i_{2}, \ldots, i_{n}$.

Then, let us consider the scheme of selection such that on the first draw from $U$, $i_{1}$ is chosen with probability $\alpha_{i_{1}}$, and a second draw from $U$ is made with probability
$$\left(1-\frac{\beta_{i_{1}}}{\alpha_{i_{1}}}\right).$$
On the second draw from $U$ the unit $i_{2}$ is chosen with probability
$$\frac{\alpha_{i_{1}, i_{2}}}{\alpha_{i_{1}}-\beta_{i_{1}}}.$$
A third draw is made from $U$ with probability
$$\left(1-\frac{\beta_{i_{1}, i_{2}}}{\alpha_{i_{1}, i_{2}}}\right)$$
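
The construction can be checked numerically. The sketch below assumes the pattern shown for the first three draws continues in the same way at every later draw, and that sampling stops after a draw with probability $\beta / \alpha$ for the current sequence; under that assumption the product of the draw and continuation probabilities along a sample $s$ should give back $p(s)$. The helper names and the toy design are hypothetical.

```python
from fractions import Fraction


def alpha(design, prefix):
    """Sum of p(s) over all samples whose first len(prefix) entries equal prefix."""
    return sum(p for s, p in design.items() if s[:len(prefix)] == prefix)


def beta(design, prefix):
    """p(prefix) if prefix is itself a possible sample, else 0."""
    return design.get(prefix, 0)


def scheme_probability(design, s):
    """Probability that the draw-by-draw scheme produces exactly the sample s."""
    prob = Fraction(1)
    for k in range(1, len(s) + 1):
        prefix, previous = s[:k], s[:k - 1]
        if k == 1:
            prob *= alpha(design, prefix)                  # first draw
        else:
            prob *= Fraction(alpha(design, prefix)) / (
                alpha(design, previous) - beta(design, previous))
        stop = beta(design, prefix) / alpha(design, prefix)
        prob *= stop if k == len(s) else 1 - stop          # stop or continue
    return prob


# Made-up design on U = (1, 2, 3, 4) with samples of varying size.
design = {
    (1, 1, 2): Fraction(1, 4),
    (1, 2, 2): Fraction(3, 10),
    (3, 2): Fraction(1, 5),
    (4,): Fraction(1, 4),
}

for s, p in design.items():
    print(s, p, scheme_probability(design, s))   # last two columns should agree
```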

## CONTROLLED SAMPLING

Now, consider an arbitrary design $p$ of fixed size $n$ and a linear estimator $t$; suppose a subset $S_{0}$ of all samples is less desirable from practical considerations like geographical location, inaccessibility, or, more generally, costliness. Then, it is advantageous to replace design $p$ by a modified one, for example, $q$, which attaches minimal values $q(s)$ to the samples $s$ in $S_{0}$ keeping
$$\begin{gathered} E_{p}(t)=E_{q}(t) \\ E_{p}(t-Y)^{2} \geq E_{q}(t-Y)^{2} \end{gathered}$$
and even maintaining other desirable properties of $p$, if any. A resulting $q$ is called a controlled design and a corresponding scheme of selection is called a controlled sampling scheme. Quite a sizeable literature has grown around this problem of finding appropriate controlled designs. The methods of implementing such a scheme utilize theories of incomplete block designs and predominantly involve ingenious devices for reducing the size of the support of possible samples, demanding trial and error. But RAO and NIGAM (1990) have recently presented a simple solution by posing it as a linear programming problem and applying the well-known simplex algorithm to demonstrate their ability to work out suitable controlled schemes.
Taking $t$ as the HORVITZ-THOMPSON estimator $\bar{t}=\sum_{i \in s} Y_{i} / \pi_{i}$, they minimize the objective function $F=\sum_{s \in S_{0}} q(s)$ subject to the linear constraints
$$\begin{aligned} \sum_{s \ni i, j} q(s) &=\sum_{s \ni i, j} p(s)=\pi_{i j} \\ q(s) & \geq 0 \text { for all } s \end{aligned}$$
where the $\pi_{i j}$'s are known quantities in terms of the original uncontrolled design $p$.
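
The sketch below does not solve the linear program; it only checks the constraints and evaluates the objective $F$ for one hand-picked candidate $q$, which is enough to illustrate what a controlled design looks like. Everything in it is an assumption made for illustration: $p$ is taken to be SRSWOR of size 3 from 7 units, $q$ puts equal weight on the seven blocks of a balanced incomplete block design (the Fano plane), and the set `S0` of undesirable samples is invented.

```python
from fractions import Fraction
from itertools import combinations


def pair_inclusion(design):
    """pi_ij = total probability of the samples containing both i and j."""
    pi = {}
    for s, prob in design.items():
        for pair in combinations(sorted(s), 2):
            pi[pair] = pi.get(pair, 0) + prob
    return pi


# Uncontrolled design p: SRSWOR of n = 3 from N = 7 (35 equally likely samples).
p = {s: Fraction(1, 35) for s in combinations(range(1, 8), 3)}

# Candidate controlled design q: equal weight on 7 blocks in which
# every pair of units occurs exactly once.
blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
          (2, 5, 7), (3, 4, 7), (3, 5, 6)]
q = {s: Fraction(1, 7) for s in blocks}

# Hypothetical set S0 of undesirable samples.
S0 = [(1, 2, 4), (1, 2, 5), (3, 6, 7)]

F_p = sum(p.get(s, 0) for s in S0)   # objective under the original design
F_q = sum(q.get(s, 0) for s in S0)   # objective under the candidate design

print(pair_inclusion(p) == pair_inclusion(q))   # do the pi_ij constraints hold?
print(F_p, F_q)
```

For larger problems a general-purpose linear programming solver would take the place of the hand-picked $q$.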

## ELEMENTARY DEFINITIONS

Let $N$ be a known number of units, e.g., godowns, hospitals, or income earners, each assignable identifying labels $1,2, \ldots, N$ and bearing values, respectively, $Y_{1}, Y_{2}, \ldots, Y_{N}$ of a real-valued variable $y$, which are initially unknown to an investigator who intends to estimate the total
$$Y=\sum_{1}^{N} Y_{i}$$
or the mean $\bar{Y}=Y / N$.
We call the sequence $U=(1, \ldots, N)$ of labels a population. Selecting units leads to a sequence $s=\left(i_{1}, \ldots, i_{n}\right)$, which is called a sample. Here $i_{1}, \ldots, i_{n}$ are elements of $U$, not necessarily distinct from one another, but the order of their appearance is maintained. We refer to $n=n(s)$ as the size of $s$, while the effective sample size $v(s)=|s|$ is the cardinality of $s$, i.e., the number of distinct units in $s$. Once a specific sample $s$ is chosen we suppose it is possible to ascertain the values $Y_{i_{1}}, \ldots, Y_{i_{n}}$ of $y$ associated with the respective units of $s$. Then
$$d=\left[\left(i_{1}, Y_{i_{1}}\right), \ldots,\left(i_{n}, Y_{i_{n}}\right)\right] \quad \text{or briefly} \quad d=\left[\left(i, Y_{i}\right) \mid i \in s\right]$$
constitutes the survey data.
An estimator $t$ is a real-valued function $t(d)$, which is free of $Y_{i}$ for $i \notin s$ but may involve $Y_{i}$ for $i \in s$. Sometimes we will express $t(d)$ alternatively by $t(s, Y)$, where $Y=\left(Y_{1}, \ldots, Y_{N}\right)^{\prime}$.
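
These definitions map directly onto simple data structures. In the sketch below the values in `Y`, the sample `s`, and the example estimator `t` are all made up for illustration; the only point is that `t` is computed from the survey data `d` alone.

```python
# Made-up values; in a real survey they are unknown until observed.
Y = {1: 40, 2: 15, 3: 22, 4: 31, 5: 8}
N = len(Y)

s = (2, 5, 2, 1)              # ordered sample; unit 2 appears twice
n = len(s)                    # size n(s)
v = len(set(s))               # effective sample size v(s), number of distinct units
d = [(i, Y[i]) for i in s]    # survey data: pairs (label, value) in order of appearance


def t(d, N):
    """Hypothetical estimator of the total: N times the mean over distinct units."""
    distinct = dict(d)        # keeps one value per distinct label
    return N * sum(distinct.values()) / len(distinct)


print(n, v)
print(d)
print(t(d, N))
```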

## DESIGN-BASED INFERENCE

Let $\Sigma_{1}$ be the sum over samples for which $|t(s, Y)-Y| \geq k>0$ and let $\Sigma_{2}$ be the sum over samples for which $|t(s, Y)-Y|<k$ for a fixed $Y$. Then from
$$\begin{aligned} M_{p}(t) &=\Sigma_{1} p(s)(t-Y)^{2}+\Sigma_{2} p(s)(t-Y)^{2} \\ & \geq k^{2} \operatorname{Prob}[|t(s, Y)-Y| \geq k] \end{aligned}$$
one derives the Chebyshev inequality:
$$\operatorname{Prob}[|t(s, Y)-Y| \geq k] \leq \frac{M_{p}(t)}{k^{2}} .$$
Hence
$$\operatorname{Prob}[t-k \leq Y \leq t+k] \geq 1-\frac{M_{p}(t)}{k^{2}}=1-\frac{1}{k^{2}}\left[V_{p}(t)+B_{p}^{2}(t)\right]$$
where $B_{p}(t)=E_{p}(t)-Y$ is the bias of $t$. Writing $\sigma_{p}(t)=\sqrt{V_{p}(t)}$ for the standard error of $t$ and taking $k=3 \sigma_{p}(t)$, it follows that, whatever $Y$ may be, the random interval $t \pm 3 \sigma_{p}(t)$ covers the unknown $Y$ with a probability not less than
$$\frac{8}{9}-\frac{1}{9} \frac{B_{p}^{2}(t)}{V_{p}(t)} .$$
So, to keep this probability high and the length of this covering interval small it is desirable that both $\left|B_{p}(t)\right|$ and $\sigma_{p}(t)$ be small, leading to a small $M_{p}(t)$ as well.
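
The bound can be illustrated on a small population by enumerating a design completely. The sketch below is an assumption-laden example: the values of `Y` and `X` are invented, the design is SRSWOR of size 2, and the estimator is the ratio estimator $t_{1}$, chosen because it is slightly biased. It computes $B_{p}(t)$, $V_{p}(t)$, $M_{p}(t)$, the exact coverage probability of $t \pm 3 \sigma_{p}(t)$, and the lower bound $\frac{8}{9}-\frac{1}{9} B_{p}^{2}(t) / V_{p}(t)$.

```python
from itertools import combinations
from math import sqrt

Y = {1: 12, 2: 7, 3: 30, 4: 9, 5: 22}     # made-up study values
X = {1: 10, 2: 8, 3: 25, 4: 10, 5: 20}    # made-up size measures
Y_total, X_total = sum(Y.values()), sum(X.values())

# SRSWOR of size 2: every pair of labels has the same probability.
samples = list(combinations(Y, 2))
design = {s: 1 / len(samples) for s in samples}


def t(s):
    """Ratio estimator t1 = X * (sum of Y over s) / (sum of X over s)."""
    return X_total * sum(Y[i] for i in s) / sum(X[i] for i in s)


E = sum(p * t(s) for s, p in design.items())                    # E_p(t)
B = E - Y_total                                                 # bias B_p(t)
V = sum(p * (t(s) - E) ** 2 for s, p in design.items())         # variance V_p(t)
M = sum(p * (t(s) - Y_total) ** 2 for s, p in design.items())   # mean square error M_p(t)

k = 3 * sqrt(V)
coverage = sum(p for s, p in design.items() if abs(t(s) - Y_total) <= k)
bound = 8 / 9 - B ** 2 / (9 * V)

print(B, V, M)
print(coverage, bound)    # coverage should not fall below the bound
```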

## Reproducible data analysis

Reproducible data analysis is much more than a fashionable buzzword. Under any situation where accountability is important, from scientific research to decision making in commercial enterprises, industrial quality control and safety and environmental impact assessments, being able to reproduce a data analysis reaching the same conclusions from the same data is crucial. Most approaches to reproducible data analysis are based on automating report generation and including, as part of the report, all the computer commands used to generate the results presented.

A fundamental requirement for reproducibility is a reliable record of what commands have been run on which data. Such a record is especially difficult to keep when issuing commands through menus and dialogue boxes in a graphical user interface or interactively at a console. Even working interactively at the R console using copy and paste to include commands and results in a report is error-prone and laborious.

A further requirement is to be able to match the output of the R commands to the input. If the script saves the output to separate files, then the user will need to take care that the script saved or shared as a record of the data analysis was the one actually used for obtaining the reported results and conclusions. This is another error-prone stage in the reporting of data analysis. To solve this problem an approach was developed, inspired by what is called literate programming (Knuth 1984). The idea is that running the script will produce a document that includes the listing of the R code used, the results of running this code and any explanatory text needed to understand and interpret the analysis.

Although a system capable of producing such reports with R, called ‘Sweave’ (Leisch 2002), has been available for a couple of decades, it was rather limited and not supported by an IDE, making its use rather tedious. A more recently developed system called ‘knitr’ (Xie 2013), together with its integration into RStudio, has made the use of this type of report very easy. The most recent development is what has been called R notebooks, produced within RStudio. This new feature can produce the readable report of running the script as an HTML file, displaying the code used interspersed with the results within the viewable file as in earlier approaches. However, this newer approach goes even further: the actual source script used to generate the report is embedded in the HTML file of the report and can be extracted and run very easily, and consequently reused. This means that anyone who gets access to the output of the analysis in human-readable form also gets access to the code used to generate the report, in computer-executable format.

When searching for answers, asking for advice or reading books, you will be confronted with different ways of approaching the same tasks. Do not allow this to overwhelm you; in most cases it will not matter, as many computations can be done in R, as in any language, in several different ways, still obtaining the same result. The different approaches may differ mainly in two aspects: 1) how readable to humans are the instructions given to the computer as part of a script or program, and 2) how fast the code runs. Unless computation time is an important bottleneck in your work, just concentrate on writing code that is easy for you and others to understand, and consequently easy to check and reuse. Of course, do always check any code you write for mistakes, preferably using actual numerical test cases for any complex calculation or even relatively simple scripts. Testing and validation are extremely important steps in data analysis, so get into this habit while reading this book. Testing how every function works, as I will challenge you to do in this book, is at the core of any robust data analysis or computer programming.

## Using R interactively

Decades ago, a physical terminal (keyboard plus text-only screen) was how users communicated with computers, and it was frequently called a console. Nowadays, a text-only interface to a computer, in most cases a window or a pane within a graphical user interface, is still called a console; in our case, it is the R console (Figure 1.1). This is the native user interface of R.

Typing commands at the R console is useful when one is playing around, rather aimlessly exploring things, or trying to understand how an R function or operator we are not familiar with works. Once we want to keep track of what we are doing, there are better ways of using R, which allow us to keep a record of how an analysis has been carried out. The different ways of using R are not exclusive of each other, so most users will use the R console to test individual commands and plot data during the first stages of exploration. As soon as we decide how we want to plot or analyze the data, it is best to start using scripts. This is not enforced in any way by R, but scripts are what really bring to light the most important advantages of using a programming language for data analysis. In Figure 1.1 we can see how the R console looks. The text in red has been typed in by the user, except for the prompt `>`, and the text in blue is what R has displayed in response. It is essentially a dialogue between user and R. The console can look different when displayed within an IDE like RStudio, but the only difference is in the appearance of the text rather than in the text itself (cf. Figures 1.1 and 1.2).

The two previous figures showed the result of entering a single command. Figure 1.3 shows how the console looks after the user has entered several commands, each as a separate line of text.

The examples in this book require only the console window for user input. Menu-driven programs are not necessarily bad; they are just unsuitable when there is a need to set very many options and choose from many different actions. They are also difficult to maintain when extensibility is desired, and when independently developed modules of very different characteristics need to be integrated. Textual languages also have the advantage, to be addressed in later chapters, that command sequences can be stored in human- and computer-readable text files. Such files constitute a record of all the steps used and, in most cases, make it trivial to reproduce the same steps at a later time. Scripts are a very simple and handy way of communicating to other users how to do a given data analysis.

## Editors and IDEs

Integrated Development Environments (IDEs) are used when developing computer programs. IDEs provide a centralized user interface from within which the different tools used to create and test a computer program can be accessed and used in coordination. Most IDEs include a dedicated editor capable of syntax highlighting, and even of reporting some mistakes, related to the programming language in use. One could describe such an editor as the equivalent of a word processor with spelling and grammar checking, one that can alert about spelling and syntax errors for a computer language like R instead of for a natural language like English. In the case of RStudio, the main, but not the only, language supported is R.

The main window of IDEs usually displays more than one pane simultaneously. From within the RStudio IDE, one has access to the R console, a text editor, a file-system browser, a pane for graphical output, and access to several additional tools such as for installing and updating extension packages. Although RStudio supports very well the development of large scripts and packages, it is currently, in my opinion, also the best possible way of using R at the console, as it has the R help system very well integrated both in the editor and the R console. Figure 1.6 shows the main window displayed by RStudio after running the same script as shown above at the R console (Figure 1.4) and at the operating system command prompt (Figure 1.5). We can see by comparing these three figures how RStudio is really a layer between the user and an unmodified R executable. The script was sourced by pressing the “Source” button at the top of the editor pane. RStudio, in response to this, generated the code needed to source the file and “entered” it at the console, the same console where we would type any R commands.

## R as a language

R is a computer language designed for data analysis and data visualization. However, in contrast to some other scripting languages, it is, from the point of view of computer programming, a complete language: it is not missing any important feature. In other words, no fundamental operations or data types are lacking (Chambers 2016). I attribute much of its success to the fact that its design achieves a very good balance between simplicity, clarity and generality. R excels at generality thanks to its extensibility at the cost of only a moderate loss of simplicity, while clarity is ensured by enforced documentation of extensions and support for both object-oriented and functional approaches to programming. The same three principles can also be easily respected by user code written in R.

As mentioned above, R started as a free and open-source implementation of the S language (Becker and Chambers 1984; Becker et al. 1988). We will describe the features of the R language in later chapters. Here I mention, for those with programming experience, that it does have some features that make it different from other frequently used programming languages. For example, R does not have the strict type checks of Pascal or C++. It has operators that can take vectors and matrices as operands, allowing more concise program statements for such operations than other languages. Writing programs, especially reliable and fast code, requires familiarity with some of these idiosyncrasies of the R language. For those using R interactively, or writing short scripts, these idiosyncratic features make life a lot easier by saving typing.

## R as a computer program

The R program itself is open-source, and the source code is available for anybody to inspect, modify and use. Only a small fraction of users will directly contribute improvements to the R program itself, but it is possible, and those contributions are important in making R reliable. The executable, the R program we actually use, can be built for different operating systems and computer hardware. The members of the R development team make an important effort to keep the results obtained from calculations done on all the different builds and computer architectures as consistent as possible. The aim is to ensure that computations return consistent results not only across updates to R but also across different operating systems like Linux, Unix (including OS X), and MS-Windows, and computer hardware.

The R program does not have a graphical user interface (GUI), or menus from which to start different types of analyses. Instead, the user types the commands at the R console (Figure 1.1). The same textual commands can also be saved into a text file, line by line, and such a file, called a “script”, can substitute for repeated typing of the same sequence of commands. When we work at the console typing in commands one by one, we say that we use R interactively. When we run a script, we may say that we run a “batch job.”

The two approaches described above are part of the R program by itself. However, it is common to use a second program as a front-end or middleman between the user and the R program. Such a program allows more flexibility and has multiple features that make entering commands or writing scripts easier. Computations are still done by exactly the same R program. The simplest option is to use a text editor like Emacs to edit the scripts and then run the scripts in R from within the editor. With some editors like Emacs, rather good integration is possible. However, nowadays there are also Integrated Development Environments (IDEs) available for R. An IDE both gives access to the R console in one window and provides a text editor for writing scripts in another window. Of the available IDEs for R, RStudio is currently the most popular by a wide margin.

## Probability Proportional to Size Without Replacement Sampling

In the probability proportional to size WOR (PPSWOR) sampling scheme, the probability of selecting $i_{1}$ at the first draw is $p_{i_{1}}(1)=p_{i_{1}}$. The probability of selecting $i_{2}$ at the second draw is $p_{i_{2}}(2)=\frac{p_{i_{2}}}{1-p_{i_{1}}}$ if a unit $i_{1}\left(i_{2} \neq i_{1}\right)$ was selected at the first draw, and $p_{i_{2}}(2)=0$ if the unit $i_{2}$ itself was selected at the first draw, i.e., $i_{2}=i_{1}$. In general, the probability of selecting $i_{k}$ at the $k$th draw is $p_{i_{k}}(k)=\frac{p_{i_{k}}}{1-p_{i_{1}}-p_{i_{2}}-\cdots-p_{i_{k-1}}}$ if the units $i_{1}, i_{2}, \ldots, i_{k-1}$, all different from $i_{k}$, were selected in the first $k-1$ draws, and $p_{i_{k}}(k)=0$ if the unit $i_{k}$ was selected in any of the first $k-1$ draws, for $k=2, \ldots, n$ and $i_{k}=1, \ldots, N$. So, for a PPSWOR sampling scheme, the probability of selecting $i_{1}$ at the first draw, $i_{2}$ at the second draw, and so on up to $i_{n}$ at the $n$th draw is
$$p\left(i_{1}, \ldots, i_{n}\right)=p_{i_{1}} \frac{p_{i_{2}}}{1-p_{i_{1}}} \cdots \frac{p_{i_{k}}}{1-p_{i_{1}}-\cdots-p_{i_{k-1}}} \cdots \frac{p_{i_{n}}}{1-p_{i_{1}}-\cdots-p_{i_{n-1}}} \quad \text { for } 1 \leq i_{1} \neq i_{2} \neq \cdots \neq i_{n} \leq N .$$
It should be noted that PPSWOR reduces to the SRSWOR sampling scheme if $p_{i}=1 / N$ for $i=1, \ldots, N$.
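
A draw-by-draw implementation and the probability formula above can be sketched as follows. The function names and the values in `p` are illustrative assumptions; units are indexed from 0 for convenience.

```python
import random
from itertools import permutations


def ppswor_draw(p, n, rng):
    """Draw n distinct units one at a time; at each draw a remaining unit i
    is selected with probability p[i] / (1 - sum of p over units already drawn)."""
    remaining = list(range(len(p)))
    sample = []
    for _ in range(n):
        weights = [p[i] for i in remaining]   # rng.choices renormalizes the weights
        chosen = rng.choices(remaining, weights=weights, k=1)[0]
        sample.append(chosen)
        remaining.remove(chosen)
    return sample


def ppswor_probability(sample, p):
    """p(i_1, ..., i_n) for an ordered sample of distinct units."""
    prob, used = 1.0, 0.0
    for i in sample:
        prob *= p[i] / (1.0 - used)
        used += p[i]
    return prob


p = [0.1, 0.2, 0.3, 0.4]                      # made-up normed sizes
sample = ppswor_draw(p, 2, random.Random(3))
print(sample, ppswor_probability(sample, p))

# With equal p_i the formula gives 1 / (N (N - 1)) for every ordered pair (SRSWOR).
equal = [0.25] * 4
print(ppswor_probability((0, 1), equal))
```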

## HANURAV’S ALGORITHM

Hanurav (1966) established a correspondence between a sampling design and a sampling scheme. He proved that any sampling scheme results in a sampling design. Similarly, for a given sampling design, one can construct at least one sampling scheme, which can implement the sampling design. In fact, Hanurav proposed the most general sampling scheme, known as Hanurav’s algorithm, using which one can derive various types of sampling schemes or sampling designs. Henceforth, we will not differentiate between the terms “sampling design” and “sampling scheme”.

Let $n_{0}$ denote the maximum sample size that might be required from a sampling scheme. Then, Hanurav’s (1966) algorithm is defined as follows:
$$\mathscr{A}=\mathscr{A}\left\{q_{1}(i) ; q_{2}(s) ; q_{3}(s, i)\right\}$$
where
(i) $0 \leq q_{1}(i) \leq 1$ for $i=1, \ldots, N$, and $\sum_{i=1}^{N} q_{1}(i)=1$;
(ii) $0 \leq q_{2}(s) \leq 1$ for any sample $s \in \mathscr{S}$, where $\mathscr{S}$ is the set of all possible samples;
(iii) $q_{3}(s, i)$ is defined when $q_{2}(s)>0$, subject to $0 \leq q_{3}(s, i) \leq 1$ for $i=1, \ldots, N$, and
$$\sum_{i=1}^{N} q_{3}(s, i)=1 .$$
Samples are selected using the following steps:
Step 1: At the first draw a unit $i_{1}$ is selected with probability $q_{1}\left(i_{1}\right)$; $i_{1}=1, \ldots, N$.

Step 2: In this step, we decide whether the sampling procedure will be terminated or continued. Let $s_{(1)}=i_{1}$ be the unit selected in the first draw. A Bernoulli trial is performed with success probability $q_{2}\left(s_{(1)}\right)$. If the trial results in a failure, the sampling procedure is terminated and the selected sample is $s_{(1)}=i_{1}$. On the other hand, if the trial results in a success, we go to Step 3.
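
The following sketch implements Steps 1 and 2 and then makes an explicit assumption about the continuation, which the text above does not spell out: after a success, the next unit $i$ is assumed to be selected with probability $q_{3}(s, i)$, where $s$ is the sequence drawn so far, and the Bernoulli decision of Step 2 is then repeated with $q_{2}$ evaluated at the extended sequence. The function name and the example choice of $q_{1}, q_{2}, q_{3}$ (which is meant to mimic SRSWOR of size 3) are hypothetical.

```python
import random


def hanurav_sample(q1, q2, q3, N, rng):
    """Draw one sample with the algorithm A{q1(i); q2(s); q3(s, i)}."""
    units = list(range(1, N + 1))
    # Step 1: first unit, selected with probability q1(i).
    first = rng.choices(units, weights=[q1(i) for i in units], k=1)[0]
    s = (first,)
    # Step 2: Bernoulli trial with success probability q2(s); failure terminates.
    while rng.random() < q2(s):
        # Assumed continuation: next unit selected with probability q3(s, i).
        nxt = rng.choices(units, weights=[q3(s, i) for i in units], k=1)[0]
        s = s + (nxt,)
    return s


# Example choice of q1, q2, q3 intended to reproduce SRSWOR of size n0 = 3.
N, n0 = 6, 3
q1 = lambda i: 1 / N
q2 = lambda s: 1.0 if len(s) < n0 else 0.0
q3 = lambda s, i: 0.0 if i in s else 1 / (N - len(s))

print(hanurav_sample(q1, q2, q3, N, random.Random(11)))
```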

## Sampling and Nonsampling Errors

Obviously, using the complete enumeration method, we get the correct value of the parameter, provided all the $y$-values of the population obtained are correct. This would mean that there is no nonresponse, i.e., a response from each unit is obtained, and there is no measurement error in measuring $y$-values. However, in practice, at least for a large-scale survey, nonresponse is unavoidable, and $y$-values are also subject to error because the respondents report untrue values, especially when $y$-values relate to confidential characteristics such as income and age. The error in a survey that originates from nonresponse or incorrect measurement of $y$-values is termed the nonsampling error. The nonsampling errors increase with the sample size.
From a sample survey, we cannot get the true value of the parameter because we surveyed only a sample, which is just a part of the population. The error committed by making inference by surveying a part of the population is known as the sampling error. In complete enumeration, sampling error is absent, but complete enumeration is subject to more nonsampling error than sample surveys. When the population is large, complete enumeration is not possible as it is very expensive, time-consuming, and requires many trained investigators. The advantages of sample surveys over complete enumeration were advocated by Mahalanobis (1946), Cochran (1977), and Murthy (1977), to name a few.

## Cumulative Total Method

Here we label all possible samples in $\mathscr{S}$ as $s_{1}, \ldots, s_{i}, \ldots, s_{M}$, where $M$ is the total number of samples in $\mathscr{S}$. Then we calculate the cumulative totals $T_{i}=p\left(s_{1}\right)+\cdots+p\left(s_{i}\right)$ for $i=1, \ldots, M$ and select a random number $R$ (say) from a uniform distribution on $(0,1)$. This can be done by choosing a five-digit random number and placing a decimal point before it. The sample $s_{k}$ is selected if $T_{k-1}<R \leq T_{k}$, for $k=1, \ldots, M$ with $T_{0}=0$.

Example 1.4.1
Let $U=(1,2,3,4)$; $s_{1}=(1,1,2)$, $s_{2}=(1,2,2)$, $s_{3}=(3,2)$, $s_{4}=(4)$; $p\left(s_{1}\right)=0.25$, $p\left(s_{2}\right)=0.30$, $p\left(s_{3}\right)=0.20$, and $p\left(s_{4}\right)=0.25$.
$$\begin{array}{lllll} s & s_{1} & s_{2} & s_{3} & s_{4} \\ p(s) & 0.25 & 0.30 & 0.20 & 0.25 \\ T_{k} & 0.25 & 0.55 & 0.75 & 1 \end{array}$$
Let a random number $R=0.34802$ be selected from a uniform distribution on $(0,1)$. The sample $s_{2}$ is selected because $T_{1}=0.25<R=0.34802 \leq T_{2}=0.55$.
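
The selection rule can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original text: the function name `cumulative_total_select` is hypothetical, and the samples and probabilities are those of Example 1.4.1.

```python
from itertools import accumulate

def cumulative_total_select(samples, probs, r):
    """Return the sample s_k with T_{k-1} < r <= T_k, where T_k are cumulative totals."""
    totals = list(accumulate(probs))
    for sample, t_k in zip(samples, totals):
        if r <= t_k:
            return sample
    return samples[-1]  # guards against rounding when r is very close to 1

samples = [(1, 1, 2), (1, 2, 2), (3, 2), (4,)]
probs = [0.25, 0.30, 0.20, 0.25]
chosen = cumulative_total_select(samples, probs, 0.34802)
print(chosen)  # (1, 2, 2), i.e. s_2
```

In a real draw, `r` would come from a uniform random number generator instead of being fixed at 0.34802.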

The cumulative total method described above, however, cannot be used in practice because we have to list all the possible samples having positive probabilities. For example, suppose we need to select a sample of size $n=15$ from a population of size $N=30$ following a sampling design in which all possible samples of size 15 have positive probabilities. We would need to list $M=\binom{30}{15}=155{,}117{,}520$ possible samples, which is obviously a huge number.
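
To get a feel for the size of this list, the count can be computed directly. The sketch below is only an illustration in Python; the variable names are hypothetical and the storage figure assumes one byte per unit label.

```python
import math

n_population, n_sample = 30, 15

# number of distinct samples of size 15 from 30 units (unordered, without repetition)
n_samples = math.comb(n_population, n_sample)
print(f"{n_samples:,}")  # 155,117,520

# rough storage needed to hold every sample, assuming one byte per unit label
bytes_needed = n_samples * n_sample
print(f"{bytes_needed / 1e9:.1f} GB")
```

Even for this modest population the list runs to roughly 2.3 GB, which is why sampling schemes that select units draw by draw are preferred.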

## Preliminaries and Basics of Probability Sampling

Government organizations, researchers, sociologists, and businesses often conduct surveys to answer specific questions that cannot be answered merely through laboratory experiments or by economic, mathematical, or statistical formulation alone. For example, knowledge of the proportion of unemployed people, the proportion of people below the poverty line, and the extent of child labor in a certain locality is very important for formulating proper economic planning. To answer such questions, we very often conduct surveys on sections of the people of the locality. Drawing inference about an aggregate (population) on the basis of a sample, a part of the population, is a natural instinct of human beings. Surveys should be conducted in such a way that their results can be interpreted objectively in terms of probability, so that inference about the population has a valid statistical basis. To achieve valid statistical inference, one needs to select samples using a suitable sampling procedure, and the collected data should be analyzed appropriately. In this book, we discuss various sample selection procedures, methods of data collection and data analysis, and their applications under various circumstances. The statistical theory behind such procedures is also studied in detail.

In this chapter we introduce some of the basic definitions and terminology of survey sampling, such as population, unit, sample, sampling design, and sampling scheme. Various methods of sample selection, as well as Hanurav’s algorithm, which gives the correspondence between a sampling design and a sampling scheme, are also discussed.

## Parameter and Parameter Space

For a given population $U$, we may be interested in studying certain characteristics of it. Such characteristics are known as study variables. When considering a population of students in a certain class, we may be interested in their age, height, racial group, economic condition, marks in different subjects, and so forth. Each of the variables under study is called a study variable and will be denoted by $y$. Let $y_{i}$ be the value of the study variable $y$ for the $i$th unit of the population $U$, which is generally not known before the survey. The $N$-dimensional vector $\mathbf{y}=\left(y_{1}, \ldots, y_{i}, \ldots, y_{N}\right)$ is known as a parameter of the population $U$ with respect to the characteristic $y$. The set of all possible values of the vector $\mathbf{y}$ is the $N$-dimensional Euclidean space $R^{N}=\left\{\left(y_{1}, \ldots, y_{N}\right):-\infty<y_{i}<\infty, \ i=1, \ldots, N\right\}$, and it is known as the parameter space. In most cases we are not interested in the parameter $\mathbf{y}$ itself but in a certain parametric function of $\mathbf{y}$, such as $Y=\sum_{i=1}^{N} y_{i}=$ population total, $\bar{Y}=\frac{1}{N} \sum_{i=1}^{N} y_{i}=$ population mean, $S_{Y}^{2}=\frac{1}{N-1} \sum_{i=1}^{N}\left(y_{i}-\bar{Y}\right)^{2}=$ population variance, $C_{y}=S_{Y} / \bar{Y}=$ population coefficient of variation, and so forth.
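
As a small illustration, the parametric functions listed above can be computed for a made-up population. The sketch below is in Python; the function name `population_summaries` and the five $y$-values are hypothetical and serve only to show the formulas at work.

```python
import math

def population_summaries(y):
    """Return total, mean, variance (divisor N - 1) and coefficient of variation."""
    n_units = len(y)
    total = sum(y)
    mean = total / n_units
    variance = sum((v - mean) ** 2 for v in y) / (n_units - 1)
    cv = math.sqrt(variance) / mean
    return total, mean, variance, cv

y = [12.0, 15.0, 9.0, 20.0, 14.0]  # hypothetical y-values for N = 5 units
total, mean, variance, cv = population_summaries(y)
print(total, mean, variance, round(cv, 4))
```

Here the total is 70, the mean is 14, and the variance is $66/4=16.5$, since the squared deviations from the mean are 4, 1, 25, 36, and 0.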

## The distribution of X

EXERCISE 7.1. (a) We plug $\lambda_{n}=n \lambda$ and $\mu_{n}=0$ into (7.1), and note that $n$ starts with 1 and not 0. The Kolmogorov forward equations become $P_{1}^{\prime}(t)=-\lambda P_{1}(t)$ and $P_{n}^{\prime}(t)=(n-1) \lambda P_{n-1}(t)-n \lambda P_{n}(t)$, $n=2,3, \ldots$, with the initial condition $P_{1}(0)=1$.
(b) To show that $P_{n}(t)=e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-1}$, $n=1,2, \ldots$, solve the Kolmogorov equations, we write $P_{1}(t)=e^{-\lambda t}$, so $P_{1}^{\prime}(t)=-\lambda e^{-\lambda t}=-\lambda P_{1}(t)$. Also, for $n \geq 2$,
$$\begin{aligned} P_{n}^{\prime}(t) &=-\lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-1}+e^{-\lambda t}(n-1) \lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-2} \\ &=-\lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-1}+e^{-\lambda t}(n-1) \lambda\left(e^{-\lambda t}-1+1\right)\left(1-e^{-\lambda t}\right)^{n-2} \\ &=-\lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-1}-(n-1) \lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-1}+(n-1) \lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-2} \\ &=(n-1) \lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-2}-n \lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-1} \\ &=(n-1) \lambda P_{n-1}(t)-n \lambda P_{n}(t). \end{aligned}$$
(c) The distribution of $X(t)$ is geometric: it models the number of trials until the first success, where the probability of success is $p=e^{-\lambda t}$. Therefore, $E(X(t))=\frac{1}{p}=e^{\lambda t}$, and $\operatorname{Var}(X(t))=\frac{1-p}{p^{2}}=\frac{1-e^{-\lambda t}}{e^{-2 \lambda t}}=e^{\lambda t}\left(e^{\lambda t}-1\right)$.
(d) If $\lambda=4$, the probability that there will be between 3 and 5 particles at week 1 is $P_{3}(1)+P_{4}(1)+P_{5}(1)=e^{-4}\left(1-e^{-4}\right)^{3-1}+e^{-4}\left(1-e^{-4}\right)^{4-1}+e^{-4}\left(1-e^{-4}\right)^{5-1}=0.051989$. The mean at week 1 is $E(X(1))=e^{4}=54.59815$, and the standard deviation is $\sqrt{\operatorname{Var}(X(1))}=\sqrt{e^{4}\left(e^{4}-1\right)}=54.09584$.
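
The numbers in part (d) are easy to reproduce. The following Python sketch is only a check of the arithmetic under the geometric form from part (b); the function name `yule_pmf` is hypothetical.

```python
import math

def yule_pmf(n, lam, t):
    """P_n(t) = e^{-lam t} (1 - e^{-lam t})^(n-1) for a process started from one particle."""
    p = math.exp(-lam * t)
    return p * (1.0 - p) ** (n - 1)

lam, t = 4.0, 1.0
prob_3_to_5 = sum(yule_pmf(n, lam, t) for n in (3, 4, 5))
mean = math.exp(lam * t)
sd = math.sqrt(math.exp(lam * t) * (math.exp(lam * t) - 1.0))
print(round(prob_3_to_5, 6), round(mean, 5), round(sd, 5))
```

The three printed values should agree with 0.051989, 54.59815, and 54.09584 up to rounding.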
EXERCISE 7.2. (a) We plug $\lambda_{n}=n \lambda$ and $\mu_{n}=0$ into (7.1) and note that $n$ starts with $m$ and not 0. The Kolmogorov forward equations become $P_{m}^{\prime}(t)=-m \lambda P_{m}(t)$ and $P_{n}^{\prime}(t)=(n-1) \lambda P_{n-1}(t)-n \lambda P_{n}(t)$, $n=m+1, m+2, \ldots$, with the initial condition $P_{m}(0)=1$.
(b) To verify that $P_{n}(t)=\binom{n-1}{n-m} e^{-m \lambda t}\left(1-e^{-\lambda t}\right)^{n-m}$, $n=m, m+1, \ldots$, solve the Kolmogorov equations, we write $P_{m}(t)=e^{-m \lambda t}$, so $P_{m}^{\prime}(t)=-m \lambda e^{-m \lambda t}=-m \lambda P_{m}(t)$. Further, using $(n-m)\binom{n-1}{n-m}=(n-1)\binom{n-2}{n-m-1}$ in the fourth line,
$$\begin{aligned} P_{n}^{\prime}(t) &=-m \lambda\binom{n-1}{n-m} e^{-m \lambda t}\left(1-e^{-\lambda t}\right)^{n-m}+\binom{n-1}{n-m} e^{-m \lambda t}(n-m) \lambda e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{n-m-1} \\ &=-m \lambda P_{n}(t)+\binom{n-1}{n-m} e^{-m \lambda t}(n-m) \lambda\left(e^{-\lambda t}-1+1\right)\left(1-e^{-\lambda t}\right)^{n-m-1} \\ &=-m \lambda P_{n}(t)-(n-m) \lambda P_{n}(t)+(n-m)\binom{n-1}{n-m} \lambda e^{-m \lambda t}\left(1-e^{-\lambda t}\right)^{n-m-1} \\ &=-n \lambda P_{n}(t)+(n-1) \lambda\binom{n-2}{n-m-1} e^{-m \lambda t}\left(1-e^{-\lambda t}\right)^{n-m-1} \\ &=(n-1) \lambda P_{n-1}(t)-n \lambda P_{n}(t). \end{aligned}$$
(c) The distribution of $X(t)$ is negative binomial: it models the number of trials until the $m$th success, where the probability of success is $p=e^{-\lambda t}$. Therefore, the mean and the variance are $E(X(t))=\frac{m}{p}=m e^{\lambda t}$ and $\operatorname{Var}(X(t))=\frac{m(1-p)}{p^{2}}=m e^{\lambda t}\left(e^{\lambda t}-1\right)$.
(d) With $m=5$ and $\lambda=0.2$, $P_{12}(2)=\binom{12-1}{12-5} e^{-(5)(0.2)(2)}\left(1-e^{-(0.2)(2)}\right)^{12-5}=0.0189$. The mean and standard deviation are $E(X(2))=(5) e^{(0.2)(2)}=7.459123$ and $\sqrt{\operatorname{Var}(X(2))}=\sqrt{(5) e^{0.4}\left(e^{0.4}-1\right)}=1.915354$.
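
A short Python check of parts (c) and (d) is sketched below. It assumes the values $m=5$, $\lambda=0.2$, $t=2$ read off from the formula in part (d); the function name `negbin_pmf` is hypothetical. It also confirms numerically that $m(1-p)/p^{2}$ and $m e^{\lambda t}\left(e^{\lambda t}-1\right)$ are the same quantity.

```python
import math

def negbin_pmf(n, m, lam, t):
    """P_n(t) for the process started from m particles (negative binomial form)."""
    p = math.exp(-lam * t)
    return math.comb(n - 1, n - m) * p ** m * (1.0 - p) ** (n - m)

m, lam, t = 5, 0.2, 2.0
p = math.exp(-lam * t)
p12 = negbin_pmf(12, m, lam, t)
mean = m / p
var_from_p = m * (1.0 - p) / p ** 2
var_closed = m * math.exp(lam * t) * (math.exp(lam * t) - 1.0)

print(round(p12, 4), round(mean, 6), round(math.sqrt(var_from_p), 6))
print(math.isclose(var_from_p, var_closed))  # True: the two variance expressions agree
```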

## In the Kolmogorov forward equations

EXERCISE 7.3. (a) In the Kolmogorov forward equations (7.1), we use $\lambda_{n}=0$ and $\mu_{n}=n \mu$, and the fact that the initial population size is $N$. We write $P_{N}^{\prime}(t)=-N \mu P_{N}(t)$ and $P_{n}^{\prime}(t)=(n+1) \mu P_{n+1}(t)-n \mu P_{n}(t)$, $n=0,1, \ldots, N-1$, with the initial condition $P_{N}(0)=1$.
(b) The probabilities $P_{n}(t)=\binom{N}{n} e^{-n \mu t}\left(1-e^{-\mu t}\right)^{N-n}$, $n=0, \ldots, N$, solve the Kolmogorov forward equations since $P_{N}(t)=e^{-N \mu t}$ and so $P_{N}^{\prime}(t)=-N \mu e^{-N \mu t}=-N \mu P_{N}(t)$. Also, using $(N-n)\binom{N}{n}=(n+1)\binom{N}{n+1}$,
$$\begin{aligned} P_{n}^{\prime}(t) &=-n \mu\binom{N}{n} e^{-n \mu t}\left(1-e^{-\mu t}\right)^{N-n}+\binom{N}{n} e^{-n \mu t}(N-n) \mu e^{-\mu t}\left(1-e^{-\mu t}\right)^{N-n-1} \\ &=-n \mu P_{n}(t)+(n+1) \mu\binom{N}{n+1} e^{-(n+1) \mu t}\left(1-e^{-\mu t}\right)^{N-(n+1)} \\ &=(n+1) \mu P_{n+1}(t)-n \mu P_{n}(t). \end{aligned}$$
(c) The distribution of $X(t)$ is binomial with parameters $N$ and $p=e^{-\mu t}$. Therefore, $E(X(t))=N p=N e^{-\mu t}$, and $\operatorname{Var}(X(t))=N p(1-p)=N e^{-\mu t}\left(1-e^{-\mu t}\right)$.
(d) $P_{12}(3)=\binom{15}{12} e^{-(12)(0.02)(3)}\left(1-e^{-(0.02)(3)}\right)^{15-12}=0.0437$. The mean and standard deviation are $E(X(3))=15 e^{-(0.02)(3)}=14.12647$ and $\sqrt{\operatorname{Var}(X(3))}=\sqrt{15 e^{-(0.02)(3)}\left(1-e^{-(0.02)(3)}\right)}=0.907007$.
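
Because the pure death process has the finite state space $0, \ldots, N$, the forward equations of part (a) can also be integrated numerically and compared with the binomial solution of part (b). The Python sketch below does this with a classical fourth-order Runge-Kutta scheme; the integrator, the step count, and all function names are choices made for this illustration, not part of the original solution.

```python
import math

def forward_rhs(p, mu):
    """Right-hand side of the forward equations; p[n] is P_n(t) for n = 0..N."""
    big_n = len(p) - 1
    rhs = []
    for n in range(big_n + 1):
        inflow = (n + 1) * mu * p[n + 1] if n < big_n else 0.0
        rhs.append(inflow - n * mu * p[n])
    return rhs

def rk4_solve(big_n, mu, t_end, steps=300):
    """Integrate the forward equations from P_N(0) = 1 with classical RK4."""
    p = [0.0] * big_n + [1.0]
    h = t_end / steps
    for _ in range(steps):
        k1 = forward_rhs(p, mu)
        k2 = forward_rhs([a + 0.5 * h * b for a, b in zip(p, k1)], mu)
        k3 = forward_rhs([a + 0.5 * h * b for a, b in zip(p, k2)], mu)
        k4 = forward_rhs([a + h * b for a, b in zip(p, k3)], mu)
        p = [a + h / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(p, k1, k2, k3, k4)]
    return p

def closed_form(n, big_n, mu, t):
    """Binomial solution P_n(t) with success probability e^{-mu t}."""
    q = math.exp(-mu * t)
    return math.comb(big_n, n) * q ** n * (1.0 - q) ** (big_n - n)

numeric = rk4_solve(15, 0.02, 3.0)
exact = closed_form(12, 15, 0.02, 3.0)
print(round(numeric[12], 4), round(exact, 4))  # both 0.0437
```

The numerical and closed-form values of $P_{12}(3)$ agree to many decimal places, which is a useful sanity check on the derivation in part (b).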
EXERCISE 7.4. (a) We are given that $\lambda=1.3$ and $\mu=0.2$. We need to compute $P_{n}(t)$ for $n=4$ and $t=2$:
$$P_{4}(2)=\left(1-P_{0}\right)\left(1-\frac{\lambda}{\mu} P_{0}\right)\left(\frac{\lambda}{\mu} P_{0}\right)^{n-1}=\left(1-P_{0}\right)\left(1-\frac{1.3}{0.2} P_{0}\right)\left(\frac{1.3}{0.2} P_{0}\right)^{4-1}$$
where
$$P_{0}=\frac{\mu e^{(\lambda-\mu) t}-\mu}{\lambda e^{(\lambda-\mu) t}-\mu}=\frac{0.2 e^{(1.3-0.2)(2)}-0.2}{1.3 e^{(1.3-0.2)(2)}-0.2}=0.139172 .$$
Thus, $P_{4}(2)=0.060783$. The mean and variance are $E(X(2))=e^{(\lambda-\mu) t}=e^{(1.3-0.2)(2)}=9.025013$ and $\operatorname{Var}(X(2))=\frac{\lambda+\mu}{\lambda-\mu} e^{(\lambda-\mu) t}\left(e^{(\lambda-\mu) t}-1\right)=\frac{1.3+0.2}{1.3-0.2} e^{(1.3-0.2)(2)}\left(e^{(1.3-0.2)(2)}-1\right)=98.76253$.
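
The arithmetic in part (a) can be reproduced with the Python sketch below. The function names are hypothetical, and the formulas are exactly those quoted above for a process that starts in state 1.

```python
import math

def extinction_prob(lam, mu, t):
    """P_0(t) for the linear birth-and-death process started in state 1."""
    g = math.exp((lam - mu) * t)
    return (mu * g - mu) / (lam * g - mu)

def state_prob(n, lam, mu, t):
    """P_n(t) for n >= 1, using the geometric-type form quoted in part (a)."""
    p0 = extinction_prob(lam, mu, t)
    r = lam / mu * p0
    return (1.0 - p0) * (1.0 - r) * r ** (n - 1)

lam, mu, t = 1.3, 0.2, 2.0
p0 = extinction_prob(lam, mu, t)
p4 = state_prob(4, lam, mu, t)
mean = math.exp((lam - mu) * t)
var = (lam + mu) / (lam - mu) * mean * (mean - 1.0)
print(round(p0, 6), round(p4, 6), round(mean, 6), round(var, 5))
```

The printed values should match 0.139172, 0.060783, 9.025013, and 98.76253 up to rounding.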
(b) Below we simulate a 50-step trajectory of the process that starts in state 1 and has parameters $\lambda=1.3$ and $\mu=0.2$.

```r
# specifying parameters
lambda <- 1.3
mu <- 0.2
njumps <- 50

# defining state and time as vectors
N <- c()
time <- c()

# setting initial values
N[1] <- 1
time[1] <- 0

# specifying seed
set.seed(353332)
```
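
A trajectory of this kind can be sketched in Python as follows. This is a minimal sketch and not the original listing: the function name `simulate_birth_death` is hypothetical, the holding time in state $n>0$ is drawn as a single exponential with rate $n(\lambda+\mu)$ instead of two competing clocks, and reusing the seed value does not reproduce the random numbers that R would generate.

```python
import random

def simulate_birth_death(lam, mu, n0, njumps, seed=None):
    """Simulate jump times and states of a linear birth-and-death process.

    In state n > 0 the holding time is exponential with rate n * (lam + mu),
    and the jump is a birth with probability lam / (lam + mu).
    State 0 is absorbing, so the path stops early if it is reached.
    """
    rng = random.Random(seed)
    states, times = [n0], [0.0]
    for _ in range(njumps):
        n = states[-1]
        if n == 0:
            break
        times.append(times[-1] + rng.expovariate(n * (lam + mu)))
        states.append(n + 1 if rng.random() < lam / (lam + mu) else n - 1)
    return times, states

times, states = simulate_birth_death(1.3, 0.2, n0=1, njumps=50, seed=353332)
for t, n in list(zip(times, states))[:5]:
    print(f"t = {t:.3f}, N = {n}")
```

With $\lambda=1.3$ and $\mu=0.2$ most paths grow quickly, but a path can die out after a few jumps, in which case fewer than 50 jumps are recorded.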

## the queue will accumulate faster

EXERCISE 7.5. (a) If $\lambda>\mu$, the queue will accumulate faster than customers go through the server, and so we expect an infinite number of customers in the system in the long run.
(b) For $\lambda=3$ and $\mu=5$, the long-run probability that there will be more than 2 customers in the system is $P(\#\text{ of customers}>2)=1-P_{0}-P_{1}-P_{2}=1-\left(1-\frac{\lambda}{\mu}\right)\left[1+\frac{\lambda}{\mu}+\left(\frac{\lambda}{\mu}\right)^{2}\right]=1-\left(1-\frac{\lambda}{\mu}\right) \frac{1-\left(\frac{\lambda}{\mu}\right)^{3}}{1-\frac{\lambda}{\mu}}=\left(\frac{\lambda}{\mu}\right)^{3}=\left(\frac{3}{5}\right)^{3}=0.216$.
(c) In the long run, the average number of customers in the system is
$$\lim _{t \rightarrow \infty} E(X(t))=\frac{\lambda}{\mu-\lambda}=\frac{3}{5-3}=1.5 .$$
(d) In the long run, the proportion of customers in the system who have to wait more than 1 minute is $P(T>1)=e^{-(\mu-\lambda) t}=e^{-(5-3)(1)}=0.135335$, or roughly $13.5 \%$.
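
The long-run quantities in parts (b) through (d) can be checked with a few lines of Python. The sketch below is only an illustration: the function name `queue_long_run` is hypothetical, and it uses the stationary probabilities $P_{n}=\left(1-\frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^{n}$ that appear in part (b), which require $\lambda<\mu$.

```python
import math

def queue_long_run(lam, mu, k, t):
    """Long-run summaries of the queue for arrival rate lam < service rate mu."""
    if lam >= mu:
        raise ValueError("a stationary distribution exists only when lam < mu")
    rho = lam / mu
    # P(more than k customers) = 1 - sum_{n=0}^{k} (1 - rho) * rho**n
    p_more_than_k = 1.0 - sum((1.0 - rho) * rho ** n for n in range(k + 1))
    mean_customers = lam / (mu - lam)
    p_time_exceeds_t = math.exp(-(mu - lam) * t)
    return p_more_than_k, mean_customers, p_time_exceeds_t

p_gt2, mean_customers, p_wait = queue_long_run(3.0, 5.0, k=2, t=1.0)
print(round(p_gt2, 3), mean_customers, round(p_wait, 6))
```

Summing the first three stationary probabilities explicitly, as done here, gives the same value as the shortcut $\left(\frac{\lambda}{\mu}\right)^{3}$ used in part (b).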

EXERCISE 7.6. (a) Below we simulate a trajectory of a birth-and-death process with immigration and emigration, with parameters $\lambda=1$, $\mu=0.2$, $\alpha=0.3$, and $\beta=0.1$. The trajectory starts in state 10 and ends in state 25.

```r
# specifying parameters
lambda <- 1
mu <- 0.2
alpha <- 0.3
beta <- 0.1

# defining state and time as vectors
N <- c()
time <- c()

# setting initial values
N[1] <- 10
time[1] <- 0

# specifying seed
set.seed(93743765)

# simulating trajectory
# NOTE: the listing breaks off at the `else`; the second half of each branch
# and the death branch are reconstructed by symmetry (an assumption)
i <- 2
repeat {
  # competing exponential clocks for the next arrival and the next departure
  time.birth <- (-1/(N[i-1]*lambda+alpha))*log(1-runif(1))
  time.death <- (-1/(N[i-1]*mu+beta))*log(1-runif(1))
  if (time.birth < time.death | N[i-1]==0) {
    # record the old state just before the jump, then the jump itself
    time[i] <- time[i-1]+time.birth-0.001
    N[i] <- N[i-1]
    time[i+1] <- time[i]+0.001
    N[i+1] <- N[i]+1
  } else {
    time[i] <- time[i-1]+time.death-0.001
    N[i] <- N[i-1]
    time[i+1] <- time[i]+0.001
    N[i+1] <- N[i]-1
  }
  i <- i+2
  if (N[i-1]==25) break
}
```


