Business Analytics Assignment Help - Statistics Assignment Help, Q&A, and Tutoring

Category: Business Analytics

Statistical Modelling for Business | Describing Central Tendency

Posted on April 18, 2022 by statistics-lab


The mean, median, and mode

In addition to describing the shape of the distribution of a sample or population of measurements, we also describe the data set’s central tendency. A measure of central tendency represents the center or middle of the data. Sometimes we think of a measure of central tendency as a typical value. However, as we will see, not all measures of central tendency are necessarily typical values.

One important measure of central tendency for a population of measurements is the population mean. We define it as follows: the population mean, denoted $\mu$, is the average of the population measurements. More precisely, the population mean is calculated by adding all the population measurements and then dividing the resulting sum by the number of population measurements. For instance, suppose that Chris is a college junior majoring in business. This semester Chris is taking five classes and the numbers of students enrolled in the classes (that is, the class sizes) are as follows:

60, 41, 15, 30, 34

The mean $\mu$ of this population of class sizes is
$$
\mu=\frac{60+41+15+30+34}{5}=\frac{180}{5}=36
$$
Because this population of five class sizes is small, it is possible to compute the population mean. Often, however, a population is very large and we cannot obtain a measurement for each population element. Therefore, we cannot compute the population mean. In such a case, we must estimate the population mean by using a sample of measurements.
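The calculation above can be reproduced in a few lines of Python. This is only a minimal sketch: the variable names `class_sizes` and `mu` are illustrative choices, not part of the text.

```python
# Class sizes for Chris's five classes (the small population from the text).
class_sizes = [60, 41, 15, 30, 34]

# Population mean: add all population measurements,
# then divide by the number of measurements.
mu = sum(class_sizes) / len(class_sizes)

print(mu)  # 36.0
```

Because every population element is available here, `mu` is the population mean itself and not an estimate of it.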

In order to understand how to estimate a population mean, we must realize that the population mean is a population parameter.

The Car Mileage Case: Estimating Mileage

In order to offer its tax credit, the federal government has decided to define the “typical” EPA combined city and highway mileage for a car model as the mean $\mu$ of the population of EPA combined mileages that would be obtained by all cars of this type. Here, using the mean to represent a typical value is probably reasonable. We know that some individual cars will get mileages that are lower than the mean and some will get mileages that are above it. However, because there will be many thousands of these cars on the road, the mean mileage obtained by these cars is probably a reasonable way to represent the model’s overall fuel economy. Therefore, the government will offer its tax credit to any automaker selling a midsize model equipped with an automatic transmission that achieves a mean EPA combined mileage of at least $31 \mathrm{mpg}$.

To demonstrate that its new midsize model qualifies for the tax credit, the automaker in this case study wishes to use the sample of 50 mileages in Table $3.1$ to estimate $\mu$, the model’s mean mileage. Before calculating the mean of the entire sample of 50 mileages, we will illustrate the formulas involved by calculating the mean of the first five of these mileages.

Table $3.1$ tells us that $x_{1}=30.8, x_{2}=31.7, x_{3}=30.1, x_{4}=31.6$, and $x_{5}=32.1$, so the sum of the first five mileages is
$$
\begin{aligned}
\sum_{i=1}^{5} x_{i} &=x_{1}+x_{2}+x_{3}+x_{4}+x_{5} \\
&=30.8+31.7+30.1+31.6+32.1=156.3
\end{aligned}
$$
Therefore, the mean of the first five mileages is
$$
\bar{x}=\frac{\sum_{i=1}^{5} x_{i}}{5}=\frac{156.3}{5}=31.26
$$
Of course, intuitively, we are likely to obtain a more accurate point estimate of the population mean by using all of the available sample information. The sum of all 50 mileages can be verified to be
$$
\sum_{i=1}^{50} x_{i}=x_{1}+x_{2}+\cdots+x_{50}=30.8+31.7+\cdots+31.4=1578
$$
Therefore, the mean of the sample of 50 mileages is
$$
\bar{x}=\frac{\sum_{i=1}^{50} x_{i}}{50}=\frac{1578}{50}=31.56
$$
This point estimate says we estimate that the mean mileage that would be obtained by all of the new midsize cars that will or could potentially be produced this year is $31.56 \mathrm{mpg}$. Unless we are extremely lucky, however, there will be sampling error. That is, the point estimate $\bar{x}=31.56$ mpg, which is the average of the sample of fifty randomly selected mileages, will probably not exactly equal the population mean $\mu$, which is the average mileage that would be obtained by all cars. Therefore, although $\bar{x}=31.56$ provides some evidence that $\mu$ is at least 31 and thus that the automaker should get the tax credit, it does not provide definitive evidence. In later chapters, we discuss how to assess the reliability of the sample mean and how to use a measure of reliability to decide whether sample information provides definitive evidence.
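As a hedged illustration of the two point estimates, the sketch below recomputes the mean of the first five mileages and the mean of all 50 mileages. Only the five listed values and the stated total of 1578 come from the text; the other 45 mileages in Table 3.1 are not reproduced here, so the full-sample mean is computed from the total. The variable names are illustrative.

```python
# First five mileages from Table 3.1, as listed in the text.
first_five = [30.8, 31.7, 30.1, 31.6, 32.1]
x_bar_5 = sum(first_five) / len(first_five)

# The text gives only the total of all 50 mileages, so the sample mean
# is computed from that total and not from the individual values.
total_50 = 1578.0
n = 50
x_bar_50 = total_50 / n

# Tax credit threshold stated in the text (mean of at least 31 mpg).
threshold_mpg = 31.0

print(round(x_bar_5, 2), round(x_bar_50, 2))
print(x_bar_50 >= threshold_mpg)
```

The final comparison only says that the point estimate exceeds 31 mpg. Because of sampling error, it does not by itself show that $\mu$ is at least 31.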

Comparing the mean, median, and mode

Often we construct a histogram for a sample to make inferences about the shape of the sampled population. When we do this, it can be useful to “smooth out” the histogram and use the resulting relative frequency curve to describe the shape of the population. Relative frequency curves can have many shapes. Three common shapes are illustrated in Figure $3.3$. Part (a) of this figure depicts a population described by a symmetrical relative frequency curve. For such a population, the mean $(\mu)$, median $\left(M_{d}\right)$, and mode $\left(M_{o}\right)$ are all equal. Note that in this case all three of these quantities are located under the highest point of the curve. It follows that when the frequency distribution of a sample of measurements is approximately symmetrical, then the sample mean, median, and mode will be nearly the same. For instance, consider the sample of 50 mileages in Table 3.1. Because the histogram of these mileages in Figure $3.2$ is approximately symmetrical, the mean (31.56) and the median (31.55) of the mileages are approximately equal to each other.

Figure $3.3$ (b) depicts a population that is skewed to the right. Here the population mean is larger than the population median, and the population median is larger than the population mode (the mode is located under the highest point of the relative frequency curve). In this case the population mean averages in the large values in the upper tail of the distribution. Thus the population mean is more affected by these large values than is the population median. To understand this, we consider the following example.
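The ordering described for a right-skewed distribution can be checked on a small data set. The values below are hypothetical (they are not from the text) and were chosen so that one large value sits in the upper tail. This is a sketch of the idea, not the example the text goes on to discuss.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: mostly small values plus one large value.
data = [1, 2, 2, 3, 4, 5, 20]

m, md, mo = mean(data), median(data), mode(data)
print(mo < md < m)  # mode < median < mean for this data set

# Dropping the large upper-tail value moves the mean much more than the median.
trimmed = data[:-1]
mean_shift = m - mean(trimmed)
median_shift = md - median(trimmed)
print(mean_shift > median_shift)
```

In this sketch the mean falls by more than two units when the value 20 is removed, while the median falls by half a unit, which is the sense in which the mean is more affected by large values.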

Statistical Modelling for Business | Misleading Graphs and Charts

Posted on April 18, 2022 by statistics-lab

Misleading Graphs and Charts (Optional)

The statistical analyst’s goal should be to present the most accurate and truthful portrayal of a data set that is possible. Such a presentation allows managers using the analysis to make informed decisions. However, it is possible to construct statistical summaries that are misleading. Although we do not advocate using misleading statistics, you should be aware of some of the ways statistical graphs and charts can be manipulated in order to distort the truth. By knowing what to look for, you can avoid being misled by a (we hope) small number of unscrupulous practitioners.

As an example, suppose that the nurses at a large hospital will soon vote on a proposal to join a union. Both the union organizers and the hospital administration plan to distribute recent salary statistics to the entire nursing staff. Suppose that the mean nurses’ salary at the hospital and the mean nurses’ salary increase at the hospital (expressed as a percentage) for each of the last four years are as follows:

The hospital administration does not want the nurses to unionize and, therefore, hopes to convince the nurses that substantial progress has been made to increase salaries without a union. On the other hand, the union organizers wish to portray the salary increases as minimal so that the nurses will feel the need to unionize.

Figure $2.29$ gives two bar charts of the mean nurses’ salaries at the hospital for each of the last four years. Notice that in Figure $2.29$ (a) the administration has started the vertical scale of the bar chart at a salary of $\$ 58,000$ by using a scale break. Alternatively, the chart could be set up without the scale break by simply starting the vertical scale at $\$ 58,000$. Starting the vertical scale at a value far above zero makes the salary increases look more dramatic. Notice that when the union organizers present the bar chart in Figure $2.29$ (b), which has a vertical scale starting at zero, the salary increases look far less impressive.
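The effect of the starting point of the vertical scale can be quantified by comparing drawn bar heights. In the sketch below the two salary figures are hypothetical, since the salary table is not reproduced in this excerpt. Only the $58,000 starting value comes from the text, and `drawn_height_ratio` is an illustrative helper name.

```python
def drawn_height_ratio(first, last, axis_start):
    """Ratio of the last bar's drawn height to the first bar's drawn height
    when the vertical axis starts at `axis_start`."""
    return (last - axis_start) / (first - axis_start)

# Hypothetical mean salaries for the first and fourth years (assumed values).
first_year, fourth_year = 60_000, 66_000

ratio_from_zero = drawn_height_ratio(first_year, fourth_year, axis_start=0)
ratio_truncated = drawn_height_ratio(first_year, fourth_year, axis_start=58_000)

print(ratio_from_zero)   # 1.1
print(ratio_truncated)   # 4.0
```

With these assumed salaries, a 10 percent increase is drawn as a bar four times as tall once the axis starts at $58,000, which is why the truncated chart looks more dramatic.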

Figure $2.30$ presents two bar charts of the mean nurses’ salary increases (in percentages) at the hospital for each of the last four years. In Figure $2.30$ (a), the administration has made the widths of the bars representing the percentage increases proportional to their heights. This makes the upward movement in the mean salary increases look more dramatic because the observer’s eye tends to compare the areas of the bars, while the improvements in the mean salary increases are really only proportional to the heights of the bars. When the union organizers present the bar chart of Figure $2.30$ (b), the improvements in the mean salary increases look less impressive because each bar has the same width.
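The area effect can be made concrete with a little arithmetic: if a bar's width is proportional to its height, its area grows with the square of the height. The percentage increases below are hypothetical, and `bar_area` is an illustrative helper name.

```python
def bar_area(height, width):
    """Area of a rectangular bar."""
    return height * width

# Hypothetical mean salary increases (in percent) for two years.
h_early, h_late = 3.0, 6.0

# Equal-width bars: area ratio equals the height ratio.
equal_width_ratio = bar_area(h_late, 1.0) / bar_area(h_early, 1.0)

# Widths proportional to heights: area ratio is the height ratio squared.
proportional_ratio = bar_area(h_late, h_late) / bar_area(h_early, h_early)

print(equal_width_ratio)   # 2.0
print(proportional_ratio)  # 4.0
```

Under these assumed values the increase doubles, but the bar with proportional width covers four times the area, so an eye that compares areas overstates the improvement.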

Figure $2.31$ gives two time series plots of the mean nurses’ salary increases at the hospital for the last four years. In Figure $2.31$ (a) the administration has stretched the vertical axis of the graph. That is, the vertical axis is set up so that the distances between the percentages are large. This makes the upward trend of the mean salary increases appear to be steep. In Figure $2.31$ (b) the union organizers have compressed the vertical axis (that is, the distances between the percentages are small). This makes the upward trend of the mean salary increases appear to be gradual. As we will see in the exercises, stretching and compressing the horizontal axis in a time series plot can also greatly affect the impression given by the plot.

It is also possible to create totally different interpretations of the same statistical summary by simply using different labeling or captions. For example, consider the bar chart of mean nurses’ salary increases in Figure 2.30(b). To create a favorable interpretation, the hospital administration might use the caption “Salary Increase Is Higher for the Fourth Year in a Row.” On the other hand, the union organizers might create a negative impression by using the caption “Salary Increase Fails to Reach $10 \%$ for Fourth Straight Year.”

In summary, it is important to carefully study any statistical summary so that you will not be misled. Look for manipulations such as stretched or compressed axes on graphs, axes that do not begin at zero, bar charts with bars of varying widths, and biased captions. Doing these things will help you to see the truth and to make well-informed decisions.

Descriptive Analytics

In Section $1.5$ we said that descriptive analytics uses traditional and more recently developed graphics to present to executives (and sometimes customers) easy-to-understand visual summaries of up-to-the-minute information concerning the operational status of a business. In this section we will discuss some of the more recently developed graphics used by descriptive analytics, which include gauges, bullet graphs, treemaps, and sparklines. In addition, we will see how they are used with each other and more traditional graphics to form analytic dashboards, which are part of executive information systems. We will also briefly discuss data discovery, which involves, in its simplest form, data drill down.

Dashboards and gauges. An analytic dashboard provides a graphical presentation of the current status and historical trends of a business’s key performance indicators. The term dashboard originates from the automotive dashboard, which helps a driver monitor a car’s key functions. Figure $2.35$ shows a dashboard that graphically portrays some key performance indicators for a (fictitious) airline for last year. In the lower left-hand portion of the dashboard we see three gauges depicting the percentage utilizations of the regional, short-haul, and international fleets of the airline. In general, a gauge (chart) allows us to visualize data in a way that is similar to a real-life speedometer needle on an automobile. The outer scale of the gauge is often color coded to provide additional performance information. For example, note that the colors on the outer scale of the gauges in Figure $2.35$ range from red to dark green to light green. These colors signify percentage fleet utilizations that range from poor (less than 75 percent) to satisfactory (less than 90 percent but at least 75 percent) to good (at least 90 percent).
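The color bands on the gauges amount to a simple threshold rule. The sketch below encodes the three ranges stated above; the function name `utilization_band` is an illustrative choice and not part of any dashboard product.

```python
def utilization_band(percent):
    """Map a fleet utilization percentage to the qualitative band
    described for the gauges in the text."""
    if percent < 75:
        return "poor"          # less than 75 percent
    if percent < 90:
        return "satisfactory"  # at least 75 but less than 90 percent
    return "good"              # at least 90 percent

for value in (74.9, 75, 89, 90):
    print(value, utilization_band(value))
```

A dashboard would use a rule of this kind to decide which segment of the gauge's outer scale the needle points into.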

Bullet graphs. While gauge charts are nice looking, they take up considerable space and to some extent are cluttered. For example, recalling the Disney Parks Case, using seven gauge charts to display the predicted waiting times for the seven Epcot rides would take up considerable space and would be less efficient than using the bullet graph that we originally showed in Figure 1.8. That bullet graph (which has been obtained using Excel) is shown again in Figure 2.36.

In general, a bullet graph features a single measure (for example, predicted waiting time) and displays it as a horizontal bar (or a vertical bar) that extends into ranges representing qualitative measures of performance, such as poor, satisfactory, and good. The ranges are displayed as either different colors or as varying intensities of a single hue. Using a single hue makes them discernible by those who are color blind and restricts the use of color if the bullet graph is part of a dashboard that we don’t want to look too “busy.” Many bullet graphs compare the single primary measure to a target, or objective, which is represented by a symbol on the bullet graph. The bullet graph of Disney’s predicted waiting times uses five colors ranging from dark green to red and signifying short (0 to 20 minutes) to very long (80 to 100 minutes) predicted waiting times. This bullet graph does not compare the predicted waiting times to an objective. However, the bullet graphs located in the upper left of the dashboard in Figure $2.35$ (representing the percentages of on-time arrivals and departures for the airline) do display objectives represented by short vertical black lines. For example, consider the bullet graphs representing the percentages of on-time arrivals and departures in the Midwest, which are shown below.

Sparklines and data discovery

Sparklines. Figure $2.38$ shows sparklines depicting the monthly closing prices of five stocks over six months. In general, a sparkline, another of the new descriptive analytics graphics, is a line chart that presents the general shape of the variation (usually over time) in some measurement, such as temperature or a stock price. A sparkline is typically drawn without axes or coordinates and made small enough to be embedded in text. Sparklines are often grouped together so that comparative variations can be seen. Whereas most charts are designed to show considerable information and are set off from the main text, sparklines are intended to be succinct and located where they are discussed. Therefore, if the sparklines in Figure $2.38$ were located in the text of a report comparing stocks, they would probably be made smaller than shown in Figure $2.38$, and the actual monthly closing prices of the stocks used to obtain the sparklines would probably not be shown next to the sparklines.

Data discovery. To conclude this section, we briefly discuss data discovery methods, which allow decision makers to interactively view data and make preliminary analyses. One simple version of data discovery is data drill down, which reveals more detailed data that underlie a higher-level summary. For example, an online presentation of the airline dashboard in Figure $2.35$ might allow airline executives to drill down by clicking the gauge which shows that approximately 89 percent of the airline’s international fleet was utilized over the year to see the bar chart in Figure 2.39. This bar chart depicts the quarterly percentage utilizations of the airline’s international fleet over the year. The bar chart suggests that the airline might offer international travel specials in quarter 1 (Winter) and quarter 4 (Fall) to increase international travel and thus increase the percentage of the international fleet utilized in those quarters. Drill down can be done at multiple levels. For example, Disney executives might wish to have weekly reports of total merchandise sales at its four Orlando parks, which can be drilled down to reveal total merchandise sales at each park, which can be further drilled down to reveal total merchandise sales at each “land” in each park, which can be yet further drilled down to reveal total merchandise sales in each store in each land in each park.
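Multi-level drill down can be sketched as the same aggregation repeated at increasing depth. The records, park names, and sales figures below are entirely hypothetical, and `drill_down` is an illustrative helper, not a feature of any particular dashboard tool.

```python
from collections import defaultdict

# Hypothetical weekly merchandise sales records: (park, land, store, sales).
records = [
    ("Park A", "Land 1", "Store x", 120.0),
    ("Park A", "Land 1", "Store y", 80.0),
    ("Park A", "Land 2", "Store z", 50.0),
    ("Park B", "Land 3", "Store w", 200.0),
]

def drill_down(records, depth):
    """Total sales grouped by the first `depth` levels of the hierarchy
    (0 = grand total, 1 = per park, 2 = per land, 3 = per store)."""
    totals = defaultdict(float)
    for *levels, sales in records:
        totals[tuple(levels[:depth])] += sales
    return dict(totals)

for depth in range(4):
    print(depth, drill_down(records, depth))
```

Each increase in `depth` reveals the more detailed totals that underlie the summary one level up, and the totals at every level add up to the same grand total.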

Statistical Modelling for Business | Contingency Tables (Optional)

Posted on April 18, 2022 by statistics-lab

The Brokerage Firm Case: Studying Client Satisfaction

Previous sections in this chapter have presented methods for summarizing data for a single variable. Often, however, we wish to use statistics to study possible relationships between several variables. In this section we present a simple way to study the relationship between two variables. Crosstabulation is a process that classifies data on two dimensions. This process results in a table that is called a contingency table. Such a table consists of rows and columns: the rows classify the data according to one dimension and the columns classify the data according to a second dimension. Together, the rows and columns represent all possibilities (or contingencies).

An investment broker sells several kinds of investment products: a stock fund, a bond fund, and a tax-deferred annuity. The broker wishes to study whether client satisfaction with its products and services depends on the type of investment product purchased. To do this, 100 of the broker’s clients are randomly selected from the population of clients who have purchased shares in exactly one of the funds. The broker records the fund type purchased by each client and has one of its investment counselors personally contact the client. When contacted, the client is asked to rate his or her level of satisfaction with the purchased fund as high, medium, or low. The resulting data are given in Table 2.16.

Looking at the raw data in Table $2.16$, it is difficult to see whether the level of client satisfaction varies depending on the fund type. We can look at the data in an organized way by constructing a contingency table. A crosstabulation of fund type versus level of client satisfaction is shown in Table $2.17$. The classification categories for the two variables are defined along the left and top margins of the table. The three row labels (bond fund, stock fund, and tax-deferred annuity) define the three fund categories and are given in the left table margin. The three column labels (high, medium, and low) define the three levels of client satisfaction and are given along the top table margin. Each row and column combination, that is, each fund type and level of satisfaction combination, defines what we call a “cell” in the table. Because each of the randomly selected clients has invested in exactly one fund type and has reported exactly one level of satisfaction, each client can be placed in a particular cell in the contingency table. For example, because client number 1 in Table $2.16$ has invested in the bond fund and reports a high level of client satisfaction, client number 1 can be placed in the upper left cell of the table (the cell defined by the Bond Fund row and High Satisfaction column).
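Placing each client in a cell is a counting exercise, which the sketch below illustrates with the standard library. Only the first record (bond fund, high satisfaction) corresponds to client number 1 in the text. The remaining records are hypothetical, because Table 2.16 is not reproduced in this excerpt.

```python
from collections import Counter

funds = ["bond", "stock", "annuity"]   # row labels
levels = ["high", "medium", "low"]     # column labels

# (fund type, satisfaction) per client. The first entry is client 1 from
# the text; the other five entries are made-up illustrative records.
clients = [
    ("bond", "high"),
    ("stock", "high"),
    ("annuity", "medium"),
    ("bond", "medium"),
    ("annuity", "low"),
    ("stock", "high"),
]

# Each client falls in exactly one cell, so counting pairs fills the table.
cells = Counter(clients)
table = [[cells[(fund, level)] for level in levels] for fund in funds]

for fund, row in zip(funds, table):
    print(fund, row)
```

Because every client contributes to exactly one cell, the cell counts always sum to the number of clients.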

Row percentages and column percentages

One good way to investigate relationships such as these is to compute row percentages and column percentages. We compute row percentages by dividing each cell’s frequency by its corresponding row total and by expressing the resulting fraction as a percentage. For instance, the row percentage for the upper left-hand cell (bond fund and high level of satisfaction) in Table $2.17$ is $(15 / 30) \times 100 \%=50 \%$. Similarly, column percentages are computed by dividing each cell’s frequency by its corresponding column total and by expressing the resulting fraction as a percentage. For example, the column percentage for the upper left-hand cell in Table $2.17$ is $(15 / 40) \times 100 \%=37.5 \%$. Table $2.18$ summarizes all of the row percentages for the different fund types in Table 2.17. We see that each row in Table $2.18$ gives a percentage frequency distribution of level of client satisfaction given a particular fund type.

For example, the first row in Table $2.18$ gives a percent frequency distribution of client satisfaction for investors who have purchased shares in the bond fund. We see that 50 percent of bond fund investors report high satisfaction, while 40 percent of these investors report medium satisfaction, and only 10 percent report low satisfaction. The other rows in Table $2.18$ provide percent frequency distributions of client satisfaction for stock fund and annuity purchasers.

All three percent frequency distributions of client satisfaction (for the bond fund, the stock fund, and the tax-deferred annuity) are illustrated using bar charts in Figure $2.23$. In this figure, the bar heights for each chart are the respective row percentages in Table $2.18$. For example, these distributions tell us that 80 percent of stock fund investors report high satisfaction, while $97.5$ percent of tax-deferred annuity purchasers report medium or low satisfaction. Looking at the entire table of row percentages (or the bar charts in Figure 2.23), we might conclude that stock fund investors are highly satisfied, that bond fund investors are quite satisfied (but somewhat less so than stock fund investors), and that tax-deferred annuity purchasers are less satisfied than either stock fund or bond fund investors. In general, row percentages and column percentages help us to quantify relationships such as these.
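The row and column percentage calculations can be sketched as follows. Table 2.17 is not reproduced in this excerpt, so the counts below are a reconstruction: the bond fund row, the row totals, and the "high" column follow from the percentages quoted in the text, while the medium/low split for the stock fund and the annuity is an assumption made only for illustration.

```python
levels = ["high", "medium", "low"]

# Cell counts by fund type. The bond row and the "high" column are implied
# by the quoted percentages; the stock and annuity medium/low splits are
# assumed values.
table = {
    "bond":    [15, 12, 3],
    "stock":   [24, 4, 2],
    "annuity": [1, 15, 24],
}

def row_percentages(table):
    """Each cell divided by its row total, as a percentage."""
    return {fund: [100 * c / sum(row) for c in row]
            for fund, row in table.items()}

def column_percentages(table):
    """Each cell divided by its column total, as a percentage."""
    col_totals = [sum(col) for col in zip(*table.values())]
    return {fund: [100 * c / t for c, t in zip(row, col_totals)]
            for fund, row in table.items()}

row_pct = row_percentages(table)
col_pct = column_percentages(table)

print(row_pct["bond"])        # row percentages for the bond fund
print(col_pct["bond"][0])     # column percentage for the upper left cell
```

Each row of `row_pct` sums to 100 and is the percent frequency distribution of satisfaction for that fund type, which is what the bar charts of row percentages display.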

In the investment example, we have cross-tabulated two qualitative variables. We can also cross-tabulate a quantitative variable versus a qualitative variable or two quantitative variables against each other. If we are cross-tabulating a quantitative variable, we often define categories by using appropriate ranges. For example, if we wished to cross-tabulate level of education (grade school, high school, college, graduate school) versus income, we might define income classes $\$ 0-\$ 50,000, \$ 50,001-\$ 100,000, \$ 100,001-\$ 150,000$, and above $\$ 150,000$.
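
Defining categories from ranges can be sketched as follows. The class boundaries are the income classes named above; the records, the function name `income_class`, and the variable names are hypothetical and serve only to illustrate the idea.

```python
import bisect
from collections import Counter

# Upper boundaries of the first three income classes named in the text.
UPPER_BOUNDS = [50_000, 100_000, 150_000]
LABELS = ["$0-$50,000", "$50,001-$100,000", "$100,001-$150,000", "above $150,000"]


def income_class(income):
    """Return the label of the income class that contains `income`."""
    # bisect_left counts the boundaries strictly below `income`, so an income
    # equal to a boundary stays in the class that the boundary closes.
    return LABELS[bisect.bisect_left(UPPER_BOUNDS, income)]


# Hypothetical (education level, income) records, for illustration only.
records = [
    ("high school", 42_000),
    ("college", 50_000),
    ("college", 50_001),
    ("graduate school", 150_000),
    ("graduate school", 210_000),
]

# Each (education level, income class) pair is one cell of the contingency table.
cells = Counter((education, income_class(income)) for education, income in records)
for (education, label), count in sorted(cells.items()):
    print(f"{education:16} {label:20} {count}")
```

Once the quantitative variable has been mapped to classes in this way, the cross-tabulation proceeds exactly as it does for two qualitative variables.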

Statistical Modelling for Business | Scatter Plots

We often study relationships between variables by using graphical methods. A simple graph that can be used to study the relationship between two variables is called a scatter plot. As an example, suppose that a marketing manager wishes to investigate the relationship between the sales volume (in thousands of units) of a product and the amount spent (in units of $\$ 10,000$ ) on advertising the product. To do this, the marketing manager randomly selects 10 sales regions having equal sales potential. The manager assigns a different level of advertising expenditure for January 2016 to each sales region as shown in Table 2.20. At the end of the month, the sales volume for each region is recorded as also shown in Table $2.20$.

A scatter plot of these data is given in Figure 2.24. To construct this plot, we place the variable advertising expenditure (denoted $x$) on the horizontal axis and we place the variable sales volume (denoted $y$) on the vertical axis. For the first sales region, advertising expenditure equals 5 and sales volume equals 89. We plot the point with coordinates $x=5$ and $y=89$ on the scatter plot to represent this sales region. Points for the other sales regions are plotted similarly. The scatter plot shows that there is a positive relationship between advertising expenditure and sales volume; that is, higher values of sales volume are associated with higher levels of advertising expenditure.
We have drawn a straight line through the plotted points of the scatter plot to represent the relationship between advertising expenditure and sales volume. We often do this when the relationship between two variables appears to be a straight line, or linear, relationship. Of course, the relationship between $x$ and $y$ in Figure 2.24 is not perfectly linear: not all of the points in the scatter plot are exactly on the line. Nevertheless, because the relationship between $x$ and $y$ appears to be approximately linear, it seems reasonable to represent the general relationship between these variables using a straight line. In future chapters we will explain ways to quantify such a relationship, that is, to describe it numerically. Moreover, not all linear relationships between two variables $x$ and $y$ are positive linear relationships (that is, have a positive slope). For example, Table 2.21 on the next page gives the average hourly outdoor temperature $(x)$ in a city during a week and the city's natural gas consumption $(y)$ during the week for each of the previous eight weeks. The temperature readings are expressed in degrees Fahrenheit and the natural gas consumptions are expressed in millions of cubic feet of natural gas. The scatter plot in Figure 2.25 shows that there is a negative linear relationship between $x$ and $y$; that is, as average hourly temperature in the city increases, the city's natural gas consumption decreases in a linear fashion. Finally, not all relationships are linear. In Chapter 14 we will consider how to represent and quantify curved relationships, and, as illustrated in Figure 2.26, there are situations in which two variables $x$ and $y$ do not appear to have any relationship.
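
As a rough illustration of positive and negative linear relationships, the sketch below fits a straight line to each data set by ordinary least squares and inspects the sign of the slope. Least squares is used here only as one convenient way to draw a line through the points; the text does not say how the line in Figure 2.24 was obtained. Apart from the first advertising point $(5, 89)$, all data values are hypothetical, as is the helper name `fit_line`.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Advertising expenditure (units of $10,000) and sales volume (thousands of units).
# Only the first pair (5, 89) comes from the text; the rest are made up.
advertising = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
sales = [89, 92, 97, 104, 101, 112, 115, 113, 124, 129]

# Average hourly temperature (degrees F) and gas consumption (millions of cubic feet).
# All eight pairs are made up for illustration.
temperature = [28, 32, 39, 46, 52, 58, 62, 66]
gas = [12.4, 11.8, 10.9, 9.6, 9.1, 8.3, 7.6, 7.0]

ad_slope, ad_intercept = fit_line(advertising, sales)
gas_slope, gas_intercept = fit_line(temperature, gas)

print("advertising vs. sales slope:", ad_slope)   # positive relationship
print("temperature vs. gas slope:", gas_slope)    # negative relationship
```

A positive slope corresponds to the upward-sloping pattern described for Figure 2.24, and a negative slope to the downward-sloping pattern described for Figure 2.25.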

To conclude this section, recall from Chapter 1 that a time series plot (also called a runs plot) is a plot of individual process measurements versus time. This implies that a time series plot is a scatter plot, where values of a process variable are plotted on the vertical axis versus corresponding values of time on the horizontal axis.



Statistical Modelling for Business | Cumulative distributions and ogives

Posted on April 18, 2022 by statistics-lab


Another way to summarize a distribution is to construct a cumulative distribution. To do this, we use the same number of classes, the same class lengths, and the same class boundaries that we have used for the frequency distribution of a data set. However, in order to construct a cumulative frequency distribution, we record for each class the number of measurements that are less than the upper boundary of the class. To illustrate this idea, Table 2.10 gives the cumulative frequency distribution of the payment time distribution summarized in Table 2.7 (page 64). Columns (1) and (2) in this table give the frequency distribution of the payment times. Column (3) gives the cumulative frequency for each class. To see how these values are obtained, the cumulative frequency for the class $10<13$ is the number of payment times less than 13. This is simply the frequency for the class $10<13$, which is 3. The cumulative frequency for the class $13<16$ is the number of payment times less than 16, which is obtained by adding the frequencies for the first two classes, that is, $3+14=17$. The cumulative frequency for the class $16<19$ is the number of payment times less than 19, that is, $3+14+23=40$. We see that, in general, a cumulative frequency is obtained by summing the frequencies of all classes representing values less than the upper boundary of the class.

Column (4) gives the cumulative relative frequency for each class, which is obtained by summing the relative frequencies of all classes representing values less than the upper boundary of the class. Or, more simply, this value can be found by dividing the cumulative frequency for the class by the total number of measurements in the data set. For instance, the cumulative relative frequency for the class $19<22$ is $52 / 65=.8$. Column (5) gives the cumulative percent frequency for each class, which is obtained by summing the percent frequencies of all classes representing values less than the upper boundary of the class. More simply, this value can be found by multiplying the cumulative relative frequency of a class by 100. For instance, the cumulative percent frequency for the class $19<22$ is $.8 \times 100=80$ percent.

As an example of interpreting Table 2.10, 60 of the 65 payment times are 24 days or less, or, equivalently, 92.31 percent of the payment times (a fraction of .9231 of the payment times) are 24 days or less. Also, notice that the last entry in the cumulative frequency distribution is the total number of measurements (here, 65 payment times). In addition, the last entry in the cumulative relative frequency distribution is 1.0 and the last entry in the cumulative percent frequency distribution is $100 \%$. In general, for any data set, these last entries will be, respectively, the total number of measurements, 1.0, and $100 \%$.
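
The three cumulative columns can be reproduced with a running sum. In the sketch below, the first five class frequencies (3, 14, 23, 12, 8) follow from the cumulative values quoted in the text (3, 17, 40, 52, 60) and the total of 65; how the remaining 5 payment times are split between the last two classes (4 and 1 here) is an assumption, as are the variable names.

```python
from itertools import accumulate

classes = ["10<13", "13<16", "16<19", "19<22", "22<25", "25<28", "28<31"]
# The last two frequencies (4 and 1) are assumed; only their sum (5) is implied.
freqs = [3, 14, 23, 12, 8, 4, 1]

total = sum(freqs)                         # 65 payment times
cum_freq = list(accumulate(freqs))         # running sum of class frequencies
cum_rel = [c / total for c in cum_freq]    # cumulative relative frequency
cum_pct = [100 * r for r in cum_rel]       # cumulative percent frequency

for label, f, cf, cr, cp in zip(classes, freqs, cum_freq, cum_rel, cum_pct):
    print(f"{label:6} {f:3d} {cf:3d} {cr:6.4f} {cp:6.2f}")
```

The final row of each cumulative column is the total count, 1.0, and 100 percent respectively, regardless of how the last two classes are split.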

An ogive (pronounced "oh-jive") is a graph of a cumulative distribution. To construct a frequency ogive, we plot a point above each upper class boundary at a height equal to the cumulative frequency of the class. We then connect the plotted points with line segments. A similar graph can be drawn using the cumulative relative frequencies or the cumulative percent frequencies. As an example, Figure 2.14 gives a percent frequency ogive of the payment times. Looking at this figure, we see that, for instance, a little more than 25 percent (actually, 26.15 percent according to Table 2.10) of the payment times are less than 16 days, while 80 percent of the payment times are less than 22 days. Also notice that we have completed the ogive by plotting an additional point at the lower boundary of the first (leftmost) class at a height equal to zero. This depicts the fact that none of the payment times is less than 10 days. Finally, the ogive graphically shows that all (100 percent) of the payment times are less than 31 days.
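
The points of a percent frequency ogive can be listed directly from the class boundaries and frequencies. This is a minimal sketch: the frequencies of the last two classes (4 and 1) are assumed, as in any reconstruction from the figures quoted in the text, and `ogive_points` is an illustrative name. Plotting is left out so that the example has no dependencies.

```python
from itertools import accumulate

first_lower_bound = 10
upper_bounds = [13, 16, 19, 22, 25, 28, 31]
freqs = [3, 14, 23, 12, 8, 4, 1]  # last two values assumed (only their sum is implied)

total = sum(freqs)
cum_pct = [100 * c / total for c in accumulate(freqs)]

# Start at height zero above the lower boundary of the first class, then place
# one point above each upper class boundary at the cumulative percent frequency.
ogive_points = [(first_lower_bound, 0.0)] + list(zip(upper_bounds, cum_pct))

for boundary, height in ogive_points:
    print(f"{boundary:2d} days: {height:6.2f}% of payment times are below this value")
```

Connecting consecutive points in `ogive_points` with line segments gives the ogive described above.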

Statistical Modelling for Business | Dot Plots

A very simple graph that can be used to summarize a data set is called a dot plot. To make a dot plot we draw a horizontal axis that spans the range of the measurements in the data set. We then place dots above the horizontal axis to represent the measurements. As an example, Figure 2.18(a) shows a dot plot of the exam scores in Table 2.8 (page 68). Remember, these are the scores for the first exam given before implementing a strict attendance policy. The horizontal axis spans exam scores from 30 to 100. Each dot above the axis represents an exam score. For instance, the two dots above the score of 90 tell us that two students received a 90 on the exam. The dot plot shows us that there are two concentrations of scores: those in the 80s and 90s and those in the 60s. Figure 2.18(b) gives a dot plot of the scores on the second exam (which was given after imposing the attendance policy). As did the percent frequency polygon for Exam 2 in Figure 2.13 (page 69), this second dot plot shows that the attendance policy eliminated the concentration of scores in the 60s.
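
A text-only version of a dot plot can be sketched as follows. To keep the code short, the plot is rotated: each distinct value gets its own row, with one dot per observation, rather than dots stacked above a horizontal axis. The scores are hypothetical (they are not the Table 2.8 data), and `dot_plot_rows` is an illustrative name.

```python
from collections import Counter

# Hypothetical exam scores, for illustration only.
scores = [32, 61, 63, 63, 65, 67, 81, 84, 86, 90, 90, 93, 95]


def dot_plot_rows(values):
    """Return one text row per distinct value, with one dot per observation."""
    counts = Counter(values)
    return [f"{value:>3} | " + "." * counts[value] for value in sorted(counts)]


rows = dot_plot_rows(scores)
for row in rows:
    print(row)
```

In this made-up data set the row for 90 carries two dots, and the isolated row for 32 stands apart from the others in the way an outlier would.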

Dot plots are useful for detecting outliers, which are unusually large or small observations that are well separated from the remaining observations. For example, the dot plot for exam 1 indicates that the score 32 seems unusually low. How we handle an outlier depends on its cause. If the outlier results from a measurement error or an error in recording or processing the data, it should be corrected. If such an outlier cannot be corrected, it should be discarded. If an outlier is not the result of an error in measuring or recording the data, its cause may reveal important information. For example, the outlying exam score of 32 convinced the author that the student needed a tutor. After working with a tutor, the student showed considerable improvement on Exam 2. A more precise way to detect outliers is presented in Section 3.3.

Statistical Modelling for Business | Back-to-back stem-and-leaf display

If we wish to compare two distributions, it is convenient to construct a back-to-back stem-and-leaf display. Figure 2.20 presents a back-to-back stem-and-leaf display for the previously discussed exam scores. The left side of the display summarizes the scores for the first exam. Remember, this exam was given before implementing a strict attendance policy. The right side of the display summarizes the scores for the second exam (which was given after imposing the attendance policy). Looking at the left side of the display, we see that for the first exam there are two concentrations of scores: those in the 80s and 90s and those in the 60s. The right side of the display shows that the attendance policy eliminated the concentration of scores in the 60s and illustrates that the scores on exam 2 are almost single peaked and somewhat skewed to the left.
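
One way to build such a display is sketched below for two-digit values, using the tens digit as the stem and the units digit as the leaf. The two score lists are hypothetical, and the function names `stems_and_leaves` and `back_to_back` are illustrative rather than taken from any particular package.

```python
from collections import defaultdict


def stems_and_leaves(values):
    """Group two-digit values by stem (tens digit); leaves are the units digits."""
    groups = defaultdict(list)
    for value in sorted(values):
        groups[value // 10].append(value % 10)
    return groups


def back_to_back(left_values, right_values):
    """Return the rows of a back-to-back stem-and-leaf display as strings."""
    left = stems_and_leaves(left_values)
    right = stems_and_leaves(right_values)
    stems = range(min(min(left), min(right)), max(max(left), max(right)) + 1)
    # Left-hand leaves are written in reverse so they grow away from the stem.
    left_text = {s: "".join(map(str, reversed(left.get(s, [])))) for s in stems}
    width = max(len(text) for text in left_text.values())
    return [
        f"{left_text[s]:>{width}} | {s} | " + "".join(map(str, right.get(s, [])))
        for s in stems
    ]


# Hypothetical scores for two exams, for illustration only.
exam1 = [32, 63, 65, 67, 81, 85, 90, 90, 94]
exam2 = [55, 73, 78, 84, 86, 91, 93, 97]

rows = back_to_back(exam1, exam2)
for row in rows:
    print(row)
```

Sharing one column of stems makes the two sets of leaves directly comparable row by row, which is the point of the back-to-back layout.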

Stem-and-leaf displays are useful for detecting outliers, which are unusually large or small observations that are well separated from the remaining observations. For example, the stem-and-leaf display for exam 1 indicates that the score 32 seems unusually low. How we handle an outlier depends on its cause. If the outlier results from a measurement error or an error in recording or processing the data, it should be corrected. If such an outlier cannot be corrected, it should be discarded. If an outlier is not the result of an error in measuring or recording the data, its cause may reveal important information. For example, the outlying exam score of 32 convinced the author that the student needed a tutor. After working with a tutor, the student showed considerable improvement on exam 2. A more precise way to detect outliers is presented in Section 3.3.



Statistical Modelling for Business | Constructing Frequency Distributions and Histograms

Posted on April 18, 2022 by statistics-lab


The procedure in the preceding box is not the only way to construct a histogram. Often, histograms are constructed more informally. For instance, it is not necessary to set the lower boundary of the first (leftmost) class equal to the smallest measurement in the data. As an example, suppose that we wish to form a histogram of the 50 gas mileages given in Table 1.7 (page 15). Examining the mileages, we see that the smallest mileage is 29.8 mpg and that the largest mileage is 33.3 mpg. Therefore, it would be convenient to begin the first (leftmost) class at 29.5 mpg and end the last (rightmost) class at 33.5 mpg. Further, it would be reasonable to use classes that are .5 mpg in length. We would then use 8 classes: $29.5<30$, $30<30.5$, $30.5<31$, $31<31.5$, $31.5<32$, $32<32.5$, $32.5<33$, and $33<33.5$. A histogram of the gas mileages employing these classes is shown in Figure 2.9.
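
The eight classes above can be generated, and measurements assigned to them, with a short sketch like the following. Each class is taken to include its lower boundary and exclude its upper boundary. Only the smallest (29.8) and largest (33.3) mileages come from the text; the other values, and all names in the code, are hypothetical.

```python
START, WIDTH, N_CLASSES = 29.5, 0.5, 8

# Class boundaries 29.5, 30.0, ..., 33.5; class i covers [bounds[i], bounds[i + 1]).
bounds = [START + i * WIDTH for i in range(N_CLASSES + 1)]


def class_index(value):
    """Index of the class whose lower boundary is at or below `value`."""
    return int((value - START) / WIDTH)


# Hypothetical mileages except for the minimum 29.8 and the maximum 33.3.
mileages = [29.8, 30.4, 30.9, 31.2, 31.4, 31.6, 31.7, 32.0, 32.3, 33.3]

freqs = [0] * N_CLASSES
for mileage in mileages:
    freqs[class_index(mileage)] += 1

for i, freq in enumerate(freqs):
    print(f"{bounds[i]:4.1f} < {bounds[i + 1]:4.1f}: {freq}")
```

Because the boundaries are multiples of .5, a value such as 32.0 falls cleanly into the class $32<32.5$ rather than the class below it.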

Sometimes it is desirable to let the nature of the problem determine the histogram classes. For example, to construct a histogram describing the ages of the residents in a city, it might be reasonable to use classes having 10-year lengths (that is, under 10 years, 10-19 years, 20-29 years, 30-39 years, and so on).

Notice that in our examples we have used classes having equal class lengths. In general, it is best to use equal class lengths whenever the raw data (that is, all the actual measurements) are available. However, sometimes histograms are formed with unequal class lengths, particularly when we are using published data as a source. Economic data and data in the social sciences are often published in the form of frequency distributions having unequal class lengths. Dealing with this kind of data is discussed in Exercises 2.26 and 2.27. Also discussed in these exercises is how to deal with open-ended classes. For example, if we are constructing a histogram describing the yearly incomes of U.S. households, an open-ended class could be households earning over $\$500,000$ per year.

As an alternative to constructing a frequency distribution and histogram by hand, we can use software packages such as Excel and Minitab. Each of these packages will automatically define histogram classes for the user. However, these automatically defined classes will not necessarily be the same as those that would be obtained using the manual method we have previously described. Furthermore, the packages define classes by using different methods. (Descriptions of how the classes are defined can often be found in help menus.) For example, Figure 2.10 gives a Minitab frequency histogram of the payment times in Table 2.4. Here, Minitab has defined 11 classes and has labeled 5 of the classes on the horizontal axis using midpoints (12, 16, 20, 24, 28). It is easy to see that the midpoints of the unlabeled classes are 10, 14, 18, 22, 26, and 30. Moreover, the boundaries of the first class are 9 and 11, the boundaries of the second class are 11 and 13, and so forth. Minitab counts frequencies as we have previously described. For instance, one payment time is at least 9 and less than 11, two payment times are at least 11 and less than 13, seven payment times are at least 13 and less than 15, and so forth.

Figure 2.11 gives an Excel frequency distribution and histogram of the bottle design ratings in Table 1.6 on page 14. Excel labels histogram classes using their upper class boundaries. For example, the first class has an upper class boundary equal to the smallest rating of 20 and contains only this smallest rating. The boundaries of the second class are 20 and 22, the boundaries of the third class are 22 and 24, and so forth. The last class corresponds to ratings more than 36. Excel's method for counting frequencies differs from that of Minitab (and, therefore, also differs from the way we counted frequencies by hand in Example 2.2). Excel assigns a frequency to a particular class by counting the number of measurements that are greater than the lower boundary of the class and less than or equal to the upper boundary of the class. For example, one bottle design rating is greater than 20 and less than or equal to (that is, at most) 22. Similarly, 15 bottle design ratings are greater than 32 and at most 34.
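
The difference between the two counting conventions only matters for measurements that fall exactly on a class boundary. The sketch below contrasts them using the boundaries 20, 22, ..., 36 described for Figure 2.11; it is not a reproduction of either package's code, and the function names are illustrative.

```python
import bisect

bounds = [20, 22, 24, 26, 28, 30, 32, 34, 36]


def left_closed_class(value):
    """Class [lower, upper): at least the lower boundary, less than the upper."""
    if not bounds[0] <= value < bounds[-1]:
        raise ValueError("value outside the illustrated classes")
    i = bisect.bisect_right(bounds, value) - 1
    return bounds[i], bounds[i + 1]


def right_closed_class(value):
    """Class (lower, upper]: greater than the lower boundary, at most the upper."""
    if not bounds[0] < value <= bounds[-1]:
        raise ValueError("value outside the illustrated classes")
    i = bisect.bisect_left(bounds, value)
    return bounds[i - 1], bounds[i]


for rating in (22, 23):
    print(rating, left_closed_class(rating), right_closed_class(rating))
```

A rating of 22 lands in the class from 22 to 24 under the left-closed convention (the one the text attributes to Minitab and to counting by hand) and in the class from 20 to 22 under the right-closed convention (the one attributed to Excel), while a rating of 23 lands in the same class under both.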

In Figure $2.10$ we have used Minitab to automatically form histogram classes. It is also possible to force software packages to form histogram classes that are defined by the user. We explain how to do this in the appendices at the end of this chapter. Because Excel does not always automatically define acceptable classes, the classes in Figure $2.11$ are a modification of Excel’s automatic classes. We also explain this modification in the appendices at the end of this chapter.

Statistical Modelling for Business | Some common distribution shapes

We often graph a frequency distribution in the form of a histogram in order to visualize the shape of the distribution. If we look at the histogram of payment times in Figure 2.10, we see that the right tail of the histogram is longer than the left tail. When a histogram has this general shape, we say that the distribution is skewed to the right. Here the long right tail tells us that a few of the payment times are somewhat longer than the rest. If we look at the histogram of bottle design ratings in Figure 2.11, we see that the left tail of the histogram is much longer than the right tail. When a histogram has this general shape, we say that the distribution is skewed to the left. Here the long tail to the left tells us that, while most of the bottle design ratings are concentrated above 25 or so, a few of the ratings are lower than the rest. Finally, looking at the histogram of gas mileages in Figure 2.9, we see that the right and left tails of the histogram appear to be mirror images of each other. When a histogram has this general shape, we say that the distribution is symmetrical. Moreover, the distribution of gas mileages appears to be piled up in the middle or mound shaped.

Mound-shaped, symmetrical distributions as well as distributions that are skewed to the right or left are commonly found in practice. For example, distributions of scores on standardized tests such as the SAT and ACT tend to be mound shaped and symmetrical, whereas distributions of scores on tests in college statistics courses might be skewed to the left: a few students don't study and get scores much lower than the rest. On the other hand, economic data such as income data are often skewed to the right: a few people have incomes much higher than most others. Many other distribution shapes are possible. For example, some distributions have two or more peaks; we will give an example of this distribution shape later in this section. It is often very useful to know the shape of a distribution. For example, knowing that the distribution of bottle design ratings is skewed to the left suggests that a few consumers may have noticed a problem with design that others didn't see. Further investigation into why these consumers gave the design low ratings might allow the company to improve the design.

Statistical Modelling for Business | Frequency polygons

Another graphical display that can be used to depict a frequency distribution is a frequency polygon. To construct this graphic, we plot a point above each class midpoint at a height equal to the frequency of the class (the height can also be the class relative frequency or class percent frequency if so desired). Then we connect the points with line segments. As we will demonstrate in the following example, this kind of graphic can be particularly useful when we wish to compare two or more distributions.

Table 2.8 lists (in increasing order) the scores earned on the first exam by the 40 students in a business statistics course taught by one of the authors several semesters ago. Figure 2.12 gives a percent frequency polygon for these exam scores. Because exam scores are often reported by using 10-point grade ranges (for instance, 80 to 90 percent), we have defined the following classes: $30<40$, $40<50$, $50<60$, $60<70$, $70<80$, $80<90$, and $90<100$. This is an example of letting the situation determine the classes of a frequency distribution, which is common practice when the situation naturally defines classes. The points that form the polygon have been plotted corresponding to the midpoints of the classes (35, 45, 55, 65, 75, 85, 95). Each point is plotted at a height that equals the percentage of exam scores in its class. For instance, because 10 of the 40 scores are at least 90 and less than 100, the plot point corresponding to the class midpoint 95 is plotted at a height of 25 percent.
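
The plotting points of a percent frequency polygon can be computed as in the sketch below. Only the total of 40 scores and the frequency of 10 in the class $90<100$ come from the text; the other class frequencies are hypothetical values chosen to sum to 40, and the variable names are illustrative.

```python
lower_bounds = [30, 40, 50, 60, 70, 80, 90]
CLASS_LENGTH = 10

# Class frequencies: only the last value (10) and the total (40) come from the text.
freqs = [2, 1, 2, 9, 5, 11, 10]
total = sum(freqs)

# Each point sits above the class midpoint at the class percent frequency.
midpoints = [lower + CLASS_LENGTH / 2 for lower in lower_bounds]
percents = [100 * f / total for f in freqs]
polygon_points = list(zip(midpoints, percents))

for midpoint, pct in polygon_points:
    print(f"midpoint {midpoint:4.0f}: {pct:5.1f}%")
```

Joining consecutive entries of `polygon_points` with line segments gives the polygon; computing a second list of points for another data set on the same classes allows the two distributions to be compared on one set of axes.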

Looking at Figure 2.12, we see that there is a concentration of scores in the 85 to 95 range and another concentration of scores around 65. In addition, the distribution of scores is somewhat skewed to the left: a few students had scores (in the 30s and 40s) that were quite a bit lower than the rest.

This is an example of a distribution having two peaks. When a distribution has multiple peaks, finding the reason for the different peaks often provides useful information. The reason for the two-peaked distribution of exam scores was that some students were not attending class regularly. Students who received scores in the 60s and below admitted that they were cutting class, whereas students who received higher scores were attending class on a regular basis.

After identifying the reason for the concentration of lower scores, the instructor established an attendance policy that forced students to attend every class: any student who missed a class was to be dropped from the course. Table 2.9 presents the scores on the second exam, given after the new attendance policy. Figure 2.13 presents (and allows us to compare) the percent frequency polygons for both exams. We see that the polygon for the second exam is single peaked: the attendance policy eliminated the concentration of scores in the 60s, although the scores are still somewhat skewed to the left.

金融中的随机方法代写

统计代写|商业分析作业代写Statistical Modelling for Business代考|Constructing Frequency Distributions and Histograms

前面框中的过程并不是构造直方图的唯一方法。通常，直方图的构造更为非正式。例如，没有必要将第一个（最左边）类的下边界设置为数据中的最小测量值。举个例子，假设我们希望形成表中给出的 50 次油耗的直方图1.7（第 15 页）。检查里程，我们看到最小的里程是29.8米pG并且最大的里程是33.3米pG. 因此，在开始第一节（最左边的）课时会很方便29.5米pG并在最后一个（最右边的）类结束33.5米pG. 此外，使用以下类是合理的.5米pG长度。然后我们将使用 8 个类：29.5<30,30<30.5,30.5<31,31<31.5,31.5<32,32<32.5,32.5<33，和33<33.5. 使用这些类别的汽油里程直方图如图所示2.9.

有时希望让问题的性质决定直方图的类别。例如，要构建一个描述城市居民年龄的直方图，使用具有 10 年长度的类（即 10 年以下，10−19年，20-29 岁，30−39年等等）。

请注意，在我们的示例中，我们使用了具有相等类长度的类。一般来说，只要有原始数据（即所有实际测量值）可用，最好使用相等的类长度。但是，有时直方图的类长度不相等，尤其是当我们使用已发布的数据作为源时。经济数据和社会科学数据通常以具有不等类长度的频率分布的形式发布。练习中讨论了处理此类数据2.26和2.27. 在这些练习中还讨论了如何处理开放式课程。例如，如果我们正在构建一个描述美国家庭年收入的直方图，那么开放式类别可能是家庭收入超过$500,000每年。

作为手动构建频率分布和直方图的替代方法，我们可以使用 Excel 和 Minitab 等软件包。这些包中的每一个都将自动为用户定义直方图类。但是，这些自动定义的类不一定与使用我们之前描述的手动方法获得的类相同。此外，包使用不同的方法定义类。（如何定义类的描述通常可以在帮助菜单中找到。）例如，图2.10

给出表中付款时间的 Minitab 频率直方图2.4. 在这里，Minitab 定义了 11 个类，并使用中点在水平轴上标记了 5 个类(12,16,20,24,28). 很容易看出，未标记类的中点是 10,14 ，18,22,26, 和 30 . 此外，第一类的边界是 9 和 11 ，第二类的边界是 11 和 13 ，以此类推。正如我们之前描述的，Minitab 会计算频率。例如，一个支付时间至少为 9 且小于 11 ，两次支付时间为至少 11 且小于 13 ，七次支付时间为至少 13 且小于 15 ，等等。

数字2.11给出了表格中瓶子设计评级的 Excel 频率分布和直方图1.6在第 14 页。 Excel 使用其上类边界标记直方图类。例如，第一类的上类边界等于最小评分 20，并且仅包含这个最小评分。第二类的边界是 20 和 22 ，第三类的边界是 22 和 24 ，以此类推。最后一类对应的评分超过 36。Excel 计算频率的方法与 Minitab 的不同（因此，也不同于我们在示例 2.2 中手动计算频率的方法）。Excel 通过计算大于类下界且小于或等于类上界的测量次数来为特定类分配频率。例如，一个瓶子设计等级大于 20 且小于或等于（即，最多） 22 。相似地，

如图2.10我们已经使用 Minitab 自动形成直方图类。也可以强制软件包形成用户定义的直方图类。我们在本章末尾的附录中解释了如何做到这一点。因为 Excel 并不总是自动定义可接受的类，所以图2.11是对 Excel 自动类的修改。我们还在本章末尾的附录中解释了这种修改。

统计代写|商业分析作业代写Statistical Modelling for Business代考|Some common distribution shapes

我们经常以直方图的形式绘制频率分布图，以便可视化分布的形状。如果我们查看图 2.10 中的支付时间直方图，我们会看到直方图的右尾比左尾长。当直方图具有这种一般形状时，我们说分布向右倾斜。在这里，长长的右尾告诉我们，一些支付时间比其他的要长一些。如果我们看一下图 2.11 中瓶子设计评级的直方图，我们会看到直方图的左尾比右尾长得多。当直方图具有这种一般形状时，我们说分布向左倾斜。这里左边的长尾巴告诉

我们认为，虽然大多数瓶子设计评分都集中在 25 左右，但也有少数评分低于其余部分。最后，看图中的油耗直方图2.9，我们看到直方图的左右尾看起来是彼此的镜像。当直方图具有这种一般形状时，我们说分布是对称的。此外，油耗分布呈中间堆积或土丘状。

在实践中通常会发现丘状、对称分布以及向右或向左倾斜的分布。例如，SAT 和 ACT 等标准化考试的分数分布往往呈土丘状且对称，而大学统计学课程的考试分数分布可能偏左——少数学生不学习并获得分数比其他人低得多。另一方面，收入数据等经济数据往往偏右——少数人的收入远高于其他大多数人。许多其他分布形状是可能的。例如，一些分布有两个或多个峰——我们将在本节后面给出一个这种分布形状的例子。了解分布的形状通常非常有用。例如，知道巴特尔设计评级的分布向左倾斜表明一些消费者可能已经注意到其他人没有看到的设计问题。进一步调查为什么这些消费者对设计的评价较低，可能会让公司改进设计。

统计代写|商业分析作业代写Statistical Modelling for Business代考|Frequency polygons

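Another graphical display that can be used to depict a frequency distribution is a frequency polygon. To construct this graphic, we plot a point above each class midpoint at a height equal to the frequency of the class; if desired, the height can instead be the class relative frequency or class percent frequency. Then we connect the points with line segments. As we will demonstrate in the following example, this kind of graphic can be particularly useful when we wish to compare two or more distributions.

Table 2.8 lists (in increasing order) the scores earned on the first exam by the 40 students in a business statistics course taught by one of the authors several semesters ago. Figure 2.12 gives a percent frequency polygon for these exam scores. Because exam scores are often reported by using 10-point grade ranges (for instance, 80 to 90 percent), we have defined the following classes: 30 < 40, 40 < 50, 50 < 60, 60 < 70, 70 < 80, 80 < 90, and 90 < 100. This is an example of letting the situation determine the classes of a frequency distribution, which is common practice when the situation naturally defines classes. The points that form the polygon have been plotted corresponding to the midpoints of the classes (35, 45, 55, 65, 75, 85, 95). Each point is plotted at a height that equals the percentage of exam scores in its class. For instance, because 10 of the 40 scores are at least 90 and less than 100, the plot point corresponding to the class midpoint 95 is plotted at a height of 25 percent.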
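The plotting coordinates for a percent frequency polygon can be sketched in a few lines. The snippet below uses only numbers stated in the text (the class boundaries, and the fact that 10 of the 40 scores fall in the 90 < 100 class); the function name `percent_frequency` is an assumption for illustration.

```python
# Class boundaries from the text: 30 < 40, 40 < 50, ..., 90 < 100.
boundaries = [30, 40, 50, 60, 70, 80, 90, 100]

# A class midpoint is the average of the class's lower and upper boundaries.
midpoints = [(lower + upper) / 2 for lower, upper in zip(boundaries[:-1], boundaries[1:])]


def percent_frequency(frequency, n):
    """Return the percentage of the n measurements that fall in a class."""
    return 100 * frequency / n


# 10 of the 40 exam scores are at least 90 and less than 100.
height_at_95 = percent_frequency(10, 40)

print(midpoints)
print(height_at_95)
```

Each polygon point is then the pair (midpoint, percent frequency); the other heights would require the class frequencies from Table 2.8, which are not reproduced here.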

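Looking at Figure 2.12, we see that there is a concentration of scores in the 85 to 95 range and another concentration of scores around 65. In addition, the distribution of scores is somewhat skewed to the left: a few students had scores (in the 30s and 40s) that were quite a bit lower than the rest.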

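This is an example of a distribution having two peaks. When a distribution has multiple peaks, finding the reason for the different peaks often provides useful information. The reason for the two-peaked distribution of exam scores was that some students were not attending class regularly. Students who received scores in the 60s and below admitted that they were cutting class, whereas students who received higher scores were attending class on a regular basis.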

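After identifying the reason for the concentration of lower scores, the instructor established an attendance policy that forced students to attend every class: any student who missed a class was to be dropped from the course. Table 2.9 presents the scores on the second exam, given after the new attendance policy took effect. Figure 2.13 presents (and allows us to compare) the percent frequency polygons for both exams. We see that the polygon for the second exam is single peaked: the attendance policy eliminated the concentration of scores in the 60s, although the scores are still somewhat skewed to the left.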

Graphically Summarizing Quantitative Data

Frequency distributions and histograms

Major consulting firms such as Accenture, Ernst & Young Consulting, and Deloitte & Touche Consulting employ statistical analysis to assess the effectiveness of the systems they design for their customers. In this case a consulting firm has developed an electronic billing system for a Hamilton, Ohio, trucking company. The system sends invoices electronically to each customer’s computer and allows customers to easily check and correct errors. It is hoped that the new billing system will substantially reduce the amount of time it takes customers to make payments. Typical payment times, measured from the date on an invoice to the date payment is received, had been 39 days or more under the trucking company’s old billing system. This exceeded the industry standard payment time of 30 days.
The new billing system does not automatically compute the payment time for each invoice because there is no continuing need for this information. Therefore, in order to assess the system’s effectiveness, the consulting firm selects a random sample of 65 invoices from the 7,823 invoices processed during the first three months of the new system’s operation. The payment times for the 65 sample invoices are manually determined and are given in Table 2.4. If this sample can be used to establish that the new billing system substantially reduces payment times, the consulting firm plans to market the system to other trucking companies.

Looking at the payment times in Table 2.4, we can see that the shortest payment time is 10 days and that the longest payment time is 29 days. Beyond that, it is pretty difficult to interpret the data in any meaningful way. To better understand the sample of 65 payment times, the consulting firm will form a frequency distribution of the data and will graph the distribution by constructing a histogram. Similar to the frequency distributions for qualitative data we studied in Section 2.1, the frequency distribution will divide the payment times into classes and will tell us how many of the payment times are in each class.

Form Nonoverlapping Classes of Equal Width

Step 3: Form Nonoverlapping Classes of Equal Width. We can form the classes of the frequency distribution by defining the boundaries of the classes. To find the first class boundary, we find the smallest payment time in Table 2.4, which is 10 days. This value is the lower boundary of the first class. Adding the class length of 3 to this lower boundary, we obtain 10 + 3 = 13, which is the upper boundary of the first class and the lower boundary of the second class. Similarly, the upper boundary of the second class and the lower boundary of the third class equals 13 + 3 = 16. Continuing in this fashion, the lower boundaries of the remaining classes are 19, 22, 25, and 28. Adding the class length 3 to the lower boundary of the last class gives us the upper boundary of the last class, 31. These boundaries define seven nonoverlapping classes for the frequency distribution. We summarize these classes in Table 2.6. For instance, the first class (10 days and less than 13 days) includes the payment times 10, 11, and 12 days; the second class (13 days and less than 16 days) includes the payment times 13, 14, and 15 days; and so forth. Notice that the largest observed payment time, 29 days, is contained in the last class. In cases where the largest measurement is not contained in the last class, we simply add another class. Generally speaking, the guidelines we have given for forming classes are not inflexible rules. Rather, they are intended to help us find reasonable classes. Finally, the method we have used for forming classes results in classes of equal length. Generally, forming classes of equal length will make it easier to appropriately interpret the frequency distribution.
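The boundary arithmetic in Step 3 can be sketched as follows. The smallest payment time (10), the class length (3), and the number of classes (7) come from the text; the function names `form_classes` and `find_class` are assumptions made for this illustration.

```python
def form_classes(smallest, class_length, num_classes):
    """Build (lower, upper) boundary pairs, starting at the smallest measurement."""
    lowers = [smallest + i * class_length for i in range(num_classes)]
    return [(lower, lower + class_length) for lower in lowers]


def find_class(value, classes):
    """Return the index of the class with lower <= value < upper, or None if no class fits."""
    for index, (lower, upper) in enumerate(classes):
        if lower <= value < upper:
            return index
    return None


# Values from the text: smallest payment time 10 days, class length 3, seven classes.
classes = form_classes(10, 3, 7)

for lower, upper in classes:
    print(f"{lower} days and less than {upper} days")

# The largest observed payment time, 29 days, should land in the last class (index 6).
print(find_class(29, classes))
```

If `find_class` returned `None` for the largest measurement, that would be the signal described in the text to add another class.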

Graph the Histogram

Step 5: Graph the Histogram. We can graphically portray the distribution of payment times by drawing a histogram. The histogram can be constructed using the frequency, relative frequency, or percent frequency distribution. To set up the histogram, we draw rectangles that correspond to the classes. The base of the rectangle corresponding to a class represents the payment times in the class. The height of the rectangle can represent the class frequency, relative frequency, or percent frequency.

We have drawn a frequency histogram of the 65 payment times in Figure 2.7. The first (leftmost) rectangle, or “bar,” of the histogram represents the payment times 10, 11, and 12. Looking at Figure 2.7, we see that the base of this rectangle is drawn from the lower boundary (10) of the first class in the frequency distribution of payment times to the lower boundary (13) of the second class. The height of this rectangle tells us that the frequency of the first class is 3. The second histogram rectangle represents payment times 13, 14, and 15. Its base is drawn from the lower boundary (13) of the second class to the lower boundary (16) of the third class, and its height tells us that the frequency of the second class is 14. The other histogram bars are constructed similarly. Notice that there are no gaps between the adjacent rectangles in the histogram. Here, although the payment times have been recorded to the nearest whole day, the fact that the histogram bars touch each other emphasizes that a payment time could (in theory) be any number on the horizontal axis. In general, histograms are drawn so that adjacent bars touch each other.

Looking at the frequency distribution in Table 2.7 and the frequency histogram in Figure 2.7, we can describe the payment times:
1 None of the payment times exceeds the industry standard of 30 days. (Actually, all of the payment times are less than 30; remember that the largest payment time is 29 days.)
2 The payment times are concentrated between 13 and 24 days (57 of the 65, or $(57 / 65) \times 100 = 87.69\%$, of the payment times are in this range).
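The percentage quoted in item 2 can be checked with a short calculation. This is only an arithmetic check using the two counts stated in the text; the variable names are chosen for illustration.

```python
# Counts from the text: 57 of the 65 sampled payment times fall between 13 and 24 days.
in_range = 57
sample_size = 65

# Percent frequency = (count in range / sample size) * 100, rounded to two decimals.
percent_in_range = round(100 * in_range / sample_size, 2)

print(percent_in_range)  # 87.69
```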

Graphically Summarizing

Studying Pizza Preferences by Using a Frequency Distribution

Part 1: Studying Pizza Preferences by Using a Frequency Distribution. Unfortunately, the raw data in Table 2.1 do not reveal much useful information about the pattern of pizza preferences. In order to summarize the data in a more useful way, we can construct a frequency distribution. To do this we simply count the number of times each of the six pizza restaurants appears in Table 2.1. We find that Bruno’s appears 8 times, Domino’s appears 2 times, Little Caesars appears 9 times, Papa John’s appears 19 times, Pizza Hut appears 4 times, and Will’s Uptown Pizza appears 8 times. The frequency distribution for the pizza preferences is given in Table 2.2 on the next page: a list of each of the six restaurants along with their corresponding counts (or frequencies). The frequency distribution shows us how the preferences are distributed among the six restaurants. The purpose of the frequency distribution is to make the data easier to understand. Certainly, looking at the frequency distribution in Table 2.2 is more informative than looking at the raw data in Table 2.1. We see that Papa John’s is the most popular restaurant, and that Papa John’s is roughly twice as popular as each of the next three runners-up: Bruno’s, Little Caesars, and Will’s. Finally, Pizza Hut and Domino’s are the least preferred restaurants.

When we wish to summarize the proportion (or fraction) of items in each class, we employ the relative frequency for each class. If the data set consists of $n$ observations, we define the relative frequency of a class as follows:
$$
\text{Relative frequency of a class} = \frac{\text{frequency of the class}}{n}
$$
This quantity is simply the fraction of items in the class. Further, we can obtain the percent frequency of a class by multiplying the relative frequency by 100.

Table 2.3 gives a relative frequency distribution and a percent frequency distribution of the pizza preference data. A relative frequency distribution is a table that lists the relative frequency for each class, and a percent frequency distribution lists the percent frequency for each class. Looking at Table 2.3, we see that the relative frequency for Bruno’s pizza is $8/50 = .16$ and that (from the percent frequency distribution) 16% of the sampled students preferred Bruno’s pizza. Similarly, the relative frequency for Papa John’s pizza is $19/50 = .38$ and 38% of the sampled students preferred Papa John’s pizza. Finally, the sum of the relative frequencies in the relative frequency distribution equals 1.0, and the sum of the percent frequencies in the percent frequency distribution equals 100%. These facts are true for all relative frequency and percent frequency distributions.
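As a minimal sketch, the relative and percent frequency distributions can be computed directly from the six restaurant counts given in the text. The dictionary and variable names are assumptions chosen for illustration.

```python
import math

# Frequencies stated in the text for the 50 sampled students.
frequencies = {
    "Bruno's": 8,
    "Domino's": 2,
    "Little Caesars": 9,
    "Papa John's": 19,
    "Pizza Hut": 4,
    "Will's Uptown Pizza": 8,
}

n = sum(frequencies.values())  # total number of observations

# Relative frequency = frequency of the class / n.
relative = {name: count / n for name, count in frequencies.items()}

# Percent frequency = relative frequency * 100 (computed from the counts to limit rounding).
percent = {name: 100 * count / n for name, count in frequencies.items()}

for name in frequencies:
    print(f"{name:20s} {frequencies[name]:3d} {relative[name]:.2f} {percent[name]:.0f}%")

# The relative frequencies should sum to 1.0 and the percent frequencies to 100.
print(math.isclose(sum(relative.values()), 1.0), math.isclose(sum(percent.values()), 100.0))
```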

Studying Pizza Preferences by Using Bar Charts and Pie Charts

Part 2: Studying Pizza Preferences by Using Bar Charts and Pie Charts. A bar chart is a graphic that depicts a frequency, relative frequency, or percent frequency distribution. For example, Figure 2.1 gives an Excel bar chart of the pizza preference data. On the horizontal axis we have placed a label for each class (restaurant), while the vertical axis measures frequencies. To construct the bar chart, Excel draws a bar (of fixed width) corresponding to each class label.

Each bar is drawn so that its height equals the frequency corresponding to its label. Because the height of each bar is a frequency, we refer to Figure 2.1 as a frequency bar chart. Notice that there are gaps between the bars. When data are qualitative, the bars should always be separated by gaps in order to indicate that each class is separate from the others. The bar chart in Figure 2.1 clearly illustrates that, for example, Papa John’s pizza is preferred by more sampled students than any other restaurant and Domino’s pizza is least preferred by the sampled students.

If desired, the bar heights can represent relative frequencies or percent frequencies. For instance, Figure 2.2 is a Minitab percent bar chart for the pizza preference data. Here the heights of the bars are the percentages given in the percent frequency distribution of Table 2.3. Lastly, the bars in Figures 2.1 and 2.2 have been positioned vertically. Because of this, these bar charts are called vertical bar charts. However, sometimes bar charts are constructed with horizontal bars and are called horizontal bar charts.

A pie chart is another graphic that can be used to depict a frequency distribution. When constructing a pie chart, we first draw a circle to represent the entire data set. We then divide the circle into sectors or “pie slices” based on the relative frequencies of the classes. For example, remembering that a circle consists of 360 degrees, Bruno’s Pizza (which has relative frequency .16) is assigned a pie slice that consists of $.16(360) = 57.6$ degrees. Similarly, Papa John’s Pizza (with relative frequency .38) is assigned a pie slice having $.38(360) = 136.8$ degrees. The resulting pie chart (constructed using Excel) is shown in Figure 2.3 on the next page. Here we have labeled the pie slices using the percent frequencies. The pie slices can also be labeled using frequencies or relative frequencies.
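The slice angles follow from the same counts used for the frequency distribution. The sketch below converts each frequency into degrees of a 360-degree circle; the names `frequencies` and `slice_degrees` are assumptions for illustration.

```python
import math

# Frequencies stated earlier in the text for the 50 sampled students.
frequencies = {
    "Bruno's": 8,
    "Domino's": 2,
    "Little Caesars": 9,
    "Papa John's": 19,
    "Pizza Hut": 4,
    "Will's Uptown Pizza": 8,
}

n = sum(frequencies.values())

# Slice angle = relative frequency * 360 degrees.
slice_degrees = {name: 360 * count / n for name, count in frequencies.items()}

for name, degrees in slice_degrees.items():
    print(f"{name:20s} {degrees:6.1f} degrees")

# All slices together should cover the full circle.
print(math.isclose(sum(slice_degrees.values()), 360.0))
```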

The Pareto chart (Optional)

Pareto charts are used to help identify important quality problems and opportunities for process improvement. By using these charts we can prioritize problem-solving activities. The Pareto chart is named for Vilfredo Pareto (1848-1923), an Italian economist. Pareto suggested that, in many economies, most of the wealth is held by a small minority of the population. It has been found that the “Pareto principle” often applies to defects. That is, only a few defect types account for most of a product’s quality problems.

To illustrate the use of Pareto charts, suppose that a jelly producer wishes to evaluate the labels being placed on 16-ounce jars of grape jelly. Every day for two weeks, all defective labels found on inspection are classified by type of defect. If a label has more than one defect, the type of defect that is most noticeable is recorded. The Excel output in Figure 2.4 presents the frequencies and percentages of the types of defects observed over the two-week period.
In general, the first step in setting up a Pareto chart summarizing data concerning types of defects (or categories) is to construct a frequency table like the one in Figure 2.4. Defects or categories should be listed at the left of the table in decreasing order by frequencies: the defect with the highest frequency will be at the top of the table, the defect with the second-highest frequency below the first, and so forth. If an “other” category is employed, it should be placed at the bottom of the table. The “other” category should not make up 50 percent or more of the total of the frequencies, and the frequency for the “other” category should not exceed the frequency for the defect at the top of the table. If the frequency for the “other” category is too high, data should be collected so that the “other” category can be broken down into new categories. Once the frequency and the percentage for each category are determined, a cumulative percentage for each category is computed. As illustrated in Figure 2.4, the cumulative percentage for a particular category is the sum of the percentages corresponding to the particular category and the categories that are above that category in the table.
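The table-building steps above can be sketched as follows. The defect names and counts are hypothetical, invented for this example (they are not the values in Figure 2.4), and `pareto_table` is an assumed helper name.

```python
def pareto_table(counts, other_label="Other"):
    """Return rows of (category, frequency, percent, cumulative percent).

    Categories are sorted by decreasing frequency, with the "other" category
    (if present) forced to the bottom, as the text recommends.
    """
    total = sum(counts.values())
    named = sorted(
        ((label, freq) for label, freq in counts.items() if label != other_label),
        key=lambda item: item[1],
        reverse=True,
    )
    rows = named + ([(other_label, counts[other_label])] if other_label in counts else [])

    table = []
    running = 0
    for label, freq in rows:
        running += freq  # accumulate counts, then convert, to avoid rounding drift
        table.append((label, freq, 100 * freq / total, 100 * running / total))
    return table


# Hypothetical label-defect counts, made up for illustration only.
defects = {
    "Crooked label": 78,
    "Missing label": 45,
    "Other": 8,
    "Wrinkled label": 14,
    "Printing error": 33,
    "Loose label": 22,
}

table = pareto_table(defects)
for label, freq, pct, cum_pct in table:
    print(f"{label:15s} {freq:4d} {pct:6.1f}% {cum_pct:6.1f}%")
```

The cumulative column is what the Pareto chart plots on top of the bars; in this made-up data the top two categories already account for 61.5 percent of the defects.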

Excel, MegaStat, and Minitab for Statistics

In this book we use three types of software to carry out statistical analysis: Excel 2013, MegaStat, and Minitab 17. Excel is, of course, a general purpose electronic spreadsheet program and analytical tool. The Analysis ToolPak in Excel includes many procedures for performing various kinds of basic statistical analyses. MegaStat is an add-in package that is specifically designed for performing statistical analysis in the Excel spreadsheet environment. Minitab is a computer package designed expressly for conducting statistical analysis. It is widely used at many colleges and universities and in a large number of business organizations. The principal advantage of Excel is that, because of its broad acceptance among students and professionals as a multipurpose analytical tool, it is both well-known and widely available. The advantages of a special-purpose statistical software package like Minitab are that it provides a far wider range of statistical procedures and it offers the experienced analyst a range of options to better control the analysis. The advantages of MegaStat include (1) its ability to perform a number of statistical calculations that are not automatically done by the procedures in the Excel ToolPak and (2) features that make it easier to use than Excel for a wide variety of statistical analyses. In addition, the output obtained by using MegaStat is automatically placed in a standard Excel spreadsheet and can be edited by using any of the features in Excel. MegaStat can be copied from the book’s website. Excel, MegaStat, and Minitab, through built-in functions, programming languages, and macros, offer almost limitless power. Here, we will limit our attention to procedures that are easily accessible via menus without resorting to any special programming or advanced features.

Commonly used features of Excel 2013, MegaStat, and Minitab 17 are presented in this chapter along with an initial application: the construction of a time series plot of the gas mileages in Table 1.7. You will find that the limited instructions included here, along with the built-in help features of all three software packages, will serve as a starting point from which you can discover a variety of other procedures and options. Much more detailed descriptions of Minitab 17 can be found in other sources, in particular in the manual Getting Started with Minitab 17. This manual is available as a pdf file, viewable using Adobe Acrobat Reader, on the Minitab Inc. website; go to http://www.minitab.com/en-us/support/documentation/ to download the manual. This manual is also available online at http://support.minitab.com/en-us/minitab/17/getting-started/. Similarly, there are a number of alternative reference materials for Microsoft Excel 2013. Of course, an understanding of the related statistical concepts is essential to the effective use of any statistical software package.

Getting Started with Excel

Because Excel 2013 may be new to some readers, and because the Excel 2013 window looks somewhat different from previous versions of Excel, we will begin by describing some characteristics of the Excel 2013 window. Versions of Excel prior to 2007 employed many drop-down menus. This meant that many features were “hidden” from the user, which resulted in a steep learning curve for beginners. Beginning with Excel 2007, Microsoft tried to reduce the number of features that are hidden in drop-down menus. Therefore, Excel 2013 displays all of the applicable commands needed for a particular type of task at the top of the Excel window. These commands are represented by a tab-and-group arrangement called the ribbon (see the right side of the illustration of an Excel 2013 window on the next page). The commands displayed in the ribbon are regulated by a series of tabs located near the top of the ribbon. For example, in the illustration, the Home tab is selected. If we selected a different tab, say, for example, the Page Layout tab, the commands displayed by the ribbon would be different.
We now briefly describe some basic features of the Excel 2013 window:
1 File button: By clicking on this button, the user obtains a menu of often-used commands, for example, Open, Save, Print, and so forth. This menu also provides access to a large number of Excel options settings.
2 Tabs: Clicking on a tab results in a ribbon display of features, commands, and options related to a particular type of task. For example, when the Home tab is selected (as in the figure), the features, commands, and options displayed by the ribbon are all related to making entries into the Excel worksheet. As another example, if the Formulas tab is selected, all of the features, commands, and options displayed in the ribbon relate to using formulas in the Excel worksheet.
3 Quick access toolbar: This toolbar displays buttons that provide shortcuts to often-used commands. Initially, this toolbar displays Save, Undo, and Redo buttons. The user can customize this toolbar by adding shortcut buttons for other commands (such as New, Open, Quick Print, and so forth). This can be done by clicking on the arrow button directly to the right of the Quick Access toolbar and by making selections from the “Customize” drop-down menu that appears.

Select Home : Format : Row Height

This notation indicates that we first select the Home tab on the ribbon, then we select Format from the Cells Group on the ribbon, and finally we select Row Height from the Format drop-down menu.

For many of the statistical and graphical procedures in Excel, it is necessary to provide a range of cells to specify the location of data in the spreadsheet. Generally, the range may be specified either by typing the cell locations directly into a dialog box or by dragging the selected range with the mouse. Although it is usually easier for the experienced user to select a range with the mouse, the instructions that follow will, for precision and clarity, specify ranges by typing in cell locations. The selected range may include column or variable labels, that is, labels at the tops of columns that serve to identify variables. When the selected range includes such labels, it is important to select the “Labels” check box in the analysis dialog box.

More about Surveys and Errors in Survey

Statistical Modelling for Business | Types of survey questions

Survey instruments can use dichotomous (“yes or no”), multiple-choice, or open-ended questions. Each type of question has its benefits and drawbacks. Dichotomous questions are usually clearly stated, can be answered quickly, and yield data that are easily analyzed. However, the information gathered may be limited by this two-option format. If we limit voters to expressing support or disapproval for stem-cell research, we may not learn the nuanced reasoning that voters use in weighing the merits and moral issues involved. Similarly, in today’s heterogeneous world, it would be unusual to use a dichotomous question to categorize a person’s religious preferences. Asking whether respondents are Christian or non-Christian (or using any other two categories, such as Jewish or non-Jewish, or Muslim or non-Muslim) is certain to make some people feel their religion is being slighted. In addition, this is a crude and unenlightening way to learn about religious preferences.

Multiple-choice questions can assume several different forms. Sometimes respondents are asked to choose a response from a list (for example, possible answers to the religion question could be Jewish; Christian; Muslim; Hindu; Agnostic; or Other). Other times, respondents are asked to choose an answer from a numerical range. We could ask the question:
“In your opinion, how important are SAT scores to a college student’s success?”
Not important at all $1 \quad 2 \quad 3 \quad 4 \quad 5$ Extremely important
These numerical responses are usually summarized and reported in terms of the average response, whose size tells us something about the perceived importance. The Zagat restaurant survey (www.zagat.com) asks diners to rate restaurants’ food, décor, and service, each on a scale of 1 to 30 points, with a 30 representing an incredible level of satisfaction. Although the Zagat scale has an unusually wide range of possible ratings, the concept is the same as in the more common 5-point scale.
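As a small illustration of summarizing scaled responses by their average, the sketch below computes the mean rating and a frequency count for a 1-to-5 question. The ten responses are made-up data, not results from any survey discussed here.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses to a 1-to-5 importance question (made-up data).
responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 1]

average_response = mean(responses)  # arithmetic mean of the ratings
counts = Counter(responses)         # how many times each rating was chosen

print(f"Average response: {average_response:.1f}")
for rating in range(1, 6):
    print(f"  rating {rating}: {counts[rating]} response(s)")
```

Reporting the frequency of each rating alongside the average shows whether the average reflects broad agreement or a split between low and high ratings.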

Open-ended questions typically provide the most honest and complete information because there are no suggested answers to divert or bias a person’s response. This kind of question is often found on instructor evaluation forms distributed at the end of a college course. College students at Georgetown University are asked the open-ended question, “What comments would you give to the instructor?” The responses provide the instructor feedback that may be missing from the initial part of the teaching evaluation survey, which consists of numerical multiple-choice ratings of various aspects of the course. While these numerical ratings can be used to compare instructors and courses, there are no easy comparisons of the diverse responses instructors receive to the open-ended question. In fact, these responses are often seen only by the instructor and are useful, constructive tools for the teacher despite the fact they cannot be readily summarized.

Survey questionnaires must be carefully constructed so they do not inadvertently bias the results. Because survey design is such a difficult and sensitive process, it is not uncommon for a pilot survey to be taken before a lot of time, effort, and financing go into collecting a large amount of data. Pilot surveys are similar to the beta version of a new electronic product; they are tested out with a smaller group of people to work out the “kinks” before being used on a larger scale. Determination of the sample size for the final survey is an important process for many reasons. If the sample size is too large, resources may be wasted during the data collection. On the other hand, not collecting enough data for a meaningful analysis will obviously be detrimental to the study. Fortunately, there are several formulas that will help decide how large a sample should be, depending on the goal of the study and various other factors.
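The text does not say which sample size formulas it has in mind. One widely used formula, shown here only as an illustration, gives the sample size needed to estimate a population proportion to within a margin of error $E$ at a given confidence level: $n = z^2 p(1-p)/E^2$, rounded up, where $z$ is the standard normal quantile for the confidence level and $p$ is a planning value for the proportion ($p = 0.5$ is the conservative choice). The function name and defaults below are assumptions made for this sketch.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence=0.95, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin_of_error.

    Uses n = z^2 * p * (1 - p) / E^2, rounded up. p = 0.5 gives the largest
    (most conservative) n. Assumes simple random sampling from a large population.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

for e in (0.05, 0.03):
    print(f"margin of error {e:.0%}: n = {sample_size_for_proportion(e)}")
```

Because the margin of error is squared in the denominator, halving it roughly quadruples the required sample size. This is why over-specifying precision can waste resources during data collection.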

Statistical Modelling for Business | Types of surveys

There are several different survey types, and we will explore just a few of them. The phone survey is particularly well-known (and often despised). A phone survey is inexpensive and usually conducted by callers who have very little training. Because of this and the impersonal nature of the medium, the respondent may misunderstand some of the questions. A further drawback is that some people cannot be reached and that others may refuse to answer some or all of the questions. Phone surveys are thus particularly prone to have a low response rate.
The response rate is the proportion of all people whom we attempt to contact that actually respond to a survey. A low response rate can destroy the validity of a survey’s results.
It can be difficult to collect good data from unsolicited phone calls because many of us resent the interruption. The calls often come at inopportune times, intruding on a meal or arriving just when we have climbed a ladder with a full can of paint. No wonder we may fantasize about turning the tables on the callers and calling them when it is least convenient.

Numerous complaints have been filed with the Federal Trade Commission (FTC) about the glut of marketing and survey telephone calls to private residences. The National Do Not Call Registry was created as the culmination of a comprehensive, three-year review of the Telemarketing Sales Rule (TSR) (www.ftc.gov/donotcall/). This legislation allows people to enroll their phone numbers on a website so as to prevent most marketers from calling them.
Self-administered surveys, or mail surveys, are also very inexpensive to conduct. However, these also have their drawbacks. Often, recipients will choose not to reply unless they receive some kind of financial incentive or other reward. Generally, after an initial mailing, the response rate will fall between 20 and 30 percent. Response rates can be raised with successive follow-up reminders, and after three contacts, they might reach between 65 and 75 percent. Unfortunately, the entire process can take significantly longer than a phone survey would.

Web-based surveys have become increasingly popular, but they suffer from the same problems as mail surveys. In addition, as with phone surveys, respondents may record their true reactions incorrectly because they have misunderstood some of the questions posed.
A personal interview provides more control over the survey process. People selected for interviews are more likely to respond because the questions are being asked by someone face-to-face. Questions are less likely to be misunderstood because the people conducting the interviews are typically trained employees who can clear up any confusion arising during the process. On the other hand, interviewers can potentially “lead” a respondent by body language which signals approval or disapproval of certain sorts of answers. They can also prompt certain replies by providing too much information. Mall surveys are examples of personal interviews. Interviewers approach shoppers as they pass by and ask them to answer the survey questions. Response rates around 50 percent are typical. Personal interviews are more costly than mail or phone surveys. Obviously, the objective of the study will be important in deciding upon the survey type employed.

Statistical Modelling for Business | Errors of observation

As discussed in Section 1.4, the opinions of those who bother to complete a voluntary response survey may be dramatically different from those who do not. (Recall the Ann Landers question about having children.) The viewer voting on the popular television show American Idol is another illustration of selection bias, because only those who are interested in the outcome of the show will bother to phone in or text message their votes. The results of the voting are not representative of the performance ratings the country would give as a whole.
Errors of observation occur when data values are recorded incorrectly. Such errors can be caused by the data collector (the interviewer), the survey instrument, the respondent, or the data collection process. For instance, the manner in which a question is asked can influence the response. Or, the order in which questions appear on a questionnaire can influence the survey results. Or, the data collection method (telephone interview, questionnaire, personal interview, or direct observation) can influence the results. A recording error occurs when either the respondent or interviewer incorrectly marks an answer. Once data are collected from a survey, the results are often entered into a computer for statistical analysis. When transferring data from a survey form to a spreadsheet program like Excel, Minitab, or MegaStat, there is potential for entering them incorrectly. Before the survey is administered, the questions need to be very carefully worded so that there is little chance of misinterpretation. A poorly framed question might yield results that lead to unwarranted decisions. Scaled questions are particularly susceptible to this type of error. Consider the question “How would you rate this course?” Without a proper explanation, the respondent may not know whether “1” or “5” is the best.

If the survey instrument contains highly sensitive questions and respondents feel compelled to answer, they may not tell the truth. This is especially true in personal interviews. We then have what is called response bias. A surprising number of people are reluctant to be candid about what they like to read or watch on television. People tend to overreport “good” activities like reading respected newspapers and underreport their “bad” activities like delighting in the National Enquirer’s stories of alien abductions and celebrity meltdowns. Imagine, then, the difficulty in getting honest answers about people’s gambling habits, drug use, or sexual histories. Response bias can also occur when respondents are asked slanted questions whose wording influences the answer received. For example, consider the following question:
Which of the following best describes your views on gun control?
1 The government should take away our guns, leaving us defenseless against heavily armed criminals.
2 We have the right to keep and bear arms.
This question is biased toward eliciting a response against gun control.


Statistical Modelling for Business | Predictive analytics, data mining

Posted on April 18, 2022 by statistics-lab

Statistical Modelling for Business | Prescriptive analytics

Predictive analytics are methods used to find anomalies, patterns, and associations in data sets, with the purpose of predicting future outcomes. Predictive analytics and data mining are terms that are sometimes used together, but data mining might more specifically be defined to be the use of predictive analytics, computer science algorithms, and information systems techniques to extract useful knowledge from huge amounts of data. It is estimated that for any data mining project, approximately 65 percent to 90 percent of the time is spent in data preparation: checking, correcting, reconciling inconsistencies in, and otherwise “cleaning” the data. Also, whereas predictive analytics methods might be most useful to decision makers when used with data mining, these methods can also be important, as we will see, when analyzing smaller data sets. Prescriptive analytics looks at internal and external variables and constraints, along with the predictions obtained from predictive analytics, to recommend one or more courses of action. In this book, other than intuitively using predictions from predictive analytics to suggest business improvement courses of action, we will not discuss prescriptive analytics. Therefore, returning to predictive analytics, we can roughly classify the applications of predictive analytics into six categories:
Anomaly (outlier) detection: In a data set, predictive analytics can be used to get a picture of what the data tends to look like in a typical case and to determine if an observation is notably different (or outlying) from this pattern. For example, a sales manager could model the sales results of typical salespeople and use anomaly detection to identify specific salespeople who have unusually high or low sales results. Or the IRS could model typical tax returns and use anomaly detection to identify specific returns that are extremely atypical for review and possible audit.
Association learning: This involves identifying items that tend to co-occur and finding the rules that describe their co-occurrence. For example, a supermarket chain once found that men who buy baby diapers on Thursdays also tend to buy beer on Thursdays (possibly in anticipation of watching sports on television over the weekend). This led the chain to display beer near the baby aisle in its stores. As another example, Netflix might find that customers who rent fictional dramas also tend to rent historical documentaries or that some customers will rent almost any type of movie that stars a particular actor or actress. Disney might find that visitors who spend more time at the Magic Kingdom also tend to buy Disney cartoon character clothing. Disney might also find that visitors who stay in more luxurious Disney hotels also tend to play golf on Disney courses and take cruises on the Disney Cruise Line. These types of findings are used for targeting coupons, deals, or advertising to the right potential customers.
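The two categories above can be illustrated with a minimal sketch. It flags outlying salespeople with a simple z-score rule and scores a "diapers implies beer" rule by its support and confidence. The data, the threshold of 2 standard deviations, and the function names are all assumptions made for illustration; real anomaly detection and association learning tools use more robust methods.

```python
from statistics import mean, stdev

# --- Anomaly detection: hypothetical quarterly sales in $1000s (made-up data) ---
sales = {"Ana": 102, "Ben": 98, "Cal": 105, "Dee": 97, "Eli": 101,
         "Fay": 99, "Gus": 103, "Hal": 100, "Ivy": 180, "Jon": 96}

def flag_outliers(values, threshold=2.0):
    """Return {name: z-score} for observations more than `threshold`
    sample standard deviations away from the mean."""
    m = mean(values.values())
    s = stdev(values.values())
    return {name: (v - m) / s for name, v in values.items()
            if abs(v - m) / s > threshold}

# --- Association learning: hypothetical shopping baskets (made-up data) ---
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Among baskets containing `antecedent`, the fraction also containing `consequent`."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

print("Outlying salespeople:", sorted(flag_outliers(sales)))
print("support(diapers, beer):", support({"diapers", "beer"}, baskets))
print("confidence(diapers -> beer):", confidence({"diapers"}, {"beer"}, baskets))
```

A single extreme value inflates the mean and standard deviation it is judged against, so in practice a robust rule (for example, one based on the median) may be preferable to the plain z-score used here.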

Statistical Modelling for Business | Ratio, Interval, Ordinal

In Section 1.1 we said that a variable is quantitative if its possible values are numbers that represent quantities (that is, “how much” or “how many”). In general, a quantitative variable is measured on a scale having a fixed unit of measurement between its possible values. For example, if we measure employees’ salaries to the nearest dollar, then one dollar is the fixed unit of measurement between different employees’ salaries. There are two types of quantitative variables: ratio and interval. A ratio variable is a quantitative variable measured on a scale such that ratios of its values are meaningful and there is an inherently defined zero value. Variables such as salary, height, weight, time, and distance are ratio variables. For example, a distance of zero miles is “no distance at all,” and a town that is 30 miles away is “twice as far” as a town that is 15 miles away.

An interval variable is a quantitative variable where ratios of its values are not meaningful and there is not an inherently defined zero value. Temperature (on the Fahrenheit scale) is an interval variable. For example, zero degrees Fahrenheit does not represent “no heat at all,” just that it is very cold. Thus, there is no inherently defined zero value. Furthermore, ratios of temperatures are not meaningful. For example, it makes no sense to say that $60^{\circ}$ is twice as warm as $30^{\circ}$. In practice, there are very few interval variables other than temperature. Almost all quantitative variables are ratio variables.
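One way to see why ratios are meaningful for distance but not for Fahrenheit temperature is to change the unit of measurement and check whether the ratio survives. The sketch below is only an illustration of that idea: it uses the standard Fahrenheit-to-Celsius and mile-to-kilometer conversions, and the variable names are chosen for this example.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

MILES_TO_KM = 1.609344  # one mile in kilometers

# Interval scale: the ratio depends on where the (arbitrary) zero sits.
warm_f, cool_f = 60.0, 30.0
ratio_fahrenheit = warm_f / cool_f
ratio_celsius = fahrenheit_to_celsius(warm_f) / fahrenheit_to_celsius(cool_f)

# Ratio scale: zero miles and zero kilometers both mean "no distance at all",
# so the ratio is the same in either unit.
far_mi, near_mi = 30.0, 15.0
ratio_miles = far_mi / near_mi
ratio_km = (far_mi * MILES_TO_KM) / (near_mi * MILES_TO_KM)

print(f"Temperature ratio in Fahrenheit: {ratio_fahrenheit:.2f}")
print(f"Temperature ratio in Celsius:    {ratio_celsius:.2f}")
print(f"Distance ratio in miles:         {ratio_miles:.2f}")
print(f"Distance ratio in kilometers:    {ratio_km:.2f}")
```

The same two temperatures give a ratio of 2 in Fahrenheit and a negative ratio in Celsius, because 30 degrees Fahrenheit is below freezing. The distance ratio stays at 2 in both units.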

In Section 1.1 we also said that if we simply record into which of several categories a population (or sample) unit falls, then the variable is qualitative (or categorical). There are two types of qualitative variables: ordinal and nominative. An ordinal variable is a qualitative variable for which there is a meaningful ordering, or ranking, of the categories. The measurements of an ordinal variable may be nonnumerical or numerical. For example, a student may be asked to rate the teaching effectiveness of a college professor as excellent, good, average, poor, or unsatisfactory. Here, one category is higher than the next one; that is, “excellent” is a higher rating than “good,” “good” is a higher rating than “average,” and so on. Therefore, teaching effectiveness is an ordinal variable having nonnumerical measurements. On the other hand, if (as is often done) we substitute the numbers 4, 3, 2, 1, and 0 for the ratings excellent through unsatisfactory, then teaching effectiveness is an ordinal variable having numerical measurements.

In practice, both numbers and associated words are often presented to respondents asked to rate a person or item. When numbers are used, statisticians debate whether the ordinal variable is “somewhat quantitative.” For example, statisticians who claim that teaching effectiveness rated as 4, 3, 2, 1, or 0 is not somewhat quantitative argue that the difference between 4 (excellent) and 3 (good) may not be the same as the difference between 3 (good) and 2 (average). Other statisticians argue that as soon as respondents (students) see equally spaced numbers (even though the numbers are described by words), their responses are affected enough to make the variable (teaching effectiveness) somewhat quantitative. Generally speaking, the specific words associated with the numbers probably substantially affect whether an ordinal variable may be considered somewhat quantitative. It is important to note, however, that in practice numerical ordinal ratings are often analyzed as though they are quantitative. Specifically, various arithmetic operations (as discussed in Chapters 2 through 18) are often performed on numerical ordinal ratings. For example, a professor’s teaching effectiveness average and a student’s grade point average are calculated.

To conclude this section, we consider the second type of qualitative variable. A nominative variable is a qualitative variable for which there is no meaningful ordering, or ranking, of the categories. A person’s gender, the color of a car, and an employee’s state of residence are nominative variables.

Statistical Modelling for Business | Stratified Random

It is wise to stratify when the population consists of two or more groups that differ with respect to the variable of interest. For instance, consumers could be divided into strata based on gender, age, ethnic group, or income.

As an example, suppose that a department store chain proposes to open a new store in a location that would serve customers who live in a geographical region that consists of (1) an industrial city, (2) a suburban community, and (3) a rural area. In order to assess the potential profitability of the proposed store, the chain wishes to study the incomes of all households in the region. In addition, the chain wishes to estimate the proportion and the total number of households whose members would be likely to shop at the store. The department store chain feels that the industrial city, the suburban community, and the rural area differ with respect to income and the store’s potential desirability. Therefore, it uses these subpopulations as strata and takes a stratified random sample.

Taking a stratified sample can be advantageous because such a sample takes advantage of the fact that elements in the same stratum are similar to each other. It follows that a stratified sample can provide more accurate information than a random sample of the same size. As a simple example, if all of the elements in each stratum were exactly the same, then examining only one element in each stratum would allow us to describe the entire population. Furthermore, stratification can make a sample easier (or possible) to select. Recall that, in order to take a random sample, we must have a list, or frame, of all of the population elements. Although a frame might not exist for the overall population, a frame might exist for each stratum. For example, suppose nearly all the households in the department store’s geographical region have telephones. Although there might not be a telephone directory for the overall geographical region, there might be separate telephone directories for the industrial city, the suburb, and the rural area. For more discussion of stratified random sampling, see Mendenhall, Schaeffer, and Ott (1986).
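A stratified random sample of the kind described for the department store example might be drawn as in the sketch below. The household frames, the 5 percent sampling fraction, and the use of proportional allocation (the same fraction from every stratum) are assumptions made for illustration; the text does not specify how many households to take from each stratum.

```python
import random

def stratified_sample(strata, fraction, seed=0):
    """Draw a simple random sample of the given fraction from each stratum.

    `strata` maps a stratum name to its frame (a list of population units).
    Proportional allocation is assumed: every stratum is sampled at the same rate.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    sample = {}
    for name, frame in strata.items():
        k = max(1, round(len(frame) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(frame, k)       # sampling without replacement
    return sample

# Hypothetical frames, e.g. built from separate telephone directories.
strata = {
    "industrial city": [f"city-household-{i}" for i in range(600)],
    "suburb": [f"suburb-household-{i}" for i in range(300)],
    "rural area": [f"rural-household-{i}" for i in range(100)],
}

sample = stratified_sample(strata, fraction=0.05)
for name, households in sample.items():
    print(f"{name}: {len(households)} of {len(strata[name])} households sampled")
```

Each stratum is sampled from its own frame, so no single list covering the whole region is needed.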
Sometimes it is advantageous to select a sample in stages. This is a common practice when selecting a sample from a very large geographical region. In such a case, a frame often does not exist. For instance, there is no single list of all registered voters in the United States. There is also no single list of all households in the United States. In this kind of situation, we can use multistage cluster sampling. To illustrate this procedure, suppose we wish to take a sample of registered voters from all registered voters in the United States. We might proceed as follows:
Stage 1: Randomly select a sample of counties from all of the counties in the United States.
Stage 2: Randomly select a sample of townships from each county selected in Stage 1.
Stage 3: Randomly select a sample of voting precincts from each township selected in Stage 2.
Stage 4: Randomly select a sample of registered voters from each voting precinct selected in Stage 3.
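The four stages above can be sketched in code as nested random selections. The frame here is a small synthetic hierarchy built only for illustration, and the number of units selected at each stage is an arbitrary assumption. Note that a frame is needed only for the units selected at the previous stage, never for the whole country.

```python
import random

def build_frame(n_counties=6, n_townships=4, n_precincts=3, n_voters=20):
    """Build a hypothetical nested frame: county -> township -> precinct -> voters."""
    return {
        f"county-{c}": {
            f"township-{c}.{t}": {
                f"precinct-{c}.{t}.{p}": [
                    f"voter-{c}.{t}.{p}.{v}" for v in range(n_voters)
                ]
                for p in range(n_precincts)
            }
            for t in range(n_townships)
        }
        for c in range(n_counties)
    }

def multistage_sample(frame, k_counties, k_townships, k_precincts, k_voters, seed=0):
    """Select voters in four stages, sampling without replacement at each stage."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    voters = []
    for county in rng.sample(sorted(frame), k_counties):                 # Stage 1
        townships = frame[county]
        for township in rng.sample(sorted(townships), k_townships):      # Stage 2
            precincts = townships[township]
            for precinct in rng.sample(sorted(precincts), k_precincts):  # Stage 3
                voters.extend(rng.sample(precincts[precinct], k_voters)) # Stage 4
    return voters

frame = build_frame()
sample = multistage_sample(frame, k_counties=2, k_townships=2, k_precincts=2, k_voters=5)
print(f"Sampled {len(sample)} voters")
```

With 2 counties, 2 townships per county, 2 precincts per township, and 5 voters per precinct, the sample contains 2 × 2 × 2 × 5 = 40 voters.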
