Statistical Modelling for Business | MNGT5232

Posted on December 14, 2022 by statistics-lab

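Business analytics uses data analysis and statistical methods to examine a company's past business performance, and then uses the results to forecast and guide future business strategy.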

Statistical Modelling for Business | The Data Component

A common starting point for a discussion about data is that they are facts. There is a huge philosophical literature on what a fact is. As noted by Mulligan and Correia (2020), facts are the opposite of theories and values and “are to be distinguished from things, in particular from complex objects, complexes and wholes, and from relations.” Without getting into this philosophy of facts, I will hold that a fact is a checkable or provable entity and, therefore, true. For example, it is true that Washington, D.C. is the capital of the United States: it is easily checkable and can be shown to be true. It is also a fact that $1+1=2$. This is checkable by simply counting one finger on your left hand and one finger on your right hand.

You could have a lot of facts on a topic but they are of little value if they are not

organized,
subsetted,
manipulated, and
interpreted
in a meaningful way to provide insight for a recommendation for an action, the action being the problem solution. Otherwise, the facts are just a collection of valueless things. Their value stems from what you can do with them.

Organizing data, or facts, is a first step in any analytical process and in the drive for information. This could involve arranging them in chronological order (e.g., by date and time of a transaction), spatial order (e.g., countries in the Northern and Southern Hemispheres), alphanumeric order, size order, and so on in an endless number of ways. Transactions data, for example, are facts about units sold of a series of products, including what products were sold, who bought them, when they were sold, the amount sold, and prices. They are typically maintained in a file without a discernible order: just product, date, and units. There is no insight or intelligence in these data as stored. In fact, they are somewhat randomly organized based on when orders were placed. If sorted by product and date, however, they are organized and more useful, though not by much. The best organization is the one most applicable to the practical problem at hand.

In Data Science, statistics, econometrics, machine learning, and other quantitative areas, a common organizational form is a rectangular array consisting of rows and columns: the rows are typically objects and the columns are variables or features. An object can be a person (e.g., a customer) or an event (e.g., a transaction). The words object, case, individual, event, observation, and instance are often used interchangeably, and I will do the same. For the methods considered in this book, each row is an individual case: one case per row.
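
As a minimal sketch of the organizing step described above, the following Python snippet sorts a handful of transaction records by product and then by date. The records, field names, and values are invented for illustration; they are not taken from the book.

```python
from datetime import date

# Hypothetical transaction records: one case (a transaction) per row,
# one variable (feature) per key. All values are invented for illustration.
transactions = [
    {"product": "B", "date": date(2022, 3, 2), "units": 5},
    {"product": "A", "date": date(2022, 3, 5), "units": 2},
    {"product": "B", "date": date(2022, 3, 1), "units": 7},
    {"product": "A", "date": date(2022, 3, 1), "units": 4},
]

# Organize the facts: sort by product first, then chronologically within product.
organized = sorted(transactions, key=lambda row: (row["product"], row["date"]))

for row in organized:
    print(row["product"], row["date"].isoformat(), row["units"])
```

Sorting on a tuple key is one simple way to impose a two-level order; whether product-then-date is the right organization depends, as the text says, on the practical problem.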

Statistical Modelling for Business | The Extractor Component

Finally, you have to apply some methods or procedures to your DataFrame to extract information. Refer back to Fig. 1.2 for the role and position of an Extractor function in the information chain. This whole book is concerned with these methods. The interpretation of the results to give meaning to the information will be illustrated as I develop and discuss the methods, but the final interpretation is up to you based on your problem, your domain knowledge, and your expertise.

Due to the size and complexity of modern business data sets, the amount and type of information hidden inside them is large, to say the least. There is no one piece of information (no one size fits all) for all business problems. The same data set can be used for multiple problems and can yield multiple types of information. The possibilities are endless. The information content, however, is a function of the size and complexity of the DataFrame you eventually work with. The size is the number of data elements. Since a DataFrame is a rectangular array, the size is #rows $\times$ #columns elements, and both dimensions are given by its shape attribute. Shape is expressed as a tuple written as (rows, columns). For example, it could be $(5,2)$ for a DataFrame with 5 rows and 2 columns, and so 10 elements. The complexity is the types of data in the DataFrame and is difficult to quantify except by counting types. They could be floating point numbers (or, simply, floats), integers (or ints), Booleans, text strings (referred to as objects), datetime values, and more. The larger and more complex the DataFrame, the more information you can extract. Let $I$ = information, $S$ = size, and $C$ = complexity. Then $I=f(S, C)$ with $\partial I / \partial S>0$ and $\partial I / \partial C>0$. For a very large, complex DataFrame, there is a very large amount of information.
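
The size and complexity measures just described can be illustrated with a short pandas sketch. The DataFrame, its column names, and its values are hypothetical, and counting distinct dtypes is only one crude proxy for complexity, assumed here for illustration.

```python
import pandas as pd

# Hypothetical 5-row, 2-column DataFrame; names and values are invented.
df = pd.DataFrame(
    {
        "customer": ["a", "b", "c", "d", "e"],   # text strings
        "sales": [10.5, 20.0, 7.25, 3.0, 12.0],  # floats
    }
)

rows, columns = df.shape        # shape is a (rows, columns) tuple
n_elements = rows * columns     # size = #rows x #columns
n_types = df.dtypes.nunique()   # crude complexity proxy: number of distinct dtypes

print(df.shape, n_elements, df.size, n_types)
```

The `size` attribute returns the same element count as multiplying the two entries of `shape`.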

The cost of extracting information increases directly with the DataFrame’s size and complexity. If I have 10 sales values, then my data set is small and simple. Minimal information, such as the mean, standard deviation, and range, can be extracted. The cost of extraction is low; just a hand-held calculator is needed. If I have 10 GB of data, then more can be done but at a greater cost. For data sizes approaching exabytes, the costs are monumental.
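
For the small case mentioned above, the extraction really is calculator-sized. The sketch below computes the mean, sample standard deviation, and range for ten sales values using Python's standard library; the ten values are made up for illustration.

```python
import statistics

# Ten hypothetical sales values (invented for illustration).
sales = [12, 15, 9, 20, 14, 11, 18, 16, 13, 22]

mean_sales = statistics.mean(sales)       # arithmetic mean
std_sales = statistics.stdev(sales)       # sample standard deviation (n - 1 divisor)
range_sales = max(sales) - min(sales)     # range = largest minus smallest

print(mean_sales, round(std_sales, 3), range_sales)
```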

There could be an infinite amount of information, even contradictory information, in a large and complex DataFrame. So, when something is extracted, you have to check its accuracy. For example, suppose you extract information that classifies customers by risk of default on extended credit. This particular classification may not be a good or correct one; that is, the predictive classifier (PC) may not be the best one for determining whether someone is a credit risk or not. Predictive Error Analysis (PEA) is needed to determine how well the PC worked. I discuss this in Chap. 11. In that discussion, I will use a distinction between a training data set and a testing data set for building the classifier and testing it. This means the entire DataFrame will not, and should not, be used for a particular problem. It should be divided into the two parts, although that division is not always clear, or even feasible. I will discuss the split into training and testing data sets in Chap. 9.

The complexity of the DataFrame is, as I mentioned above, dependent on the types of data. Generally speaking, there are two forms: text and numeric. Other forms such as images and audio are possible but I will restrict myself to these two forms. I have to discuss these data types so that you know the possibilities and their importance and implications. How the two are handled within a DataFrame and with what statistical, econometric, and machine learning tools for the extraction of information is my focus in this book and so I will deal with them in depth in succeeding chapters. I will first discuss text data and then numeric data in the next two subsections.
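
The division into training and testing data sets mentioned earlier in this section can be sketched in a few lines. This is not the procedure from Chap. 9, which is not reproduced here; it is a minimal random split under assumed settings (a 25% test fraction and a fixed seed), and the function name `train_test_split` is a hypothetical helper defined in the snippet itself.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of `rows` and return (training, testing) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


cases = list(range(20))                     # stand-ins for 20 DataFrame rows
train, test = train_test_split(cases)
print(len(train), len(test))
```

Every case lands in exactly one of the two parts, so the classifier is never tested on a row it was built from.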

Statistical Modelling for Business | STAT4600

Posted on December 14, 2022 by statistics-lab

Statistical Modelling for Business | Uncertainty vs. Risk

Uncertainty is a fact of life reflecting our lack of knowledge. It is either spatial (“I don’t know what is happening in Congress today.”) or temporal (“I don’t know what will happen to sales next year.”). In either case, the lack of knowledge is about the state of the world (SOW): what is happening in Congress and what will happen next year. Business textbooks such as Freund and Williams (1969), Spurr and Bonini (1968), and Hildebrand et al. (2005) typically discuss assigning a probability to each of the different SOWs that you could list. The purpose of these probabilities is to enable you to say something about the world before that something materializes. Somehow, and it is never explained how, you assign numeric values representing outcomes, or payoffs, to the SOWs. The probabilities and associated payoffs are used to calculate an expected or average payoff over all the possible SOWs.

Consider, for example, the rate of return on an investment (ROI) in a capital expansion project. The ROI might depend on the average annual growth of real GDP for the next 5 years. Suppose the real GDP growth is simply expressed as declining (i.e., a recession), flat (0%), slow (1%-2%), and robust (>2%) with assigned probabilities of 0.05, 0.20, 0.50, and 0.25, respectively. These form a probability distribution. Let $p_i$ be the probability that state $i$ is realized. Then $\sum_{i=1}^n p_i=1.0$ for these $n=4$ possible states. I show the SOWs, probabilities, and ROI values in Table 1.1. The expected ROI is $\sum_{i=1}^4 p_i \times \mathrm{ROI}_i=2.15 \%$. This is the amount expected to be earned on average over the next 5 years.
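
The expected-payoff calculation above is a probability-weighted sum, which the following sketch illustrates. The four probabilities come from the text; the ROI values are hypothetical placeholders because Table 1.1 is not reproduced here, so the result is not expected to match the 2.15% reported above.

```python
# State-of-the-world (SOW) probabilities as given in the text.
probabilities = {"recession": 0.05, "flat": 0.20, "slow": 0.50, "robust": 0.25}

# HYPOTHETICAL payoffs (ROI in percent) for each SOW; these are NOT the
# Table 1.1 values, which are not available in this excerpt.
roi = {"recession": -1.0, "flat": 0.5, "slow": 2.0, "robust": 4.0}

# A probability distribution must sum to 1 (allowing for rounding error).
total_probability = sum(probabilities.values())
if abs(total_probability - 1.0) > 1e-9:
    raise ValueError("probabilities must sum to 1")

# Expected ROI = sum over states of p_i * ROI_i.
expected_roi = sum(probabilities[state] * roi[state] for state in probabilities)
print(f"Expected ROI: {expected_roi:.2f}%")
```

Swapping in the actual Table 1.1 payoffs would only require changing the `roi` dictionary.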

Savage (1972, p. 9) notes that the “world” in the statement “state of the world” is defined for the problem at hand and that you should not take it literally. It is a fluid concept. He states that it is “the object about which the person is concerned.” At the same time, the “state” of the world is a full description of its conditions. Savage (1972) notes that it is “a description of the world, leaving no relevant aspects undescribed.” But he also notes that there is a true state, a “state that does in fact obtain, i.e., the true description of the world.” Unfortunately, it is unknown, and so the best we can do until it is realized or revealed to us is assign probabilities to the occurrence of each state for decision making. These are the probabilities in Table 1.1. More importantly, the problem is that the true state is unknown and never will be known until it is revealed. No amount of information will ever completely and perfectly reveal this true state before it occurs.

Statistical Modelling for Business | The Data-Information Nexus

To an extent, discussing definitions and terminology is useful for the advancement of scientific and practical solutions for any problem. If you cannot agree on basic terms, then you are doomed at worst and hindered at best from making any progress toward a solution, a decision. You can, however, become so involved in defining terms and so overly concerned about terminology that nothing else matters. Popper argued, perhaps too strongly, that
“One should never quarrel about words, and never get involved in questions of terminology … What we are really interested in, our real problem, … are problems of theories and their truth.”
Popper, a philosopher of science, was concerned about scientific problems. The same sentiment, however, holds for practical problems like the ones you face daily in your business. Despite Popper’s preeminence, you still need some perspective on the foundational units that drive the raison d’être of BDA: data and information. If information is so important for reducing uncertainty, then a logical question to ask is: “What is information?” A subordinate, but equally important, question is:

The words information and data are used as synonyms in everyday conversations. It is not uncommon, for example, to hear a business manager claim in one instance that she has a lot of data and then say in the next instance that she has a lot of information, thus treating the two words as having the same meaning. In fact, the computer systems that manage data are referred to as Information Systems (IS) and the associated technology used in those systems is referred to as Information Technology (IT). The C-Level executive in charge of this data and IT infrastructure is the Chief Information Officer (CIO). Notice the repeated use of the word “information.”

Even though people use these two words interchangeably, it does not mean they have the same meaning. It is my contention, along with others, that data and information are distinct terms that are nonetheless connected. I will simply state that data are facts, objects that are true on their face, that have to be organized and manipulated to yield insight into something previously unknown. When managed and manipulated, they become information. The organization cannot be without the manipulation and the manipulation cannot be without the organization. The IT group of your business organizes your company’s data but it does not manipulate it to be information. The information is latent, hidden inside the data, and must be extracted so it can be used in a decision. I illustrate this connection in Fig. 1.2. I will comment on each component in the next few sections.

Statistical Modelling for Business | DATA603

Posted on December 14, 2022 by statistics-lab

Statistical Modelling for Business | Types of Business Problems

What types of business problems warrant BDA? The types are too numerous to mention, but to give a sense of them, consider a few examples:

Anomaly Detection: production surveillance, predictive maintenance, manufacturing yield optimization;
Fraud detection;
Identity theft;
Account and transaction anomalies;
Customer analytics:
Customer Relationship Management (CRM);
Churn analysis and prevention;
Customer Satisfaction;
Marketing cross-sell and up-sell;
Pricing: leakage monitoring, promotional effects tracking, competitive price responses;
Fulfillment: management and pipeline tracking;
Competitive monitoring;
Competitive Environment Analysis (CEA); and
New Product Development.
And the list goes on, and on.
A decision of some type is required for all these problems. New product development best exemplifies a complex decision process. Decisions are made throughout a product development pipeline. This is a series of stages from ideation or conceptualization to product launch and post-launch tracking. Paczkowski (2020) identifies five stages for a pipeline: ideation, design, testing, launch, and post-launch tracking. Decisions are made between each stage whether to proceed to the next one or abort development or even production. Each decision point is marked by a business case analysis that examines the expected revenue and market share for the product. Expected sales, anticipated price points (which are refined as the product moves through the pipeline), production and marketing cost estimates, and competitive analyses that include current products, sales, pricing, and promotions plus competitive responses to the proposed new product, are all needed for each business case assessment. If any of these has a negative implication for the concept, then it will be canceled and removed from the pipeline. Information is needed for each business case check point.

The expected revenue and market share are refined for each business case analysis as new and better information (not data) becomes available for the items I listed above. More data do become available, of course, as the product is developed, but it is the analysis of those data, based on methods described in this book, that provides the information needed to approve or not approve the advancement of the concept to the next stage in the pipeline. The first decision, for example, is simply to begin developing a new product. Someone has to say “Yes” to the question “Should we develop a new product?” The business case analysis provides that decision maker with the information for this initial “Go/No Go” decision. Similar decisions are made at other stages.

Another example is product pricing. This is actually a two-fold decision involving a structure (e.g., uniform pricing or price discrimination to mention two possibilities) and a level within the structure. These decisions are made throughout the product life cycle beginning at the development stage (the launch stage of the pipeline I discussed above) and then throughout the post-launch period until the product is ultimately removed from the market. The wrong price structure and/or level could cost your business lost profit, lost market share, or a lost business. See Paczkowski (2018) for a discussion of the role of pricing and the types of analysis for identifying the best price structure and level. Also see Paczkowski (2020) for new product development pricing at each stage of the pipeline.

Statistical Modelling for Business | The Role of Information in Business Decision Making

Decisions are effective if they solve a problem, such as those I discussed above, and aid rather than hinder your business in succeeding in the market. I will assume your business succeeds if it earns a profit and has a positive return for its owners (shareholders, partners, employees in an employee-owned company) or a sole owner. Information could be about

current sales;
future sales;
the state of the market;
consumer, social, and technology trends and developments;
customer needs and wants;
customer willingness-to-pay;
key customer segments;
financial developments;
supply chain developments; and
the size of customer churn.
This information is input into decisions and, like any input, if it is bad, then the decisions will be bad. Basically, the GIGO Principle (Garbage In, Garbage Out) holds. This should be obvious and almost trite. Unfortunately, you do not know when you make your decision whether your information is good or bad, or even sufficient. You face uncertainty due to the amount and quality of the information you have available.

Without any information you would just be guessing, and guessing is costly. In Fig. 1.1, I illustrate what happens to the cost of decisions based on the amount of information you have. Without any information, all your decisions are based on pure guesses, hunches, so you are forced to approximate their effect. The approximation could be very naive, based on gut instinct (i.e., an unfounded belief that you know everything) or what happened yesterday or in another business similar to yours (i.e., an analog business).

The cost of these approximations in terms of financial losses, lost market share, or outright bankruptcy can be very high. As the amount of information increases, however, you will have more insight so your approximations (i.e., guesses) improve and the cost of approximations declines. This is exactly what happens during the business case process I described above. More and better information helps the decision makers at each business case stage. The approximations could now be based on trends, statistically significant estimates of impact, or model-based what-if analyses. These are not “data”; they are information.

Statistical Modelling for Business | Describing Central Tendency

Posted on April 18, 2022 by statistics-lab

Statistical Modelling for Business | The mean, median, and mode

In addition to describing the shape of the distribution of a sample or population of measurements, we also describe the data set’s central tendency. A measure of central tendency represents the center or middle of the data. Sometimes we think of a measure of central tendency as a typical value. However, as we will see, not all measures of central tendency are necessarily typical values.

One important measure of central tendency for a population of measurements is the population mean. We define it as follows: the population mean, denoted $\mu$, is the average of the population measurements. More precisely, the population mean is calculated by adding all the population measurements and then dividing the resulting sum by the number of population measurements. For instance, suppose that Chris is a college junior majoring in business. This semester Chris is taking five classes and the numbers of students enrolled in the classes (that is, the class sizes) are as follows: 60, 41, 15, 30, and 34.

The mean $\mu$ of this population of class sizes is
$$
\mu=\frac{60+41+15+30+34}{5}=\frac{180}{5}=36
$$
Because this population of five class sizes is small, it is possible to compute the population mean. Often, however, a population is very large and we cannot obtain a measurement for each population element. Therefore, we cannot compute the population mean. In such a case, we must estimate the population mean by using a sample of measurements.
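
The population mean computed above can be checked with a couple of lines of Python. The five class sizes are the ones in the formula; the variable names are only illustrative.

```python
# Class sizes for Chris's five classes, taken from the formula above.
class_sizes = [60, 41, 15, 30, 34]

# Population mean: sum of all measurements divided by their count.
population_mean = sum(class_sizes) / len(class_sizes)
print(population_mean)  # 36.0
```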

In order to understand how to estimate a population mean, we must realize that the population mean is a population parameter.

Statistical Modelling for Business | The Car Mileage Case: Estimating Mileage

In order to offer its tax credit, the federal government has decided to define the “typical” EPA combined city and highway mileage for a car model as the mean $\mu$ of the population of EPA combined mileages that would be obtained by all cars of this type. Here, using the mean to represent a typical value is probably reasonable. We know that some individual cars will get mileages that are lower than the mean and some will get mileages that are above it. However, because there will be many thousands of these cars on the road, the mean mileage obtained by these cars is probably a reasonable way to represent the model’s overall fuel economy. Therefore, the government will offer its tax credit to any automaker selling a midsize model equipped with an automatic transmission that achieves a mean EPA combined mileage of at least $31 \mathrm{mpg}$.

To demonstrate that its new midsize model qualifies for the tax credit, the automaker in this case study wishes to use the sample of 50 mileages in Table $3.1$ to estimate $\mu$, the model’s mean mileage. Before calculating the mean of the entire sample of 50 mileages, we will illustrate the formulas involved by calculating the mean of the first five of these mileages.

Table $3.1$ tells us that $x_{1}=30.8, x_{2}=31.7, x_{3}=30.1, x_{4}=31.6$, and $x_{5}=32.1$, so the sum of the first five mileages is
$$
\begin{aligned}
\sum_{i=1}^{5} x_{i} &=x_{1}+x_{2}+x_{3}+x_{4}+x_{5} \\
&=30.8+31.7+30.1+31.6+32.1=156.3
\end{aligned}
$$
Therefore, the mean of the first five mileages is
$$
\bar{x}=\frac{\sum_{i=1}^{5} x_{i}}{5}=\frac{156.3}{5}=31.26
$$
Of course, intuitively, we are likely to obtain a more accurate point estimate of the population mean by using all of the available sample information. The sum of all 50 mileages can be verified to be
$$
\sum_{i=1}^{50} x_{i}=x_{1}+x_{2}+\cdots+x_{50}=30.8+31.7+\cdots+31.4=1578
$$
Therefore, the mean of the sample of 50 mileages is
$$
\bar{x}=\frac{\sum_{i=1}^{50} x_{i}}{50}=\frac{1578}{50}=31.56
$$
This point estimate says we estimate that the mean mileage that would be obtained by all of the new midsize cars that will or could potentially be produced this year is $31.56 \mathrm{mpg}$. Unless we are extremely lucky, however, there will be sampling error. That is, the point estimate $\bar{x}=31.56$ mpg, which is the average of the sample of fifty randomly selected mileages, will probably not exactly equal the population mean $\mu$, which is the average mileage that would be obtained by all cars. Therefore, although $\bar{x}=31.56$ provides some evidence that $\mu$ is at least 31 and thus that the automaker should get the tax credit, it does not provide definitive evidence. In later chapters, we discuss how to assess the reliability of the sample mean and how to use a measure of reliability to decide whether sample information provides definitive evidence.
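As a minimal sketch, the two sample means can be reproduced as follows. Only the first five mileages and the total of all 50 are quoted in the text, so the full-sample mean is computed from that total rather than from the individual values; the variable names are illustrative.

```python
# First five mileages from Table 3.1, as quoted in the text.
first_five = [30.8, 31.7, 30.1, 31.6, 32.1]
x_bar_5 = sum(first_five) / len(first_five)

# The text gives only the sum of all 50 mileages, not each value.
total_50 = 1578
n = 50
x_bar_50 = total_50 / n

print(round(x_bar_5, 2))   # 31.26
print(round(x_bar_50, 2))  # 31.56

# The point estimate exceeds the 31 mpg threshold, but because of
# sampling error this is evidence about mu, not proof.
print(x_bar_50 >= 31)      # True
```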

统计代写|商业分析作业代写Statistical Modelling for Business代考|Comparing the mean, median, and mode

Often we construct a histogram for a sample to make inferences about the shape of the sampled population. When we do this, it can be useful to “smooth out” the histogram and use the resulting relative frequency curve to describe the shape of the population. Relative frequency curves can have many shapes. Three common shapes are illustrated in Figure $3.3$. Part (a) of this figure depicts a population described by a symmetrical relative frequency curve. For such a population, the mean $(\mu)$, median $\left(M_{d}\right)$, and mode $\left(M_{o}\right)$ are all equal. Note that in this case all three of these quantities are located under the highest point of the curve. It follows that when the frequency distribution of a sample of measurements is approximately symmetrical, then the sample mean, median, and mode will be nearly the same. For instance, consider the sample of 50 mileages in Table $3.1$. Because the histogram of these mileages in Figure $3.2$ is approximately symmetrical, the mean (31.56) and the median (31.55) of the mileages are approximately equal to each other.

Figure $3.3$ (b) depicts a population that is skewed to the right. Here the population mean is larger than the population median, and the population median is larger than the population mode (the mode is located under the highest point of the relative frequency curve). In this case the population mean averages in the large values in the upper tail of the distribution. Thus the population mean is more affected by these large values than is the population median. To understand this, we consider the following example.
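A small numerical illustration of this ordering, using made-up income figures (they are assumptions for the sketch, not data from the text), relies on Python's standard `statistics` module:

```python
import statistics

# Hypothetical annual incomes in thousands of dollars; one very large
# value sits in the upper tail, so the data are skewed to the right.
incomes = [32, 35, 35, 38, 41, 44, 47, 52, 60, 250]

mean = statistics.mean(incomes)      # pulled upward by the value 250
median = statistics.median(incomes)  # middle of the ordered values
mode = statistics.mode(incomes)      # most frequent value

print(mean, median, mode)    # 63.4 42.5 35
print(mode < median < mean)  # True
```

Replacing 250 with a value near the others moves the mean a great deal and leaves the median almost unchanged, which is why the median is often preferred as a typical value for skewed data.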


统计代写|商业分析作业代写Statistical Modelling for Business代考|Misleading Graphs and Charts

Posted on April 18, 2022 by statistics-lab


统计代写|商业分析作业代写Statistical Modelling for Business代考|Optional

The statistical analyst’s goal should be to present the most accurate and truthful portrayal of a data set that is possible. Such a presentation allows managers using the analysis to make informed decisions. However, it is possible to construct statistical summaries that are misleading. Although we do not advocate using misleading statistics, you should be aware of some of the ways statistical graphs and charts can be manipulated in order to distort the truth. By knowing what to look for, you can avoid being misled by a (we hope) small number of unscrupulous practitioners.

As an example, suppose that the nurses at a large hospital will soon vote on a proposal to join a union. Both the union organizers and the hospital administration plan to distribute recent salary statistics to the entire nursing staff. Suppose that the mean nurses’ salary at the hospital and the mean nurses’ salary increase at the hospital (expressed as a percentage) for each of the last four years are as follows:

The hospital administration does not want the nurses to unionize and, therefore, hopes to convince the nurses that substantial progress has been made to increase salaries without a union. On the other hand, the union organizers wish to portray the salary increases as minimal so that the nurses will feel the need to unionize.

Figure $2.29$ gives two bar charts of the mean nurses’ salaries at the hospital for each of the last four years. Notice that in Figure $2.29$(a) the administration has started the vertical scale of the bar chart at a salary of $\$ 58,000$ by using a scale break. Alternatively, the chart could be set up without the scale break by simply starting the vertical scale at $\$ 58,000$. Starting the vertical scale at a value far above zero makes the salary increases look more dramatic. Notice that when the union organizers present the bar chart in Figure $2.29$(b), which has a vertical scale starting at zero, the salary increases look far less impressive.
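The effect of the starting point of the vertical scale can be checked with simple arithmetic. The sketch below compares how tall the last bar is drawn relative to the first bar under two baselines. The two salary figures are hypothetical, since the salary table is not reproduced here; only the $58,000 baseline comes from the text.

```python
def apparent_growth(first, last, baseline=0.0):
    """Ratio of the last bar's drawn height to the first bar's drawn height."""
    return (last - baseline) / (first - baseline)

# Hypothetical mean salaries for the first and fourth year.
first_year, fourth_year = 60_000, 64_000

# Vertical scale starting at zero: the bars differ by under 7 percent.
print(apparent_growth(first_year, fourth_year))          # about 1.07

# Vertical scale starting at $58,000: the last bar is drawn three times as tall.
print(apparent_growth(first_year, fourth_year, 58_000))  # 3.0
```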

Figure $2.30$ presents two bar charts of the mean nurses’ salary increases (in percentages) at the hospital for each of the last four years. In Figure $2.30$ (a), the administration has made the widths of the bars representing the percentage increases proportional to their heights. This makes the upward movement in the mean salary increases look more dramatic because the observer’s eye tends to compare the areas of the bars, while the improvements in the mean salary increases are really only proportional to the heights of the bars. When the union organizers present the bar chart of Figure $2.30$ (b), the improvements in the mean salary increases look less impressive because each bar has the same width.
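When the width of each bar is proportional to its height, the area of a bar grows with the square of the value it represents. A minimal sketch, with two hypothetical percentage increases chosen only for illustration:

```python
def height_ratio(small, large):
    """How much taller the larger bar is drawn."""
    return large / small

def area_ratio_with_proportional_width(small, large):
    """Area ratio when each bar's width is proportional to its height."""
    return (large / small) ** 2

# Hypothetical mean salary increases of 3 percent and 6 percent.
print(height_ratio(3.0, 6.0))                        # 2.0
print(area_ratio_with_proportional_width(3.0, 6.0))  # 4.0
```

The increase doubled, yet the eye compares areas and sees a bar four times as large. With equal-width bars the area ratio equals the height ratio, so no such exaggeration occurs.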

Figure $2.31$ gives two time series plots of the mean nurses’ salary increases at the hospital for the last four years. In Figure $2.31$(a) the administration has stretched the vertical axis of the graph. That is, the vertical axis is set up so that the distances between the percentages are large. This makes the upward trend of the mean salary increases appear to be steep. In Figure $2.31$(b) the union organizers have compressed the vertical axis (that is, the distances between the percentages are small). This makes the upward trend of the mean salary increases appear to be gradual. As we will see in the exercises, stretching and compressing the horizontal axis in a time series plot can also greatly affect the impression given by the plot.

It is also possible to create totally different interpretations of the same statistical summary by simply using different labeling or captions. For example, consider the bar chart of mean nurses’ salary increases in Figure 2.30(b). To create a favorable interpretation, the hospital administration might use the caption “Salary Increase Is Higher for the Fourth Year in a Row.” On the other hand, the union organizers might create a negative impression by using the caption “Salary Increase Fails to Reach $10 \%$ for Fourth Straight Year.”

In summary, it is important to carefully study any statistical summary so that you will not be misled. Look for manipulations such as stretched or compressed axes on graphs, axes that do not begin at zero, bar charts with bars of varying widths, and biased captions. Doing these things will help you to see the truth and to make well-informed decisions.

统计代写|商业分析作业代写Statistical Modelling for Business代考|Descriptive Analytics

In Section $1.5$ we said that descriptive analytics uses traditional and more recently developed graphics to present to executives (and sometimes customers) easy-to-understand visual summaries of up-to-the-minute information concerning the operational status of a business. In this section we will discuss some of the more recently developed graphics used by descriptive analytics, which include gauges, bullet graphs, treemaps, and sparklines. In addition, we will see how they are used with each other and more traditional graphics to form analytic dashboards, which are part of executive information systems. We will also briefly discuss data discovery, which involves, in its simplest form, data drill down.

Dashboards and gauges. An analytic dashboard provides a graphical presentation of the current status and historical trends of a business’s key performance indicators. The term dashboard originates from the automotive dashboard, which helps a driver monitor a car’s key functions. Figure $2.35$ shows a dashboard that graphically portrays some key performance indicators for a (fictitious) airline for last year. In the lower left-hand portion of the dashboard we see three gauges depicting the percentage utilizations of the regional, short-haul, and international fleets of the airline. In general, a gauge (chart) allows us to visualize data in a way that is similar to a real-life speedometer needle on an automobile. The outer scale of the gauge is often color coded to provide additional performance information. For example, note that the colors on the outer scale of the gauges in Figure $2.35$ range from red to dark green to light green. These colors signify percentage fleet utilizations that range from poor (less than 75 percent) to satisfactory (less than 90 percent but at least 75 percent) to good (at least 90 percent).

Bullet graphs. While gauge charts are nice looking, they take up considerable space and to some extent are cluttered. For example, recalling the Disney Parks Case, using seven gauge charts for the seven Epcot rides would take up considerable space and would be less efficient than using the bullet graph that we originally showed in Figure 1.8. That bullet graph (which has been obtained using Excel) is shown again in Figure 2.36.

In general, a bullet graph features a single measure (for example, predicted waiting time) and displays it as a horizontal bar (or a vertical bar) that extends into ranges representing qualitative measures of performance, such as poor, satisfactory, and good. The ranges are displayed as either different colors or as varying intensities of a single hue. Using a single hue makes them discernible by those who are color blind and restricts the use of color if the bullet graph is part of a dashboard that we don’t want to look too “busy.” Many bullet graphs compare the single primary measure to a target, or objective, which is represented by a symbol on the bullet graph. The bullet graph of Disney’s predicted waiting times uses five colors ranging from dark green to red and signifying short (0 to 20 minutes) to very long (80 to 100 minutes) predicted waiting times. This bullet graph does not compare the predicted waiting times to an objective. However, the bullet graphs located in the upper left of the dashboard in Figure $2.35$ (representing the percentages of on-time arrivals and departures for the airline) do display objectives represented by short vertical black lines. For example, consider the bullet graphs representing the percentages of on-time arrivals and departures in the Midwest, which are shown below.
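The qualitative ranges behind a gauge or bullet graph amount to a set of threshold rules. As a minimal sketch, the function below applies the fleet-utilization thresholds stated for the gauges in Figure $2.35$ (poor below 75 percent, satisfactory from 75 up to 90 percent, good at 90 percent or more). The function name is an illustrative assumption.

```python
def utilization_rating(percent):
    """Map a fleet utilization percentage to its qualitative range."""
    if percent < 75:
        return "poor"          # less than 75 percent
    if percent < 90:
        return "satisfactory"  # at least 75 but less than 90 percent
    return "good"              # at least 90 percent

for value in (60, 75, 89, 90):
    print(value, utilization_rating(value))
```

A bullet graph with five ranges, such as the one for predicted waiting times, would use the same idea with four cut points instead of two.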

统计代写|商业分析作业代写Statistical Modelling for Business代考|sparkline

Sparklines. Figure $2.38$ shows sparklines depicting the monthly closing prices of five stocks over six months. In general, a sparkline, another of the new descriptive analytics graphics, is a line chart that presents the general shape of the variation (usually over time) in some measurement, such as temperature or a stock price. A sparkline is typically drawn without axes or coordinates and made small enough to be embedded in text. Sparklines are often grouped together so that comparative variations can be seen. Whereas most charts are designed to show considerable information and are set off from the main text, sparklines are intended to be succinct and located where they are discussed. Therefore, if the sparklines in Figure $2.38$ were located in the text of a report comparing stocks, they would probably be made smaller than shown in Figure $2.38$, and the actual monthly closing prices of the stocks used to obtain the sparklines would probably not be shown next to the sparklines.

Data discovery. To conclude this section, we briefly discuss data discovery methods, which allow decision makers to interactively view data and make preliminary analyses. One simple version of data discovery is data drill down, which reveals more detailed data that underlie a higher-level summary. For example, an online presentation of the airline dashboard in Figure $2.35$ might allow airline executives to drill down by clicking the gauge which shows that approximately 89 percent of the airline’s international fleet was utilized over the year to see the bar chart in Figure 2.39. This bar chart depicts the quarterly percentage utilizations of the airline’s international fleet over the year. The bar chart suggests that the airline might offer international travel specials in quarter 1 (Winter) and quarter 4 (Fall) to increase international travel and thus increase the percentage of the international fleet utilized in those quarters. Drill down can be done at multiple levels. For example, Disney executives might wish to have weekly reports of total merchandise sales at its four Orlando parks, which can be drilled down to reveal total merchandise sales at each park, which can be further drilled down to reveal total merchandise sales at each “land” in each park, which can be yet further drilled down to reveal total merchandise sales in each store in each land in each park.
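Multi-level drill down can be thought of as regrouping the same detailed records at successively finer levels. The sketch below is one possible way to express that idea; the park, land, and store names and the sales amounts are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical weekly merchandise sales: (park, land, store, sales in dollars).
sales = [
    ("Park A", "Land 1", "Store X", 1200),
    ("Park A", "Land 1", "Store Y", 800),
    ("Park A", "Land 2", "Store Z", 500),
    ("Park B", "Land 3", "Store W", 900),
]

def drill_down(records, depth):
    """Total sales grouped by the first `depth` levels of (park, land, store)."""
    totals = defaultdict(int)
    for *levels, amount in records:
        totals[tuple(levels[:depth])] += amount
    return dict(totals)

print(drill_down(sales, 0))  # overall total: {(): 3400}
print(drill_down(sales, 1))  # totals for each park
print(drill_down(sales, 2))  # totals for each land in each park
print(drill_down(sales, 3))  # totals for each store in each land in each park
```

Each deeper level splits the totals of the level above it, so the totals at any depth always add up to the same overall figure.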


统计代写|商业分析作业代写Statistical Modelling for Business代考| Contingency Tables (Optional)

Posted on April 18, 2022 by statistics-lab


统计代写|商业分析作业代写Statistical Modelling for Business代考|The Brokerage Firm Case: Studying Client Satisfaction

Previous sections in this chapter have presented methods for summarizing data for a single variable. Often, however, we wish to use statistics to study possible relationships between several variables. In this section we present a simple way to study the relationship between two variables. Crosstabulation is a process that classifies data on two dimensions. This process results in a table that is called a contingency table. Such a table consists of rows and columns: the rows classify the data according to one dimension and the columns classify the data according to a second dimension. Together, the rows and columns represent all possibilities (or contingencies).

An investment broker sells several kinds of investment products: a stock fund, a bond fund, and a tax-deferred annuity. The broker wishes to study whether client satisfaction with its products and services depends on the type of investment product purchased. To do this, 100 of the broker’s clients are randomly selected from the population of clients who have purchased shares in exactly one of the funds. The broker records the fund type purchased by each client and has one of its investment counselors personally contact the client. When contacted, the client is asked to rate his or her level of satisfaction with the purchased fund as high, medium, or low. The resulting data are given in Table 2.16.

Looking at the raw data in Table $2.16$, it is difficult to see whether the level of client satisfaction varies depending on the fund type. We can look at the data in an organized way by constructing a contingency table. A crosstabulation of fund type versus level of client satisfaction is shown in Table $2.17$. The classification categories for the two variables are defined along the left and top margins of the table. The three row labels (bond fund, stock fund, and tax-deferred annuity) define the three fund categories and are given in the left table margin. The three column labels (high, medium, and low) define the three levels of client satisfaction and are given along the top table margin. Each row and column combination, that is, each fund type and level of satisfaction combination, defines what we call a “cell” in the table. Because each of the randomly selected clients has invested in exactly one fund type and has reported exactly one level of satisfaction, each client can be placed in a particular cell in the contingency table. For example, because client number 1 in Table $2.16$ has invested in the bond fund and reports a high level of client satisfaction, client number 1 can be placed in the upper left cell of the table (the cell defined by the Bond Fund row and High Satisfaction column).
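Placing each client in a cell is a counting exercise. The following minimal sketch builds a small contingency table from raw records. The six records are hypothetical stand-ins for the rows of Table $2.16$ (only the first, a bond fund client with high satisfaction, mirrors the text), and the variable names are illustrative.

```python
from collections import Counter

# Hypothetical (fund type, satisfaction) records, one per client.
clients = [
    ("bond", "high"),
    ("stock", "high"),
    ("annuity", "medium"),
    ("bond", "medium"),
    ("stock", "high"),
    ("annuity", "low"),
]

# Each distinct (fund, satisfaction) pair is one cell of the table.
cells = Counter(clients)

funds = ["bond", "stock", "annuity"]   # row labels
levels = ["high", "medium", "low"]     # column labels
table = [[cells[(f, s)] for s in levels] for f in funds]

for fund, row in zip(funds, table):
    print(fund, row, "row total:", sum(row))
```

Because every client falls in exactly one cell, the cell counts add up to the number of clients.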

统计代写|商业分析作业代写Statistical Modelling for Business代考|column percentages

One good way to investigate relationships such as these is to compute row percentages and column percentages. We compute row percentages by dividing each cell’s frequency by its corresponding row total and by expressing the resulting fraction as a percentage. For instance, the row percentage for the upper left-hand cell (bond fund and high level of satisfaction) in Table $2.17$ is $(15 / 30) \times 100 \%=50 \%$. Similarly, column percentages are computed by dividing each cell’s frequency by its corresponding column total and by expressing the resulting fraction as a percentage. For example, the column percentage for the upper left-hand cell in Table $2.17$ is $(15 / 40) \times 100 \%=37.5 \%$. Table $2.18$ summarizes all of the row percentages for the different fund types in Table 2.17. We see that each row in Table $2.18$ gives a percentage frequency distribution of level of client satisfaction given a particular fund type.

For example, the first row in Table $2.18$ gives a percent frequency distribution of client satisfaction for investors who have purchased shares in the bond fund. We see that 50 percent of bond fund investors report high satisfaction, while 40 percent of these investors report medium satisfaction, and only 10 percent report low satisfaction. The other rows in Table $2.18$ provide percent frequency distributions of client satisfaction for stock fund and annuity purchasers.

All three percent frequency distributions of client satisfaction (for the bond fund, the stock fund, and the tax-deferred annuity) are illustrated using bar charts in Figure $2.23$. In this figure, the bar heights for each chart are the respective row percentages in Table $2.18$. For example, these distributions tell us that 80 percent of stock fund investors report high satisfaction, while $97.5$ percent of tax-deferred annuity purchasers report medium or low satisfaction. Looking at the entire table of row percentages (or the bar charts in Figure 2.23), we might conclude that stock fund investors are highly satisfied, that bond fund investors are quite satisfied (but somewhat less so than stock fund investors), and that tax-deferred annuity purchasers are less satisfied than either stock fund or bond fund investors. In general, row percentages and column percentages help us to quantify relationships such as these.
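Row and column percentages can be sketched as follows. The bond fund row and the high-satisfaction column follow from the percentages quoted in the text. The split between medium and low for the stock fund and the annuity is an assumption chosen to agree with those percentages, since Table $2.17$ is not reproduced here.

```python
levels = ["high", "medium", "low"]
counts = {
    "bond":    [15, 12, 3],   # 50%, 40%, 10% of 30 clients (from the text)
    "stock":   [24, 4, 2],    # 80% high (text); medium/low split assumed
    "annuity": [1, 24, 15],   # 97.5% medium or low (text); split assumed
}

def row_percentages(table):
    """Each cell divided by its row total, as a percentage."""
    return {fund: [100 * c / sum(row) for c in row]
            for fund, row in table.items()}

def column_percentages(table):
    """Each cell divided by its column total, as a percentage."""
    col_totals = [sum(col) for col in zip(*table.values())]
    return {fund: [100 * c / t for c, t in zip(row, col_totals)]
            for fund, row in table.items()}

print(row_percentages(counts)["bond"])        # [50.0, 40.0, 10.0]
print(column_percentages(counts)["bond"][0])  # 37.5
```

Each row of `row_percentages` sums to 100 and describes satisfaction given a fund type; each column of `column_percentages` sums to 100 and describes fund type given a satisfaction level.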

In the investment example, we have cross-tabulated two qualitative variables. We can also cross-tabulate a quantitative variable versus a qualitative variable or two quantitative variables against each other. If we are cross-tabulating a quantitative variable, we often define categories by using appropriate ranges. For example, if we wished to cross-tabulate level of education (grade school, high school, college, graduate school) versus income, we might define income classes $\$ 0$ to $\$ 50,000$, $\$ 50,001$ to $\$ 100,000$, $\$ 100,001$ to $\$ 150,000$, and above $\$ 150,000$.
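Turning a quantitative variable into categories is a lookup against the class boundaries. A minimal sketch using the income classes named above, assuming incomes are nonnegative whole-dollar amounts (the function and constant names are illustrative):

```python
from bisect import bisect_left

# Upper bound of each bounded income class, in dollars.
UPPER_BOUNDS = [50_000, 100_000, 150_000]
LABELS = [
    "$0-$50,000",
    "$50,001-$100,000",
    "$100,001-$150,000",
    "above $150,000",
]

def income_class(income):
    """Return the label of the income class that contains `income`."""
    # bisect_left counts the bounds strictly below `income`, so a value
    # equal to an upper bound stays in the lower class.
    return LABELS[bisect_left(UPPER_BOUNDS, income)]

for value in (0, 50_000, 50_001, 150_000, 150_001):
    print(value, income_class(value))
```

Once each income is replaced by its class label, the variable can be cross-tabulated against level of education exactly as the two qualitative variables were.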

统计代写|商业分析作业代写Statistical Modelling for Business代考|Scatter Plots

We often study relationships between variables by using graphical methods. A simple graph that can be used to study the relationship between two variables is called a scatter plot. As an example, suppose that a marketing manager wishes to investigate the relationship between the sales volume (in thousands of units) of a product and the amount spent (in units of $\$ 10,000$ ) on advertising the product. To do this, the marketing manager randomly selects 10 sales regions having equal sales potential. The manager assigns a different level of advertising expenditure for January 2016 to each sales region as shown in Table 2.20. At the end of the month, the sales volume for each region is recorded as also shown in Table $2.20$.

A scatter plot of these data is given in Figure 2.24. To construct this plot, we place the variable advertising expenditure (denoted $x$) on the horizontal axis and we place the variable sales volume (denoted $y$) on the vertical axis. For the first sales region, advertising expenditure equals 5 and sales volume equals 89. We plot the point with coordinates $x=5$ and $y=89$ on the scatter plot to represent this sales region. Points for the other sales regions are plotted similarly. The scatter plot shows that there is a positive relationship between advertising expenditure and sales volume; that is, higher values of sales volume are associated with higher levels of advertising expenditure.

We have drawn a straight line through the plotted points of the scatter plot to represent the relationship between advertising expenditure and sales volume. We often do this when the relationship between two variables appears to be a straight line, or linear, relationship. Of course, the relationship between $x$ and $y$ in Figure $2.24$ is not perfectly linear: not all of the points in the scatter plot are exactly on the line. Nevertheless, because the relationship between $x$ and $y$ appears to be approximately linear, it seems reasonable to represent the general relationship between these variables using a straight line. In future chapters we will explain ways to quantify such a relationship, that is, describe such a relationship numerically. Moreover, not all linear relationships between two variables $x$ and $y$ are positive linear relationships (that is, have a positive slope). For example, Table $2.21$ gives the average hourly outdoor temperature $(x)$ in a city during a week and the city’s natural gas consumption $(y)$ during the week for each of the previous eight weeks. The temperature readings are expressed in degrees Fahrenheit and the natural gas consumptions are expressed in millions of cubic feet of natural gas. The scatter plot in Figure $2.25$ shows that there is a negative linear relationship between $x$ and $y$; that is, as average hourly temperature in the city increases, the city’s natural gas consumption decreases in a linear fashion. Finally, not all relationships are linear. In Chapter 14 we will consider how to represent and quantify curved relationships, and, as illustrated in Figure $2.26$, there are situations in which two variables $x$ and $y$ do not appear to have any relationship.
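One simple numerical check on the direction of a linear relationship is the sign of the sample covariance. This is offered only as a sketch and is not necessarily the measure developed in later chapters. The data are hypothetical: Tables $2.20$ and $2.21$ are not reproduced here, and only the first advertising point $(5, 89)$ comes from the text.

```python
def sample_covariance(xs, ys):
    """Average product of deviations from the means, with divisor n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def direction(xs, ys):
    """Classify the linear relationship by the sign of the covariance."""
    cov = sample_covariance(xs, ys)
    if cov > 0:
        return "positive"
    if cov < 0:
        return "negative"
    return "none"

# Hypothetical advertising expenditure (units of $10,000) and sales volume.
ad_spend = [5, 6, 7, 8, 9]
sales_volume = [89, 87, 98, 110, 103]

# Hypothetical weekly temperature (degrees F) and natural gas consumption.
temperature = [30, 40, 50, 60]
gas_use = [12.0, 10.5, 9.0, 7.5]

print(direction(ad_spend, sales_volume))  # positive
print(direction(temperature, gas_use))    # negative
```

The sign tells only the direction of the relationship. It says nothing about how closely the points follow a line, so it complements a scatter plot rather than replacing it.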

To conclude this section, recall from Chapter 1 that a time series plot (also called a runs plot) is a plot of individual process measurements versus time. This implies that a time series plot is a scatter plot, where values of a process variable are plotted on the vertical axis versus corresponding values of time on the horizontal axis.

金融中的随机方法代写

统计代写|商业分析作业代写Statistical Modelling for Business代考|The Brokerage Firm Case: Studying Client Satisfaction

本章的前几节介绍了汇总单个变量数据的方法。然而，我们通常希望使用统计数据来研究几个变量之间可能存在的关系。在本节中，我们提出了一种研究两个变量之间关系的简单方法。交叉制表是在两个维度上对数据进行分类的过程。此过程产生一个称为列联表的表。这样的表由行和列组成——行根据一维对数据进行分类，列根据第二维对数据进行分类。行和列一起代表所有可能性（或意外情况）。投资经纪人销售几种投资产品——股票基金、债券基金和延税年金。经纪商希望研究客户对其产品和服务的满意度是否取决于所购买的投资产品类型。为此，经纪人的 100 名客户是从拥有

购买了其中一只基金的股票。经纪人记录每个客户购买的基金类型，并让其一名投资顾问亲自与客户联系。联系时，客户被要求将他或她对所购买基金的满意度评定为高、中或低。结果数据在表 2.16 中给出。

查看表中的原始数据2.16，很难看出客户满意度是否因基金类型而异。我们可以通过构建列联表以有组织的方式查看数据。表中显示了基金类型与客户满意度的交叉表2.17. 两个变量的分类类别沿表格的左边缘和上边缘定义。三行标签——债券基金、股票基金和延税年金——定义了三个基金类别，并在左表边距中给出。三列标签（高、中和低）定义了客户满意度的三个级别，并沿顶部表格边缘给出。每个行和列的组合，即每个基金类型和满意度组合，定义了我们所说的表格中的“单元格”。因为每个随机选择的客户都只投资了一种基金类型并报告了一种满意程度，所以每个客户都可以放在列联表的特定单元格中。例如，因为表中的客户编号 12.16已投资债券基金并报告客户满意度高，客户编号 1 可放置在表格的左上角单元格（由债券基金行和高满意度列定义的单元格）。

统计代写|商业分析作业代写Statistical Modelling for Business代考|column percentages

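One good way to investigate relationships such as this is to compute row percentages and column percentages. We compute row percentages by dividing each cell’s frequency by its corresponding row total and by expressing the resulting fraction as a percentage. For instance, the row percentage for the upper left cell (bond fund and high level of satisfaction) in Table 2.17 is $(15/30) \times 100\% = 50\%$. Similarly, column percentages are computed by dividing each cell’s frequency by its corresponding column total and by expressing the resulting fraction as a percentage. For example, the column percentage for the upper left cell in Table 2.17 is $(15/40) \times 100\% = 37.5\%$. Table 2.18 summarizes all of the row percentages for the different fund types in Table 2.17. We see that each row of Table 2.18 gives a percent frequency distribution of client satisfaction given a particular fund type.

For example, the first row of Table 2.18 gives a percent frequency distribution of client satisfaction for investors who have purchased shares in the bond fund. We see that 50 percent of bond fund investors report high satisfaction, while 40 percent of these investors report medium satisfaction, and only 10 percent report low satisfaction. The other rows of Table 2.18 provide percent frequency distributions of client satisfaction for stock fund and annuity purchasers.

All three percent frequency distributions of client satisfaction (for the bond fund, the stock fund, and the tax-deferred annuity) are illustrated using bar charts in Figure 2.23. In this figure, the bar heights for each chart are the respective row percentages in Table 2.18. For example, these distributions tell us that 80 percent of stock fund investors report high satisfaction, while 97.5 percent of tax-deferred annuity purchasers report medium or low satisfaction. Looking at the entire table of row percentages (or the bar charts in Figure 2.23), we might conclude that stock fund investors are highly satisfied, that bond fund investors are quite satisfied (but somewhat less so than stock fund investors), and that tax-deferred annuity purchasers are less satisfied than either stock fund or bond fund investors. In general, row percentages and column percentages help us to quantify relationships such as these.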
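The row and column percentage calculations can be sketched in a few lines of Python. The bond fund row (15, 12, 3) and the first-column total of 40 follow from the percentages quoted in the text. The split of the stock fund and annuity rows between medium and low satisfaction is an assumption made for illustration, chosen only to be consistent with the quoted 80 percent and 97.5 percent figures.

```python
# Cell counts by fund type; columns are High, Medium, Low satisfaction.
# Cells not implied by the text are assumed for illustration.
levels = ["High", "Medium", "Low"]
table = {
    "Bond fund": [15, 12, 3],
    "Stock fund": [24, 4, 2],             # medium/low split assumed
    "Tax-deferred annuity": [1, 24, 15],  # medium/low split assumed
}

def row_percentages(counts_by_row):
    """Divide each cell by its row total and express the result as a percentage."""
    return {
        row: [100 * cell / sum(cells) for cell in cells]
        for row, cells in counts_by_row.items()
    }

def column_percentages(counts_by_row):
    """Divide each cell by its column total and express the result as a percentage."""
    column_totals = [sum(column) for column in zip(*counts_by_row.values())]
    return {
        row: [100 * cell / total for cell, total in zip(cells, column_totals)]
        for row, cells in counts_by_row.items()
    }

row_pct = row_percentages(table)
col_pct = column_percentages(table)
```

Here `row_pct["Bond fund"]` reproduces the 50, 40, and 10 percent figures, and `col_pct["Bond fund"][0]` reproduces the 37.5 percent column percentage for the upper left cell.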

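In the investment example, we have cross-tabulated two qualitative variables. We can also cross-tabulate a quantitative variable versus a qualitative variable, or two quantitative variables against each other. If we are cross-tabulating a quantitative variable, we often define categories by using appropriate ranges. For example, if we wished to cross-tabulate level of education (grade school, high school, college, graduate school) versus income, we might define the income classes \$0 to \$50,000, \$50,001 to \$100,000, \$100,001 to \$150,000, and above \$150,000.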

统计代写|商业分析作业代写Statistical Modelling for Business代考|Scatter Plots

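We often study relationships between variables by using graphical methods. A simple graph that can be used to study the relationship between two variables is called a scatter plot. As an example, suppose that a marketing manager wishes to investigate the relationship between the sales volume (in thousands of units) of a product and the amount spent (in units of \$10,000) on advertising the product. To do this, the marketing manager randomly selects 10 sales regions having equal sales potential. The manager assigns a different level of advertising expenditure for January 2016 to each sales region, as shown in Table 2.20. At the end of the month, the sales volume for each region is recorded, as also shown in Table 2.20.

A scatter plot of these data is given in Figure 2.24. To construct this plot, we place the variable advertising expenditure (denoted $x$) on the horizontal axis and the variable sales volume (denoted $y$) on the vertical axis. For the first sales region, advertising expenditure equals 5 and sales volume equals 89. We plot the point with coordinates $x=5$ and $y=89$ on the scatter plot to represent this sales region. Points for the other sales regions are plotted similarly. The scatter plot shows that there is a positive relationship between advertising expenditure and sales volume; that is, higher values of sales volume are associated with higher levels of advertising expenditure.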


统计代写|商业分析作业代写Statistical Modelling for Business代考|Cumulative distributions and ogives

Posted on April 18, 2022 by statistics-lab


Another way to summarize a distribution is to construct a cumulative distribution. To do this, we use the same number of classes, the same class lengths, and the same class boundaries that we have used for the frequency distribution of a data set. However, in order to construct a cumulative frequency distribution, we record for each class the number of measurements that are less than the upper boundary of the class. To illustrate this idea, Table 2.10 gives the cumulative frequency distribution of the payment time distribution summarized in Table 2.7 (page 64). Columns (1) and (2) in this table give the frequency distribution of the payment times. Column (3) gives the cumulative frequency for each class. To see how these values are obtained, the cumulative frequency for the class $10<13$ is the number of payment times less than 13. This is obviously the frequency for the class $10<13$, which is 3. The cumulative frequency for the class $13<16$ is the number of payment times less than 16, which is obtained by adding the frequencies for the first two classes, that is, $3+14=17$. The cumulative frequency for the class $16<19$ is the number of payment times less than 19, that is, $3+14+23=40$. We see that, in general, a cumulative frequency is obtained by summing the frequencies of all classes representing values less than the upper boundary of the class.

Column (4) gives the cumulative relative frequency for each class, which is obtained by summing the relative frequencies of all classes representing values less than the upper boundary of the class. Or, more simply, this value can be found by dividing the cumulative frequency for the class by the total number of measurements in the data set. For instance, the cumulative relative frequency for the class $19<22$ is $52/65 = .8$. Column (5) gives the cumulative percent frequency for each class, which is obtained by summing the percent frequencies of all classes representing values less than the upper boundary of the class. More simply, this value can be found by multiplying the cumulative relative frequency of a class by 100. For instance, the cumulative percent frequency for the class $19<22$ is $.8 \times 100 = 80$ percent.

As an example of interpreting Table 2.10, 60 of the 65 payment times are 24 days or less or, equivalently, 92.31 percent of the payment times (or a fraction of .9231 of the payment times) are 24 days or less. Also, notice that the last entry in the cumulative frequency distribution is the total number of measurements (here, 65 payment times). In addition, the last entry in the cumulative relative frequency distribution is 1.0 and the last entry in the cumulative percent frequency distribution is 100%. In general, for any data set, these last entries will be, respectively, the total number of measurements, 1.0, and 100%.
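The three cumulative columns can be sketched as running sums. The first five class frequencies below (3, 14, 23, 12, 8) follow from the cumulative values quoted in the text. The text does not give the last two, so the split of the remaining 5 payment times into 4 and 1 is an assumption.

```python
from itertools import accumulate

classes = ["10<13", "13<16", "16<19", "19<22", "22<25", "25<28", "28<31"]
# The last two frequencies are assumed; the text only implies that they sum to 5.
frequencies = [3, 14, 23, 12, 8, 4, 1]

# Running totals: number of measurements below each upper class boundary.
cumulative_frequency = list(accumulate(frequencies))
n = cumulative_frequency[-1]

# Divide by the total number of measurements, then scale to percent.
cumulative_relative = [c / n for c in cumulative_frequency]
cumulative_percent = [100 * r for r in cumulative_relative]

for label, cf, cr, cp in zip(classes, cumulative_frequency,
                             cumulative_relative, cumulative_percent):
    print(f"{label:>6}  {cf:3d}  {cr:.4f}  {cp:6.2f}%")
```

The final row always ends at the total count, 1.0, and 100 percent, which is a convenient sanity check on any cumulative table.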

An ogive (pronounced “oh-jive”) is a graph of a cumulative distribution. To construct a frequency ogive, we plot a point above each upper class boundary at a height equal to the cumulative frequency of the class. We then connect the plotted points with line segments. A similar graph can be drawn using the cumulative relative frequencies or the cumulative percent frequencies. As an example, Figure 2.14 gives a percent frequency ogive of the payment times. Looking at this figure, we see that, for instance, a little more than 25 percent (actually, 26.15 percent according to Table 2.10) of the payment times are less than 16 days, while 80 percent of the payment times are less than 22 days. Also notice that we have completed the ogive by plotting an additional point at the lower boundary of the first (leftmost) class at a height equal to zero. This depicts the fact that none of the payment times is less than 10 days. Finally, the ogive graphically shows that all (100 percent) of the payment times are less than 31 days.
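As a rough sketch, the points of a percent frequency ogive can be computed and then read off by linear interpolation along the connecting line segments. The helper name `percent_below` is hypothetical, and the last two class frequencies are assumed (the text only implies that they sum to 5).

```python
from itertools import accumulate

boundaries = [10, 13, 16, 19, 22, 25, 28, 31]
frequencies = [3, 14, 23, 12, 8, 4, 1]  # last two values assumed

n = sum(frequencies)
# Height 0 at the lower boundary of the first class, then the cumulative
# percent frequency above each upper class boundary.
heights = [0.0] + [100 * c / n for c in accumulate(frequencies)]
ogive_points = list(zip(boundaries, heights))

def percent_below(t):
    """Read the ogive at time t by interpolating along its line segments."""
    if t <= boundaries[0]:
        return 0.0
    if t >= boundaries[-1]:
        return 100.0
    for (x0, y0), (x1, y1) in zip(ogive_points, ogive_points[1:]):
        if x0 <= t <= x1:
            return y0 + (y1 - y0) * (t - x0) / (x1 - x0)
```

At a class boundary the reading equals the tabulated cumulative percent; between boundaries it is only an approximation, because the ogive assumes measurements are spread evenly within each class.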

统计代写|商业分析作业代写Statistical Modelling for Business代考|Dot Plots

A very simple graph that can be used to summarize a data set is called a dot plot. To make a dot plot we draw a horizontal axis that spans the range of the measurements in the data set. We then place dots above the horizontal axis to represent the measurements. As an example, Figure 2.18(a) shows a dot plot of the exam scores in Table 2.8 (page 68). Remember, these are the scores for the first exam given before implementing a strict attendance policy. The horizontal axis spans exam scores from 30 to 100. Each dot above the axis represents an exam score. For instance, the two dots above the score of 90 tell us that two students received a 90 on the exam. The dot plot shows us that there are two concentrations of scores: those in the 80s and 90s and those in the 60s. Figure 2.18(b) gives a dot plot of the scores on the second exam (which was given after imposing the attendance policy). As did the percent frequency polygon for Exam 2 in Figure 2.13 (page 69), this second dot plot shows that the attendance policy eliminated the concentration of scores in the 60s.
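A text-only dot plot can be sketched by stacking one dot per observation above each value on the axis. The function name `dot_plot` and the scores are hypothetical; the actual entries of Table 2.8 are not reproduced here.

```python
from collections import Counter

def dot_plot(values, low, high):
    """Return a text dot plot with one column per integer from low to high."""
    counts = Counter(values)
    tallest = max(counts.values())
    lines = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(tallest, 0, -1):
        lines.append("".join(
            "." if counts.get(v, 0) >= level else " "
            for v in range(low, high + 1)
        ))
    lines.append("-" * (high - low + 1))  # horizontal axis
    return "\n".join(lines)

# Hypothetical exam scores, for illustration only.
scores = [32, 63, 63, 65, 90, 90, 95]
plot = dot_plot(scores, 30, 100)
print(plot)
```

Isolated dots far from the main clusters, such as the hypothetical 32 here, are the candidates for outliers.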

Dot plots are useful for detecting outliers, which are unusually large or small observations that are well separated from the remaining observations. For example, the dot plot for exam 1 indicates that the score 32 seems unusually low. How we handle an outlier depends on its cause. If the outlier results from a measurement error or an error in recording or processing the data, it should be corrected. If such an outlier cannot be corrected, it should be discarded. If an outlier is not the result of an error in measuring or recording the data, its cause may reveal important information. For example, the outlying exam score of 32 convinced the author that the student needed a tutor. After working with a tutor, the student showed considerable improvement on Exam 2. A more precise way to detect outliers is presented in Section 3.3.

统计代写|商业分析作业代写Statistical Modelling for Business代考|back-to-back stem-and-leaf display

If we wish to compare two distributions, it is convenient to construct a back-to-back stem-and-leaf display. Figure 2.20 presents a back-to-back stem-and-leaf display for the previously discussed exam scores. The left side of the display summarizes the scores for the first exam. Remember, this exam was given before implementing a strict attendance policy. The right side of the display summarizes the scores for the second exam (which was given after imposing the attendance policy). Looking at the left side of the display, we see that for the first exam there are two concentrations of scores: those in the 80s and 90s and those in the 60s. The right side of the display shows that the attendance policy eliminated the concentration of scores in the 60s and illustrates that the scores on exam 2 are almost single peaked and somewhat skewed to the left.
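A back-to-back display can be sketched by splitting each score into a stem (the tens digit) and a leaf (the units digit), then printing the first data set's leaves to the left of a shared stem column and the second data set's leaves to the right. The function name and both score lists are hypothetical.

```python
from collections import defaultdict

def back_to_back(left, right):
    """Return the rows of a back-to-back stem-and-leaf display (stem = tens digit)."""
    left_leaves, right_leaves = defaultdict(list), defaultdict(list)
    for value in left:
        left_leaves[value // 10].append(value % 10)
    for value in right:
        right_leaves[value // 10].append(value % 10)

    lowest = min(min(left), min(right)) // 10
    highest = max(max(left), max(right)) // 10
    width = max(len(leaves) for leaves in left_leaves.values())

    rows = []
    for stem in range(lowest, highest + 1):
        # Left leaves read outward from the stem, so they are sorted in descending order.
        left_text = "".join(str(d) for d in sorted(left_leaves[stem], reverse=True))
        right_text = "".join(str(d) for d in sorted(right_leaves[stem]))
        rows.append(f"{left_text:>{width}} | {stem} | {right_text}".rstrip())
    return rows

# Hypothetical scores, for illustration only.
exam1 = [32, 63, 61, 65, 86, 90, 90]
exam2 = [55, 73, 81, 84, 92, 95]
display = back_to_back(exam1, exam2)
print("\n".join(display))
```

Sharing one stem column puts both distributions on the same scale, so differences in where the leaves pile up are easy to see.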

Stem-and-leaf displays are useful for detecting outliers, which are unusually large or small observations that are well separated from the remaining observations. For example, the stem-and-leaf display for exam 1 indicates that the score 32 seems unusually low. How we handle an outlier depends on its cause. If the outlier results from a measurement error or an error in recording or processing the data, it should be corrected. If such an outlier cannot be corrected, it should be discarded. If an outlier is not the result of an error in measuring or recording the data, its cause may reveal important information. For example, the outlying exam score of 32 convinced the author that the student needed a tutor. After working with a tutor, the student showed considerable improvement on exam 2. A more precise way to detect outliers is presented in Section 3.3.



统计代写|商业分析作业代写Statistical Modelling for Business代考|Constructing Frequency Distributions and Histograms

Posted on April 18, 2022 by statistics-lab


The procedure in the preceding box is not the only way to construct a histogram. Often, histograms are constructed more informally. For instance, it is not necessary to set the lower boundary of the first (leftmost) class equal to the smallest measurement in the data. As an example, suppose that we wish to form a histogram of the 50 gas mileages given in Table 1.7 (page 15). Examining the mileages, we see that the smallest mileage is 29.8 mpg and that the largest mileage is 33.3 mpg. Therefore, it would be convenient to begin the first (leftmost) class at 29.5 mpg and end the last (rightmost) class at 33.5 mpg. Further, it would be reasonable to use classes that are .5 mpg in length. We would then use 8 classes: $29.5<30$, $30<30.5$, $30.5<31$, $31<31.5$, $31.5<32$, $32<32.5$, $32.5<33$, and $33<33.5$. A histogram of the gas mileages employing these classes is shown in Figure 2.9.
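The eight informal classes can be generated from a starting value, a class length, and a class count. In this sketch the helper names `class_boundaries` and `class_index` are hypothetical; only the values 29.5, .5, and 8 come from the text.

```python
def class_boundaries(start, length, count):
    """Return count + 1 equally spaced boundaries beginning at start."""
    # Rounding guards against floating-point drift for lengths such as 0.1.
    return [round(start + i * length, 10) for i in range(count + 1)]

boundaries = class_boundaries(29.5, 0.5, 8)
classes = list(zip(boundaries, boundaries[1:]))

def class_index(value):
    """Return the index of the class with lower <= value < upper, or None."""
    for i, (lower, upper) in enumerate(classes):
        if lower <= value < upper:
            return i
    return None

print(classes)
```

With these boundaries the smallest mileage (29.8) falls in the first class and the largest (33.3) in the last, so no measurement is left unclassified.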

Sometimes it is desirable to let the nature of the problem determine the histogram classes. For example, to construct a histogram describing the ages of the residents in a city, it might be reasonable to use classes having 10-year lengths (that is, under 10 years, 10-19 years, 20-29 years, 30-39 years, and so on).

Notice that in our examples we have used classes having equal class lengths. In general, it is best to use equal class lengths whenever the raw data (that is, all the actual measurements) are available. However, sometimes histograms are formed with unequal class lengths, particularly when we are using published data as a source. Economic data and data in the social sciences are often published in the form of frequency distributions having unequal class lengths. Dealing with this kind of data is discussed in Exercises 2.26 and 2.27. Also discussed in these exercises is how to deal with open-ended classes. For example, if we are constructing a histogram describing the yearly incomes of U.S. households, an open-ended class could be households earning over \$500,000 per year.

As an alternative to constructing a frequency distribution and histogram by hand, we can use software packages such as Excel and Minitab. Each of these packages will automatically define histogram classes for the user. However, these automatically defined classes will not necessarily be the same as those that would be obtained using the manual method we have previously described. Furthermore, the packages define classes by using different methods. (Descriptions of how the classes are defined can often be found in help menus.) For example, Figure 2.10 gives a Minitab frequency histogram of the payment times in Table 2.4. Here, Minitab has defined 11 classes and has labeled 5 of the classes on the horizontal axis using midpoints (12, 16, 20, 24, 28). It is easy to see that the midpoints of the unlabeled classes are 10, 14, 18, 22, 26, and 30. Moreover, the boundaries of the first class are 9 and 11, the boundaries of the second class are 11 and 13, and so forth. Minitab counts frequencies as we have previously described. For instance, one payment time is at least 9 and less than 11, two payment times are at least 11 and less than 13, seven payment times are at least 13 and less than 15, and so forth.

Figure 2.11 gives an Excel frequency distribution and histogram of the bottle design ratings in Table 1.6 on page 14. Excel labels histogram classes using their upper class boundaries. For example, the first class has an upper class boundary equal to the smallest rating of 20 and contains only this smallest rating. The boundaries of the second class are 20 and 22, the boundaries of the third class are 22 and 24, and so forth. The last class corresponds to ratings more than 36. Excel’s method for counting frequencies differs from that of Minitab (and, therefore, also differs from the way we counted frequencies by hand in Example 2.2). Excel assigns a frequency to a particular class by counting the number of measurements that are greater than the lower boundary of the class and less than or equal to the upper boundary of the class. For example, one bottle design rating is greater than 20 and less than or equal to (that is, at most) 22. Similarly, 15 bottle design ratings are greater than 32 and at most 34.
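The two counting conventions differ only in which class boundary is included. The sketch below contrasts them on a handful of hypothetical ratings. The function names are illustrative and are not part of Minitab or Excel.

```python
def count_left_closed(values, lower, upper):
    """Count values with lower <= value < upper (the convention used by hand and by Minitab)."""
    return sum(1 for v in values if lower <= v < upper)

def count_right_closed(values, lower, upper):
    """Count values with lower < value <= upper (the convention the text describes for Excel)."""
    return sum(1 for v in values if lower < v <= upper)

# Hypothetical ratings, for illustration only.
ratings = [20, 22, 22, 24, 25]

left_20_22 = count_left_closed(ratings, 20, 22)    # counts only the 20
right_20_22 = count_right_closed(ratings, 20, 22)  # counts the two 22s
```

A measurement that sits exactly on a boundary, such as a rating of 22, lands in different classes under the two conventions, which is why the same data can yield slightly different histograms in different packages.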

In Figure $2.10$ we have used Minitab to automatically form histogram classes. It is also possible to force software packages to form histogram classes that are defined by the user. We explain how to do this in the appendices at the end of this chapter. Because Excel does not always automatically define acceptable classes, the classes in Figure $2.11$ are a modification of Excel’s automatic classes. We also explain this modification in the appendices at the end of this chapter.

统计代写|商业分析作业代写Statistical Modelling for Business代考|Some common distribution shapes

We often graph a frequency distribution in the form of a histogram in order to visualize the shape of the distribution. If we look at the histogram of payment times in Figure 2.10, we see that the right tail of the histogram is longer than the left tail. When a histogram has this general shape, we say that the distribution is skewed to the right. Here the long right tail tells us that a few of the payment times are somewhat longer than the rest. If we look at the histogram of bottle design ratings in Figure 2.11, we see that the left tail of the histogram is much longer than the right tail. When a histogram has this general shape, we say that the distribution is skewed to the left. Here the long tail to the left tells us that, while most of the bottle design ratings are concentrated above 25 or so, a few of the ratings are lower than the rest. Finally, looking at the histogram of gas mileages in Figure 2.9, we see that the right and left tails of the histogram appear to be mirror images of each other. When a histogram has this general shape, we say that the distribution is symmetrical. Moreover, the distribution of gas mileages appears to be piled up in the middle, or mound shaped.

Mound-shaped, symmetrical distributions as well as distributions that are skewed to the right or left are commonly found in practice. For example, distributions of scores on standardized tests such as the SAT and ACT tend to be mound shaped and symmetrical, whereas distributions of scores on tests in college statistics courses might be skewed to the left: a few students don’t study and get scores much lower than the rest. On the other hand, economic data such as income data are often skewed to the right: a few people have incomes much higher than most others. Many other distribution shapes are possible. For example, some distributions have two or more peaks; we will give an example of this distribution shape later in this section. It is often very useful to know the shape of a distribution. For example, knowing that the distribution of bottle design ratings is skewed to the left suggests that a few consumers may have noticed a problem with the design that others didn’t see. Further investigation into why these consumers gave the design low ratings might allow the company to improve the design.

统计代写|商业分析作业代写Statistical Modelling for Business代考|Frequency polygons

Another graphical display that can be used to depict a frequency distribution is a frequency polygon. To construct this graphic, we plot a point above each class midpoint at a height equal to the frequency of the class; the height can also be the class relative frequency or class percent frequency if so desired. Then we connect the points with line segments. As we will demonstrate in the following example, this kind of graphic can be particularly useful when we wish to compare two or more distributions.

Table 2.8 lists (in increasing order) the scores earned on the first exam by the 40 students in a business statistics course taught by one of the authors several semesters ago. Figure 2.12 gives a percent frequency polygon for these exam scores. Because exam scores are often reported by using 10-point grade ranges (for instance, 80 to 90 percent), we have defined the following classes: $30<40$, $40<50$, $50<60$, $60<70$, $70<80$, $80<90$, and $90<100$. This is an example of letting the situation determine the classes of a frequency distribution, which is common practice when the situation naturally defines classes. The points that form the polygon have been plotted corresponding to the midpoints of the classes (35, 45, 55, 65, 75, 85, 95). Each point is plotted at a height that equals the percentage of exam scores in its class. For instance, because 10 of the 40 scores are at least 90 and less than 100, the plot point corresponding to the class midpoint 95 is plotted at a height of 25 percent.
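The plot points of a percent frequency polygon are the class midpoints paired with the class percent frequencies. In this sketch only the count of 10 for the class $90<100$ and the total of 40 come from the text; the other class counts are assumed for illustration.

```python
boundaries = [30, 40, 50, 60, 70, 80, 90, 100]
# Only the last count (10 of 40) is stated in the text; the rest are assumed.
counts = [2, 1, 3, 9, 4, 11, 10]

total = sum(counts)
# Each midpoint is the average of the lower and upper class boundaries.
midpoints = [(lower + upper) / 2 for lower, upper in zip(boundaries, boundaries[1:])]
percents = [100 * c / total for c in counts]

# Connecting these points with line segments gives the frequency polygon.
polygon_points = list(zip(midpoints, percents))
print(polygon_points)
```

Because the heights are percentages rather than raw counts, polygons for data sets of different sizes can be drawn on the same axes and compared directly.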

Looking at Figure 2.12, we see that there is a concentration of scores in the 85 to 95 range and another concentration of scores around 65. In addition, the distribution of scores is somewhat skewed to the left: a few students had scores (in the 30s and 40s) that were quite a bit lower than the rest.

This is an example of a distribution having two peaks. When a distribution has multiple peaks, finding the reason for the different peaks often provides useful information. The reason for the two-peaked distribution of exam scores was that some students were not attending class regularly. Students who received scores in the 60s and below admitted that they were cutting class, whereas students who received higher scores were attending class on a regular basis.

After identifying the reason for the concentration of lower scores, the instructor established an attendance policy that forced students to attend every class: any student who missed a class was to be dropped from the course. Table 2.9 presents the scores on the second exam, given after the new attendance policy. Figure 2.13 presents (and allows us to compare) the percent frequency polygons for both exams. We see that the polygon for the second exam is single peaked: the attendance policy$^{4}$ eliminated the concentration of scores in the 60s, although the scores are still somewhat skewed to the left.



统计代写|商业分析作业代写Statistical Modelling for Business代考|Graphically Summarizing Quantitative Data

Posted on April 18, 2022 by statistics-lab


统计代写|商业分析作业代写Statistical Modelling for Business代考|Frequency distributions and histograms

Major consulting firms such as Accenture, Ernst & Young Consulting, and Deloitte & Touche Consulting employ statistical analysis to assess the effectiveness of the systems they design for their customers. In this case a consulting firm has developed an electronic billing system for a Hamilton, Ohio, trucking company. The system sends invoices electronically to each customer’s computer and allows customers to easily check and correct errors. It is hoped that the new billing system will substantially reduce the amount of time it takes customers to make payments. Typical payment times (measured from the date on an invoice to the date payment is received) using the trucking company’s old billing system had been 39 days or more. This exceeded the industry standard payment time of 30 days.
The new billing system does not automatically compute the payment time for each invoice because there is no continuing need for this information. Therefore, in order to assess the system’s effectiveness, the consulting firm selects a random sample of 65 invoices from the 7,823 invoices processed during the first three months of the new system’s operation. The payment times for the 65 sample invoices are manually determined and are given in Table $2.4$. If this sample can be used to establish that the new billing system substantially reduces payment times, the consulting firm plans to market the system to other trucking companies.

Looking at the payment times in Table 2.4, we can see that the shortest payment time is 10 days and that the longest payment time is 29 days. Beyond that, it is pretty difficult to interpret the data in any meaningful way. To better understand the sample of 65 payment times, the consulting firm will form a frequency distribution of the data and will graph the distribution by constructing a histogram. Similar to the frequency distributions for qualitative data we studied in Section 2.1, the frequency distribution will divide the payment times into classes and will tell us how many of the payment times are in each class.

统计代写|商业分析作业代写Statistical Modelling for Business代考|Form Nonoverlapping Classes of Equal Width

Step 3: Form Nonoverlapping Classes of Equal Width. We can form the classes of the frequency distribution by defining the boundaries of the classes. To find the first class boundary, we find the smallest payment time in Table 2.4, which is 10 days. This value is the lower boundary of the first class. Adding the class length of 3 to this lower boundary, we obtain $10+3=13$, which is the upper boundary of the first class and the lower boundary of the second class. Similarly, the upper boundary of the second class and the lower boundary of the third class equals $13+3=16$. Continuing in this fashion, the lower boundaries of the remaining classes are 19, 22, 25, and 28. Adding the class length 3 to the lower boundary of the last class gives us the upper boundary of the last class, 31. These boundaries define seven nonoverlapping classes for the frequency distribution. We summarize these classes in Table 2.6. For instance, the first class (10 days and less than 13 days) includes the payment times 10, 11, and 12 days; the second class (13 days and less than 16 days) includes the payment times 13, 14, and 15 days; and so forth. Notice that the largest observed payment time, 29 days, is contained in the last class. In cases where the largest measurement is not contained in the last class, we simply add another class. Generally speaking, the guidelines we have given for forming classes are not inflexible rules. Rather, they are intended to help us find reasonable classes. Finally, the method we have used for forming classes results in classes of equal length. Generally, forming classes of equal length will make it easier to appropriately interpret the frequency distribution.
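Step 3 can be sketched as a short loop: start at the smallest measurement and keep adding the class length until the largest measurement is covered. The variable names are illustrative; the values 10, 29, and 3 come from the text.

```python
smallest, largest, class_length = 10, 29, 3

# Keep adding boundaries until the last one lies above the largest measurement,
# so that the final class (lower <= value < upper) contains it.
boundaries = [smallest]
while boundaries[-1] <= largest:
    boundaries.append(boundaries[-1] + class_length)

# Consecutive boundary pairs define the nonoverlapping classes.
classes = list(zip(boundaries, boundaries[1:]))

# Whole-day payment times that fall in the first class.
first_class_days = list(range(classes[0][0], classes[0][1]))

print(boundaries)
print(classes)
```

The loop condition also covers the case mentioned in the text where the largest measurement would otherwise fall outside the last class: one more class is added automatically.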

统计代写|商业分析作业代写Statistical Modelling for Business代考|Graph the Histogram

Step 5: Graph the Histogram. We can graphically portray the distribution of payment times by drawing a histogram. The histogram can be constructed using the frequency, relative frequency, or percent frequency distribution. To set up the histogram, we draw rectangles that correspond to the classes. The base of the rectangle corresponding to a class represents the payment times in the class. The height of the rectangle can represent the class frequency, relative frequency, or percent frequency.

We have drawn a frequency histogram of the 65 payment times in Figure 2.7. The first (leftmost) rectangle, or “bar,” of the histogram represents the payment times 10, 11, and 12. Looking at Figure 2.7, we see that the base of this rectangle is drawn from the lower boundary (10) of the first class in the frequency distribution of payment times to the lower boundary (13) of the second class. The height of this rectangle tells us that the frequency of the first class is 3. The second histogram rectangle represents payment times 13, 14, and 15. Its base is drawn from the lower boundary (13) of the second class to the lower boundary (16) of the third class, and its height tells us that the frequency of the second class is 14. The other histogram bars are constructed similarly. Notice that there are no gaps between the adjacent rectangles in the histogram. Here, although the payment times have been recorded to the nearest whole day, the fact that the histogram bars touch each other emphasizes that a payment time could (in theory) be any number on the horizontal axis. In general, histograms are drawn so that adjacent bars touch each other.
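As a rough sketch of how the bar heights are obtained, the following Python snippet counts how many values fall in each class, treating every class as "lower boundary and less than upper boundary". The function name `class_frequencies` is an assumption, and the `sample` list is hypothetical illustration data, not the 65 payment times of Table 2.4, which are not reproduced here.

```python
from bisect import bisect_right


def class_frequencies(values, boundaries):
    """Count the values in each class [boundaries[i], boundaries[i + 1])."""
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        if not boundaries[0] <= v < boundaries[-1]:
            raise ValueError(f"{v} lies outside the class boundaries")
        # bisect_right returns the number of boundaries <= v,
        # so subtracting 1 gives the index of the class containing v.
        counts[bisect_right(boundaries, v) - 1] += 1
    return counts


boundaries = [10, 13, 16, 19, 22, 25, 28, 31]  # boundaries from the text
sample = [10, 12, 13, 15, 15, 16, 18, 21, 22, 29]  # hypothetical data

freqs = class_frequencies(sample, boundaries)
for lower, upper, f in zip(boundaries[:-1], boundaries[1:], freqs):
    # A crude text histogram: one '#' per observation in the class.
    print(f"{lower:>2} to <{upper:<2} | {'#' * f}")
```

Because a value equal to a boundary is always assigned to the class that starts at that boundary, the classes do not overlap.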

Looking at the frequency distribution in Table $2.7$ and the frequency histogram in Figure 2.7, we can describe the payment times:
1. None of the payment times exceeds the industry standard of 30 days. (Actually, all of the payment times are less than 30; remember that the largest payment time is 29 days.)
2. The payment times are concentrated between 13 and 24 days (57 of the 65, or $(57 / 65) \times 100=87.69 \%$, of the payment times are in this range).



统计代写|商业分析作业代写Statistical Modelling for Business代考|Graphically Summarizing

Posted on April 18, 2022 by statistics-lab


统计代写|商业分析作业代写Statistical Modelling for Business代考|Studying Pizza Preferences by Using

Part 1: Studying Pizza Preferences by Using a Frequency Distribution. Unfortunately, the raw data in Table $2.1$ do not reveal much useful information about the pattern of pizza preferences. In order to summarize the data in a more useful way, we can construct a frequency distribution. To do this we simply count the number of times each of the six pizza restaurants appears in Table 2.1. We find that Bruno’s appears 8 times, Domino’s appears 2 times, Little Caesars appears 9 times, Papa John’s appears 19 times, Pizza Hut appears 4 times, and Will’s Uptown Pizza appears 8 times. The frequency distribution for the pizza preferences is given in Table $2.2$, which lists each of the six restaurants along with its corresponding count (or frequency). The frequency distribution shows us how the preferences are distributed among the six restaurants. The purpose of the frequency distribution is to make the data easier to understand. Certainly, looking at the frequency distribution in Table $2.2$ is more informative than looking at the raw data in Table 2.1. We see that Papa John’s is the most popular restaurant, and that Papa John’s is roughly twice as popular as each of the next three runners-up: Bruno’s, Little Caesars, and Will’s. Finally, Pizza Hut and Domino’s are the least preferred restaurants.
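The counting step can be illustrated with Python's `collections.Counter`. The raw responses of Table 2.1 are not reproduced here, so the sketch below rebuilds a stand-in list from the counts stated in the text (the order of the individual responses is therefore an assumption), and then tallies it the way one would tally the real raw data.

```python
from collections import Counter

# Counts stated in the text; used only to rebuild a stand-in for Table 2.1.
stated_counts = {
    "Bruno's": 8,
    "Domino's": 2,
    "Little Caesars": 9,
    "Papa John's": 19,
    "Pizza Hut": 4,
    "Will's Uptown Pizza": 8,
}
responses = [name for name, k in stated_counts.items() for _ in range(k)]

# The frequency distribution: how often each restaurant appears.
frequency = Counter(responses)

for restaurant, count in frequency.most_common():
    print(f"{restaurant:<20} {count:>2}")
```

Sorting with `most_common()` puts the most popular restaurant first, which makes the comparison between Papa John’s and the runners-up easy to read off.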

When we wish to summarize the proportion (or fraction) of items in each class, we employ the relative frequency for each class. If the data set consists of $n$ observations, we define the relative frequency of a class as follows:
Relative frequency of a class $=\frac{\text { frequency of the class }}{n}$
This quantity is simply the fraction of items in the class. Further, we can obtain the percent frequency of a class by multiplying the relative frequency by 100.

Table $2.3$ gives a relative frequency distribution and a percent frequency distribution of the pizza preference data. A relative frequency distribution is a table that lists the relative frequency for each class, and a percent frequency distribution lists the percent frequency for each class. Looking at Table $2.3$, we see that the relative frequency for Bruno’s pizza is $8 / 50=.16$ and that (from the percent frequency distribution) $16 \%$ of the sampled students preferred Bruno’s pizza. Similarly, the relative frequency for Papa John’s pizza is $19 / 50=.38$ and $38 \%$ of the sampled students preferred Papa John’s pizza. Finally, the sum of the relative frequencies in the relative frequency distribution equals $1.0$, and the sum of the percent frequencies in the percent frequency distribution equals $100 \%$. These facts are true for all relative frequency and percent frequency distributions.
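A short Python sketch of the relative and percent frequency calculations, using the six counts stated earlier in the text (the variable names are assumptions chosen for this illustration):

```python
import math

# Class frequencies stated in the text for the pizza preference data.
frequencies = {
    "Bruno's": 8,
    "Domino's": 2,
    "Little Caesars": 9,
    "Papa John's": 19,
    "Pizza Hut": 4,
    "Will's Uptown Pizza": 8,
}

n = sum(frequencies.values())  # number of observations

# Relative frequency = class frequency / n; percent frequency = 100 times that.
relative = {name: f / n for name, f in frequencies.items()}
percent = {name: 100 * f / n for name, f in frequencies.items()}

for name in frequencies:
    print(f"{name:<20} {relative[name]:.2f} {percent[name]:5.1f}%")

# The relative frequencies sum to 1 (up to floating-point rounding).
print(math.isclose(sum(relative.values()), 1.0))
```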

统计代写|商业分析作业代写Statistical Modelling for Business代考|Studying Pizza Preferences

Part 2: Studying Pizza Preferences by Using Bar Charts and Pie Charts. A bar chart is a graphic that depicts a frequency, relative frequency, or percent frequency distribution. For example, Figure $2.1$ gives an Excel bar chart of the pizza preference data. On the horizontal axis we have placed a label for each class (restaurant), while the vertical axis measures frequencies. To construct the bar chart, Excel draws a bar (of fixed width) corresponding to each class label.

Each bar is drawn so that its height equals the frequency corresponding to its label. Because the height of each bar is a frequency, we refer to Figure $2.1$ as a frequency bar chart. Notice that there are gaps between the bars. When data are qualitative, the bars should always be separated by gaps in order to indicate that each class is separate from the others. The bar chart in Figure $2.1$ clearly illustrates that, for example, Papa John’s pizza is preferred by more sampled students than any other restaurant’s pizza and Domino’s pizza is least preferred by the sampled students.

If desired, the bar heights can represent relative frequencies or percent frequencies. For instance, Figure $2.2$ is a Minitab percent bar chart for the pizza preference data. Here the heights of the bars are the percentages given in the percent frequency distribution of Table 2.3. Lastly, the bars in Figures $2.1$ and $2.2$ have been positioned vertically. Because of this, these bar charts are called vertical bar charts. However, sometimes bar charts are constructed with horizontal bars and are called horizontal bar charts.

A pie chart is another graphic that can be used to depict a frequency distribution. When constructing a pie chart, we first draw a circle to represent the entire data set. We then divide the circle into sectors or “pie slices” based on the relative frequencies of the classes. For example, remembering that a circle consists of 360 degrees, Bruno’s Pizza (which has relative frequency .16) is assigned a pie slice that consists of $.16(360)=57.6$ degrees. Similarly, Papa John’s Pizza (with relative frequency .38) is assigned a pie slice having $.38(360)=136.8$ degrees. The resulting pie chart (constructed using Excel) is shown in Figure $2.3$. Here we have labeled the pie slices using the percent frequencies. The pie slices can also be labeled using frequencies or relative frequencies.
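The slice angles can be checked with a small Python sketch. It only computes the angle for each class from the counts stated in the text; it does not draw the chart, and the variable names are assumptions.

```python
# Class frequencies stated in the text for the pizza preference data.
frequencies = {
    "Bruno's": 8,
    "Domino's": 2,
    "Little Caesars": 9,
    "Papa John's": 19,
    "Pizza Hut": 4,
    "Will's Uptown Pizza": 8,
}
n = sum(frequencies.values())

# Slice angle = relative frequency x 360 degrees.
angles = {name: 360 * f / n for name, f in frequencies.items()}

for name, angle in angles.items():
    print(f"{name:<20} {angle:6.1f} degrees")
```

Since the relative frequencies sum to 1, the slice angles add up to the full 360 degrees of the circle.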

统计代写|商业分析作业代写Statistical Modelling for Business代考|The Pareto chart (Optional)

Pareto charts are used to help identify important quality problems and opportunities for process improvement. By using these charts we can prioritize problem-solving activities. The Pareto chart is named for Vilfredo Pareto (1848-1923), an Italian economist. Pareto suggested that, in many economies, most of the wealth is held by a small minority of the population. It has been found that the “Pareto principle” often applies to defects. That is, only a few defect types account for most of a product’s quality problems.

To illustrate the use of Pareto charts, suppose that a jelly producer wishes to evaluate the labels being placed on 16-ounce jars of grape jelly. Every day for two weeks, all defective labels found on inspection are classified by type of defect. If a label has more than one defect, the type of defect that is most noticeable is recorded. The Excel output in Figure $2.4$ presents the frequencies and percentages of the types of defects observed over the two-week period.
In general, the first step in setting up a Pareto chart summarizing data concerning types of defects (or categories) is to construct a frequency table like the one in Figure 2.4. Defects or categories should be listed at the left of the table in decreasing order by frequencies: the defect with the highest frequency will be at the top of the table, the defect with the second-highest frequency below the first, and so forth. If an “other” category is employed, it should be placed at the bottom of the table. The “other” category should not make up 50 percent or more of the total of the frequencies, and the frequency for the “other” category should not exceed the frequency for the defect at the top of the table. If the frequency for the “other” category is too high, data should be collected so that the “other” category can be broken down into new categories. Once the frequency and the percentage for each category are determined, a cumulative percentage for each category is computed. As illustrated in Figure $2.4$, the cumulative percentage for a particular category is the sum of the percentages corresponding to the particular category and the categories that are above that category in the table.
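The table-building steps above can be sketched as follows. The function name `pareto_table` is an assumption, and the defect names and counts are hypothetical illustration data, not the values shown in Figure 2.4.

```python
def pareto_table(counts, other_label="Other"):
    """Return rows of (category, frequency, percent, cumulative percent),
    sorted by decreasing frequency with the 'other' category placed last."""
    total = sum(counts.values())
    named = sorted(
        ((c, f) for c, f in counts.items() if c != other_label),
        key=lambda item: item[1],
        reverse=True,
    )
    ordered = named + ([(other_label, counts[other_label])] if other_label in counts else [])

    rows, running = [], 0
    for category, freq in ordered:
        running += freq  # accumulate counts, then convert to a percentage
        rows.append((category, freq, 100 * freq / total, 100 * running / total))
    return rows


# Hypothetical defect counts, for illustration only.
defects = {"Torn": 10, "Crooked": 40, "Other": 10, "Smudged": 15, "Missing": 25}

rows = pareto_table(defects)
for category, freq, pct, cum_pct in rows:
    print(f"{category:<10} {freq:>3} {pct:6.1f}% {cum_pct:6.1f}%")
```

Accumulating the raw counts and converting to a percentage at each row, rather than summing rounded percentages, keeps the final cumulative value at exactly 100 percent.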

