The statistical analyst’s goal should be to present the most accurate and truthful portrayal of a data set that is possible. Such a presentation allows managers using the analysis to make informed decisions. However, it is possible to construct statistical summaries that are misleading. Although we do not advocate using misleading statistics, you should be aware of some of the ways statistical graphs and charts can be manipulated in order to distort the truth. By knowing what to look for, you can avoid being misled by a (we hope) small number of unscrupulous practitioners.

As an example, suppose that the nurses at a large hospital will soon vote on a proposal to join a union. Both the union organizers and the hospital administration plan to distribute recent salary statistics to the entire nursing staff. Suppose that the mean nurses’ salary at the hospital and the mean nurses’ salary increase at the hospital (expressed as a percentage) for each of the last four years are as follows:

The hospital administration does not want the nurses to unionize and, therefore, hopes to convince the nurses that substantial progress has been made to increase salaries without a union. On the other hand, the union organizers wish to portray the salary increases as minimal so that the nurses will feel the need to unionize.

Figure $2.29$ gives two bar charts of the mean nurses’ salaries at the hospital for each of the last four years. Notice that in Figure $2.29$ (a) the administration has started the vertical scale of the bar chart at a salary of $\$58,000$ by using a scale break. Alternatively, the chart could be set up without the scale break by simply starting the vertical scale at $\$58,000$. Starting the vertical scale at a value far above zero makes the salary increases look more dramatic. Notice that when the union organizers present the bar chart in Figure $2.29$ (b), which has a vertical scale starting at zero, the salary increases look far less impressive.
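
The effect of a raised baseline can be quantified with a little arithmetic. The sketch below compares how tall the last bar is drawn relative to the first bar under two baselines. The two salary figures are hypothetical (the table of actual salaries is not reproduced here), and `drawn_height_ratio` is an illustrative helper name, not part of any library.

```python
def drawn_height_ratio(first, last, baseline=0.0):
    """Ratio of the last bar's drawn height to the first bar's drawn height
    when the vertical axis starts at `baseline`."""
    if baseline >= min(first, last):
        raise ValueError("baseline must lie below both values")
    return (last - baseline) / (first - baseline)


# Hypothetical mean salaries for the first and fourth years (assumed values).
first_year, fourth_year = 60_000, 66_000

from_zero = drawn_height_ratio(first_year, fourth_year)
truncated = drawn_height_ratio(first_year, fourth_year, baseline=58_000)

print(f"axis starting at 0:      last bar is {from_zero:.2f}x the first")
print(f"axis starting at 58,000: last bar is {truncated:.2f}x the first")
```

With these assumed numbers, a 10 percent rise in salary is drawn as a bar four times as tall once the axis starts at 58,000.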

Figure $2.30$ presents two bar charts of the mean nurses’ salary increases (in percentages) at the hospital for each of the last four years. In Figure $2.30$ (a), the administration has made the widths of the bars representing the percentage increases proportional to their heights. This makes the upward movement in the mean salary increases look more dramatic because the observer’s eye tends to compare the areas of the bars, while the improvements in the mean salary increases are really only proportional to the heights of the bars. When the union organizers present the bar chart of Figure $2.30$ (b), the improvements in the mean salary increases look less impressive because each bar has the same width.
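
The distortion from widening the bars follows from simple geometry: if a bar's width grows in proportion to its height, its area grows with the square of the height. A minimal sketch, using hypothetical percentage increases rather than the hospital's actual figures:

```python
def apparent_ratios(small, large):
    """Return (height_ratio, area_ratio) for two bars whose widths are
    drawn proportional to their heights."""
    height_ratio = large / small
    area_ratio = height_ratio ** 2  # width scales with height, so area scales with its square
    return height_ratio, area_ratio


# Hypothetical salary increases of 2% and 4% (assumed values).
height_ratio, area_ratio = apparent_ratios(2.0, 4.0)
print(f"heights differ by a factor of {height_ratio:.1f}")
print(f"areas differ by a factor of {area_ratio:.1f}")
```

Here the increase doubles, but the area the eye compares is four times as large.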

Figure $2.31$ gives two time series plots of the mean nurses’ salary increases at the hospital for the last four years. In Figure $2.31$ (a) the administration has stretched the vertical axis of the graph. That is, the vertical axis is set up so that the distances between the percentages are large. This makes the upward trend of the mean salary increases appear to be steep. In Figure $2.31$ (b) the union organizers have compressed the vertical axis (that is, the distances between the percentages are small). This makes the upward trend of the mean salary increases appear to be gradual. As we will see in the exercises, stretching and compressing the horizontal axis in a time series plot can also greatly affect the impression given by the plot.
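
To see why axis scaling changes the apparent steepness, consider the angle at which a line segment is actually drawn. The sketch below assumes a hypothetical trend of one percentage point per year and hypothetical pixel scales for each axis; none of these numbers come from the figures.

```python
import math


def plotted_angle(rise, run, px_per_unit_y, px_per_unit_x):
    """Angle in degrees of a line segment as drawn, given the data change
    (rise over run) and the pixels allotted to one unit on each axis."""
    return math.degrees(math.atan2(rise * px_per_unit_y, run * px_per_unit_x))


# Same data in both cases: +1 percentage point over 1 year (assumed values).
stretched = plotted_angle(1.0, 1.0, px_per_unit_y=100, px_per_unit_x=50)
compressed = plotted_angle(1.0, 1.0, px_per_unit_y=10, px_per_unit_x=50)

print(f"stretched vertical axis:  {stretched:.1f} degrees")   # about 63.4
print(f"compressed vertical axis: {compressed:.1f} degrees")  # about 11.3
```

Changing `px_per_unit_x` instead has the mirror-image effect, which is why stretching or compressing the horizontal axis also alters the impression.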

It is also possible to create totally different interpretations of the same statistical summary by simply using different labeling or captions. For example, consider the bar chart of mean nurses’ salary increases in Figure 2.30(b). To create a favorable interpretation, the hospital administration might use the caption “Salary Increase Is Higher for the Fourth Year in a Row.” On the other hand, the union organizers might create a negative impression by using the caption “Salary Increase Fails to Reach $10 \%$ for Fourth Straight Year.”

In summary, it is important to carefully study any statistical summary so that you will not be misled. Look for manipulations such as stretched or compressed axes on graphs, axes that do not begin at zero, bar charts with bars of varying widths, and biased captions. Doing these things will help you to see the truth and to make well-informed decisions.

In Section $1.5$ we said that descriptive analytics uses traditional and more recently developed graphics to present to executives (and sometimes customers) easy-to-understand visual summaries of up-to-the-minute information concerning the operational status of a business. In this section we will discuss some of the more recently developed graphics used by descriptive analytics, which include gauges, bullet graphs, treemaps, and sparklines. In addition, we will see how they are used with each other and with more traditional graphics to form analytic dashboards, which are part of executive information systems. We will also briefly discuss data discovery, which involves, in its simplest form, data drill down.

Dashboards and gauges An analytic dashboard provides a graphical presentation of the current status and historical trends of a business’s key performance indicators. The term dashboard originates from the automotive dashboard, which helps a driver monitor a car’s key functions. Figure $2.35$ shows a dashboard that graphically portrays some key performance indicators for a (fictitious) airline for last year. In the lower left-hand portion of the dashboard we see three gauges depicting the percentage utilizations of the regional, short-haul, and international fleets of the airline. In general, a gauge (chart) allows us to visualize data in a way that is similar to a real-life speedometer needle on an automobile. The outer scale of the gauge is often color coded to provide additional performance information. For example, note that the colors on the outer scale of the gauges in Figure $2.35$ range from red to dark green to light green. These colors signify percentage fleet utilizations that range from poor (less than 75 percent) to satisfactory (less than 90 percent but at least 75 percent) to good (at least 90 percent).
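
The color coding on such a gauge amounts to a simple threshold rule. A minimal sketch using the cutoffs and colors described above; the function name and the sample readings are illustrative assumptions.

```python
def utilization_band(percent):
    """Map a fleet utilization percentage to its (rating, color) band:
    below 75 is poor, 75 up to (but not including) 90 is satisfactory,
    and 90 or above is good."""
    if not 0 <= percent <= 100:
        raise ValueError("utilization must be between 0 and 100 percent")
    if percent < 75:
        return "poor", "red"
    if percent < 90:
        return "satisfactory", "dark green"
    return "good", "light green"


# Hypothetical gauge readings (assumed values).
for reading in (62, 89, 94):
    rating, color = utilization_band(reading)
    print(f"{reading}% -> {rating} ({color})")
```
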
Bullet graphs While gauge charts are nice looking, they take up considerable space and to some extent are cluttered. For example, recalling the Disney Parks Case, using seven gauge charts for the seven Epcot rides would take up considerable space and would be less efficient than using the bullet graph that we originally showed in Figure 1.8. That bullet graph (which has been obtained using Excel) is shown again in Figure 2.36.

In general, a bullet graph features a single measure (for example, predicted waiting time) and displays it as a horizontal bar (or a vertical bar) that extends into ranges representing qualitative measures of performance, such as poor, satisfactory, and good. The ranges are displayed as either different colors or as varying intensities of a single hue. Using a single hue makes them discernible by those who are color blind and restricts the use of color if the bullet graph is part of a dashboard that we don’t want to look too “busy.” Many bullet graphs compare the single primary measure to a target, or objective, which is represented by a symbol on the bullet graph. The bullet graph of Disney’s predicted waiting times uses five colors ranging from dark green to red and signifying short (0 to 20 minutes) to very long (80 to 100 minutes) predicted waiting times. This bullet graph does not compare the predicted waiting times to an objective. However, the bullet graphs located in the upper left of the dashboard in Figure $2.35$ (representing the percentages of on-time arrivals and departures for the airline) do display objectives represented by short vertical black lines. For example, consider the bullet graphs representing the percentages of on-time arrivals and departures in the Midwest, which are shown below.
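
As a rough, text-only illustration of the anatomy just described (this is not a reproduction of the figure), the sketch below draws a bullet graph on one line: `#` is the measure bar, the characters `.:-=+` stand in for five qualitative ranges, and `|` marks the objective. The five equal 20-unit ranges, the shading characters, and the sample values are all assumptions made for the example.

```python
def text_bullet(value, target=None, band_edges=(20, 40, 60, 80, 100),
                shades=".:-=+", width=50):
    """Render a one-line text bullet graph on a scale from 0 to band_edges[-1]."""
    top = band_edges[-1]
    if not 0 <= value <= top:
        raise ValueError("value lies outside the scale")
    if len(shades) != len(band_edges):
        raise ValueError("need exactly one shade per range")
    cells = []
    for i in range(width):
        x = (i + 0.5) * top / width  # scale value at the centre of column i
        band = next(k for k, edge in enumerate(band_edges) if x <= edge)
        # The measure bar overrides the background shading up to `value`.
        cells.append("#" if x <= value else shades[band])
    if target is not None:
        # Overwrite one column with the objective marker.
        cells[min(width - 1, int(target * width / top))] = "|"
    return "".join(cells)


# Hypothetical readings: a 30-minute wait with no objective, and an
# 86 percent on-time rate measured against a 90 percent objective.
print(text_bullet(30))
print(text_bullet(86, target=90))
```

The second line shows the bar stopping just short of the `|` marker, which is how a bullet graph signals that the measure has not reached its objective.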

Sparklines Figure $2.38$ shows sparklines depicting the monthly closing prices of five stocks over six months. In general, a sparkline, another of the new descriptive analytics graphics, is a line chart that presents the general shape of the variation (usually over time) in some measurement, such as temperature or a stock price. A sparkline is typically drawn without axes or coordinates and made small enough to be embedded in text. Sparklines are often grouped together so that comparative variations can be seen. Whereas most charts are designed to show considerable information and are set off from the main text, sparklines are intended to be succinct and located where they are discussed. Therefore, if the sparklines in Figure $2.38$ were located in the text of a report comparing stocks, they would probably be made smaller than shown in Figure $2.38$, and the actual monthly closing prices of the stocks used to obtain the sparklines would probably not be shown next to the sparklines.
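
Because a sparkline carries no axes, it can be approximated with nothing more than Unicode block characters. The following is a minimal sketch under that assumption; the six closing prices are hypothetical, not the data behind the figure.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight block characters of increasing height


def sparkline(values):
    """Return a one-line sparkline, scaling the values to the eight block heights."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BLOCKS[0] * len(values)  # flat series: all bars at the same height
    scale = (len(BLOCKS) - 1) / (hi - lo)
    return "".join(BLOCKS[int((v - lo) * scale + 0.5)] for v in values)


# Hypothetical monthly closing prices for one stock over six months.
prices = [24.1, 25.3, 23.8, 26.0, 27.4, 26.9]
print("Stock A", sparkline(prices))
```

Each series is scaled to its own minimum and maximum, so sparklines built this way compare shapes of variation rather than absolute levels.
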
Data discovery To conclude this section, we briefly discuss data discovery methods, which allow decision makers to interactively view data and make preliminary analyses. One simple version of data discovery is data drill down, which reveals more detailed data that underlie a higher-level summary. For example, an online presentation of the airline dashboard in Figure $2.35$ might allow airline executives to drill down by clicking the gauge which shows that approximately 89 percent of the airline’s international fleet was utilized over the year to see the bar chart in Figure 2.39. This bar chart depicts the quarterly percentage utilizations of the airline’s international fleet over the year. The bar chart suggests that the airline might offer international travel specials in quarter 1 (Winter) and quarter 4 (Fall) to increase international travel and thus increase the percentage of the international fleet utilized in those quarters. Drill down can be done at multiple levels. For example, Disney executives might wish to have weekly reports of total merchandise sales at its four Orlando parks, which can be drilled down to reveal total merchandise sales at each park, which can be further drilled down to reveal total merchandise sales at each “land” in each park, which can be yet further drilled down to reveal total merchandise sales in each store in each land in each park.
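
Multi-level drill down can be pictured as walking down a nested summary, totalling everything beneath each entry at the current level. The sketch below assumes a park, land, store hierarchy stored as nested dictionaries; the names and sales figures are invented for illustration.

```python
def total(node):
    """Sum every leaf value beneath a node of the hierarchy."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


def drill_down(node, *path):
    """Follow `path` into the hierarchy and summarise the level just below it."""
    for key in path:
        node = node[key]
    if not isinstance(node, dict):
        return node  # already at the most detailed level
    return {name: total(child) for name, child in node.items()}


# Hypothetical weekly merchandise sales (assumed names and numbers).
sales = {
    "Park A": {"Land 1": {"Store x": 120, "Store y": 80},
               "Land 2": {"Store z": 50}},
    "Park B": {"Land 3": {"Store w": 200}},
}

print(total(sales))                            # overall total
print(drill_down(sales))                       # totals by park
print(drill_down(sales, "Park A"))             # totals by land within Park A
print(drill_down(sales, "Park A", "Land 1"))   # totals by store within Land 1
```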
