## 统计代写|数据可视化代写Data visualization代考|INFO2001

statistics-lab™ 为您的留学生涯保驾护航 在代写数据可视化Data visualization方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写数据可视化Data visualization代写方面经验极为丰富，各种代写数据可视化Data visualization相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## Another Asymmetry

There is still one more small, but nagging, problem with this description of Galton’s development of regression and the idea of correlation. In Figure 6.13, which shows Galton’s sweet pea data, we were careful to plot the size of child seeds on the vertical $y$ axis against that of their parent seeds on the horizontal $x$ axis, as is the modern custom for a scatterplot, whose goal is to show how $y$ depends on, or varies with, $x$. Modern statistical methods that flow from Galton and Pearson are all about directional relationships, and they try to predict $y$ from $x$, not vice-versa. It makes sense to ask how a child’s height is related to that of its parents, but it stretches the imagination to go in the reverse direction and contemplate how a child’s height might influence that of its parents.

So, why didn’t Galton put child height on the $y$ axis and parent height on the $x$ axis in Figure 6.16, as one would do today? One suggestion is that such graphs were in their infancy, so the convention of plotting the outcome variable on the ordinate had not yet been established. Yet in Playfair’s time-series graphs (Plate 10) and in all other not-quite-scatterplots such as Halley’s (Figure 6.2), the outcome variable was always shown on the $y$ axis.

The answer is surely that Galton’s Figure 6.16 started out as a table, listing mid-parent heights in the rows and heights of children in the columns. Parent height was the first grouping variable, and he tallied the heights of their children in the columns.

In a table, the rows are typically displayed in increasing order (of $y$) from top to bottom; a plot does the reverse, showing increasing values of $y$ from bottom to top. Hence, it seems clear that Galton constructed his Table I (Figure 6.14) and figures based on it (Figure 6.15 and Figure 6.16) as if he thought of them as plots.

## Some Remarkable Scatterplots

As Galton’s work shows, scatterplots had advantages over earlier graphic forms: the ability to see clusters, patterns, trends, and relations in a cloud of points. Perhaps most importantly, the scatterplot allowed the addition of visual annotations (point symbols, lines, curves, enclosing contours, etc.) to make those relationships more coherent and tell more nuanced stories. This 2D form of the scatterplot allows these higher-level visual explanations to be placed firmly in the foreground. John Tukey later expressed this as, “The greatest value of a picture is when it forces us to notice what we never expected to see” (1977, p. vi).

In the first half of the twentieth century, data graphics entered the mainstream of science, and the scatterplot soon became an important tool in new discoveries. Two short examples must serve to illustrate applications in physical science and economics.

One key feature was the idea that discovery of something interesting could come from the perception, and understanding, of classifications of objects based on clusters, groupings, and patterns of similarity, rather than direct relations, linear or nonlinear. Observations shown in a scatterplot could belong to different groups, revealing other laws. The most famous example concerns the Hertzsprung-Russell (HR) diagram, which revolutionized astrophysics.

The original version of the Hertzsprung-Russell diagram, shown here in Figure 6.17, is not a graph of great beauty, but nonetheless it radically changed thinking in astrophysics by showing that scatterplots of measurements of stars could lead to a new understanding of stellar evolution.

Astronomers had long noted that stars varied, not only in brightness (luminosity), but also in color, from blue-white to orange, yellow, and red. But until the early 1900s, they had no general way to classify them or interpret variations in color. In 1905, the Danish astronomer Ejnar Hertzsprung presented tables of luminosity and star color. He noted some apparent correlations and trends, but the big picture (an interpretable classification, leading to theory) was lacking, probably because his data were displayed in tables.

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## Francis Galton and the Idea of Correlation

Francis Galton [1822-1911] was among the first to show a purely empirical bivariate relation in graphical form using actual data with his work on questions of heritability of traits. He began with plots showing the relationship between physical characteristics of people (head circumference and height) or between parents and their offspring, as a means to study the association and inheritance of traits: Do tall people have larger heads than average? Do tall parents have taller than average children?

Inspecting and calculating from his graphs, he discovered a phenomenon he called “regression toward the mean,” and his work on these problems can be considered the foundation of modern statistical methods. His insight from these diagrams led to much more: the ideas of correlation and linear regression; the bivariate normal distribution; and eventually nearly all of classical statistical linear models (analysis of variance, multiple regression, etc.).

The earliest known example is a chart of head circumference compared to stature from Galton’s notebook (circa 1874) “Special Peculiarities,” shown in Figure 6.11.$^{16}$ In this hand-drawn chart, the intervals of height are shown horizontally against head circumference vertically. The entries in the body are the tallies in the pairs of class intervals. Galton included the total counts of height and head circumference in the right and bottom margins and drew smooth curves to represent their frequency distributions. The conceptual origin of this chart as a table rather than a graph can be seen in the fact that the smallest values of the two variables are shown at the top left (first row and first column), rather than at the bottom left, as would be more natural in a graph.

One may argue that Galton’s graphic representations of bivariate relations were both less and more than true scatterplots of data, as these are used today. They are less because at first glance they look like little more than tables with some graphic annotations. They are more because he used these as devices to calculate and reason with.$^{17}$ He did this because the line of regression he sought was initially defined as the trace of the mean of the vertical variable $y$ as the horizontal variable $x$ varied$^{18}$ (what we now think of as the conditional mean function, $\mathcal{E}(y \mid x)$), and so required grouping at least the $x$ variable into class intervals. Galton’s displays of these data were essentially graphic transcriptions of these tables, using count-symbols (/, //, ///, ...) or numbers to represent the frequency in each cell. These are what the Princeton polymath John Tukey called “semi-graphic displays” in 1972, making them a visual chimera: part table, part plot.
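The idea of a regression line as the trace of the mean of $y$ within class intervals of $x$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `conditional_means`, the interval width, and the height values are all invented here and are not Galton's data or procedure.

```python
from collections import defaultdict
from statistics import mean

def conditional_means(pairs, width):
    """Group x into class intervals of the given width and return
    {interval midpoint: mean of y in that interval}, ordered by midpoint."""
    groups = defaultdict(list)
    for x, y in pairs:
        lower = (x // width) * width          # left edge of the class interval
        groups[lower + width / 2].append(y)   # key each group by its midpoint
    return {mid: mean(ys) for mid, ys in sorted(groups.items())}

# Hypothetical (parent, child) heights in inches, invented for illustration.
heights = [(64.2, 65.8), (64.9, 66.1), (66.3, 66.9), (66.8, 67.5),
           (68.1, 68.0), (68.7, 68.4), (70.4, 69.1), (70.9, 69.5)]

trace = conditional_means(heights, width=2.0)
for mid, avg in trace.items():
    print(f"x interval centered at {mid:.1f}: mean y = {avg:.2f}")
```

Connecting the printed points in order of the interval midpoints gives an empirical estimate of $\mathcal{E}(y \mid x)$; in these made-up numbers the means rise more slowly than the midpoints, which is the pattern described as regression toward the mean.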

## Galton’s Elliptical Insight

Galton’s next step on the problem of filial correlation and regression turned out to be one of the most important in the history of statistics. In 1886, he published a paper titled “Regression Towards Mediocrity in Hereditary Stature” containing the table shown in Figure 6.14. The table records the frequencies of the heights of 928 adult children born to 205 pairs of fathers and mothers, classified by the average height of their father and mother (“mid-parent” height).$^{22}$

If you look at this table, you may see only a table of numbers with larger values in the middle and some dashes (meaning 0) in the upper-left and bottom-right corners. But for Galton, it was something he could compute with, both in his mind’s eye and on paper.

I found it hard at first to catch the full significance of the entries in the table, which had curious relations that were very interesting to investigate. They came out distinctly when I “smoothed” the entries by writing at each intersection of a horizontal column with a vertical one, the sum of the entries in the four adjacent squares, and using these to work upon. (Galton, 1886, p. 254)

Consequently, Galton first smoothed the numbers in this table, which he did by the simple step of summing (or averaging) each set of four adjacent cells. We can imagine that he wrote that average number larger in red ink, exactly at the intersection of these four cells. When he had completed this task, we can imagine him standing above the table with a different pen and trying to connect the dots, that is, to draw curves joining the points of approximately equal frequency. We tried to reproduce these steps in Figure 6.15, except that we did the last step mechanically, using a computer algorithm, whereas Galton probably did it by eye and brain, in the manner of Herschel, with the aim that the resulting curves should be gracefully smooth.
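The smoothing step described above, summing the four cells that meet at each interior intersection, can be sketched as follows. The function name `smooth_four_cells` and the small frequency table are assumptions made for illustration; the table is not Galton's Table I.

```python
def smooth_four_cells(table):
    """Return a grid one row and one column smaller than `table`, where each
    entry is the sum of the four cells meeting at that interior intersection."""
    rows, cols = len(table), len(table[0])
    return [[table[i][j] + table[i][j + 1] + table[i + 1][j] + table[i + 1][j + 1]
             for j in range(cols - 1)]
            for i in range(rows - 1)]

# Hypothetical frequency table, invented for illustration (not Galton's data).
counts = [
    [0, 1, 2, 1],
    [1, 4, 6, 2],
    [2, 6, 4, 1],
    [1, 2, 1, 0],
]

smoothed = smooth_four_cells(counts)
for row in smoothed:
    print(row)
```

Dividing each entry by 4 would give the averaged version. Curves of approximately equal frequency could then be traced on the smoothed grid, by eye as Galton probably did or with a contouring routine.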

## John Herschel and the Orbits of Twin Stars

In the hundred years from 1750 to 1850, during which most of the modern graphic forms were invented, fundamentally important problems of measurement attracted the best mathematical minds, including Euler, Laplace, Legendre, Newton, and Gauss, and led to the inventions of calculus, least squares, curve fitting, and interpolation.$^8$ In these scientific and mathematical domains, graphs had begun to play an increasing role in the explanation of scientific phenomena, as we described earlier in the case of Johann Lambert.

Among this work, we find the remarkable paper of Sir John Frederick W. Herschel [1792-1871], On the Investigation of the Orbits of Revolving Double Stars, which he read to the Royal Astronomical Society on January 13, 1832, and published the next year. Double stars had long played a particularly important role in astrophysics because they provided the best means to measure stellar masses and sizes, and this paper was prepared as an addendum to another 1833 paper, in which Herschel had meticulously cataloged observations on the orbits of 364 double stars.

The printed paper refers to four figures, presented at the meeting. Alas, the version printed in the Memoirs of the Royal Astronomical Society did not include them, presumably owing to the cost of engraving. Herschel noted, “The original charts and figures which accompanied the paper being all deposited with the Society.” These might have been lost to historians, but Thomas Hankins discovered copies of them in research for his insightful 2006 paper on Herschel’s graphical method.

To see why Herschel’s paper is remarkable, we must follow his exposition of the goals, the construction of scatterplots, and the idea of visual smoothing to produce a more satisfactory solution for parameters of the orbits of double stars than were available by analytic methods.

## Herschel’s Graphical Impact Factor

The critical reader may object, thinking that Herschel’s graphical method, as ingenious as it might be, did not produce true scatterplots in the modern sense because the horizontal axis in Figure 6.9 is time rather than a separate variable. Thus one might argue that all we have is another time-series graph, so priority really belongs to Playfair, or further back, to Lambert, who stated the essential ideas. On the surface this is true.

But it’s only true on the surface. We argue that a close and appreciative reading of Herschel’s description of his graphical method can, at the very least, be considered a true innovation in visual thinking, worthy of note in the present account. More importantly, Herschel’s true objective was to calculate the parameters of the orbits of twin stars based on the relation between position angle and separation distance; the use of time appears in the graph as a proxy or indirect means to overcome the scant observations and perhaps extravagant errors in the data on separation distance.

Yet Herschel’s graphical development of calculation based on a scatterplot attracted little attention outside the field of astronomy, where his results were widely hailed as groundbreaking in the Royal Astronomical Society. But this notice was for his scientific achievement rather than for his contribution of a graphical method, which scientists probably rightly considered just a means to an end.

It took another 30-50 years for graphical methods to be fully welcomed into the family of data-based scientific explanation, and seen as something more than mere illustrations. This change is best recorded in presentations at the Silver Jubilee of the Royal Statistical Society in 1885. Even at that time, most British statisticians still considered themselves “statists,” mere recorders of statistical facts in numerical tables; but “graphists” had finally been invited to the party.

On June 23, the influential British economist Alfred Marshall [1842-1924] addressed the attendees on the benefits of the graphic method, a radical departure for a statist. His French counterpart Émile Levasseur [1828-1911] presented a survey of the wide variety of graphs and statistical maps then in use. Yet even then, the scientific work of Lambert and Herschel, and the concept of the scatterplot as a new graphical form remained largely unknown. This would soon change with Francis Galton.

## The Broad Street Pump

Snow’s opportunity to test his theory came with the new eruption that began toward the end of August in 1854. His celebrated 1855 report, On the Mode of Communication of Cholera,$^{16}$ describes it dramatically:

The most terrible outbreak of cholera which ever occurred in this kingdom, is probably that which took place in Broad Street, Golden Square, and the adjoining streets, a few weeks ago. Within two hundred and fifty yards of the spot where Cambridge Street joins Broad Street, there were upwards of five hundred fatal attacks of cholera in ten days. The mortality in this limited area probably equals any that was ever caused in this country, even by the plague; and it was much more sudden, as the greater number of cases terminated in a few hours. (p. 38)

The full story of Snow’s discovery of the waterborne cause of cholera has been told in rich detail many times, by medical historians$^{17}$ and cartographers,$^{18}$ and it was brought to the attention of statisticians and those interested in the history of data visualization by Edward Tufte.$^{19}$

The short, if slightly apocryphal, version of this story is that, during the outbreak of cholera in Soho in 1854, Snow created a dot map of the locations of deaths and immediately noticed that they clustered on Broad Street, near the site of one of the public pumps from which residents drew their water. This narrative continues: Snow recognized that cases of death were strongly associated with drinking water from this pump. He petitioned the Board of Guardians of St. James Parish to remove the pump handle, whereupon the cholera epidemic subsided.

## The Neighborhoods Map

The version of Snow’s map shown in Figure 4.7 is the most famous, but a second version is more interesting graphically and scientifically. The Cholera Inquiry Committee, appointed by the Vestry of St. James, submitted its report on July 25, 1855.$^{22}$ The section titled “Dr. Snow’s Report” contained a new map that attempted a more detailed and direct visual analysis of the association of death with the Broad Street pump.

This new map, shown in Figure 4.8, states and tests a geospatial hypothesis: people are most likely to draw their water from the nearest pump (by walking distance). The outlined region in this map “shews the various points which have been found by careful measurement to be at an equal distance by the nearest road from the pump in Broad Street and the surrounding pumps.”$^{23}$

If the source of the outbreak was indeed the Broad Street pump, one should expect to find the highest concentration of deaths within this area, and also a low prevalence outside it. He stated his conclusion as “it will be observed that the deaths either very much diminish, or cease altogether, at every point where it becomes decidedly nearer to send to another pump than to the one in Broad street.”$^{24}$
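The nearest-pump hypothesis can be illustrated with a small sketch that assigns every street junction to the pump reachable by the shortest walk along the roads, then tallies deaths by assigned pump. This is not Snow's own procedure (he worked by careful measurement on the map); it is a modern stand-in using multi-source Dijkstra shortest paths. The road network, junction names, distances, and death counts below are all hypothetical.

```python
import heapq

def build_network(edges):
    """Turn (a, b, length) road segments into an undirected adjacency map."""
    roads = {}
    for a, b, length in edges:
        roads.setdefault(a, []).append((b, length))
        roads.setdefault(b, []).append((a, length))
    return roads

def nearest_pump(roads, pumps):
    """Multi-source Dijkstra: for every junction, return (distance, pump)
    for the pump reachable by the shortest walk along the road network."""
    best = {}
    queue = [(0.0, pump, pump) for pump in pumps]
    heapq.heapify(queue)
    while queue:
        dist, node, pump = heapq.heappop(queue)
        if node in best:
            continue  # already settled with a walk that is no longer
        best[node] = (dist, pump)
        for neighbour, length in roads.get(node, []):
            if neighbour not in best:
                heapq.heappush(queue, (dist + length, neighbour, pump))
    return best

# Hypothetical street segments with walking distances in yards.
roads = build_network([
    ("Broad Street", "A", 50),
    ("A", "B", 60),
    ("B", "C", 40),
    ("C", "Other pump", 30),
])
nearest = nearest_pump(roads, ["Broad Street", "Other pump"])

# Hypothetical death counts at each junction, tallied by nearest pump.
deaths = {"A": 12, "B": 3, "C": 1}
tally = {}
for junction, count in deaths.items():
    pump = nearest[junction][1]
    tally[pump] = tally.get(pump, 0) + count
print(tally)
```

In this toy network junction A is closer to the Broad Street pump by road, while B and C are closer to the other pump. If the hypothesis holds, the tally should be dominated by the pump suspected as the source.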

The final explanation for the source of the outbreak came slightly later, through the work of Reverend Henry Whitehead, the curate at a local church and a member of the Cholera Inquiry Committee. He identified the first (“index”) case as the death of a five-month-old infant, Frances Lewis, whose family lived at 40 Broad Street, immediately adjacent to the pump. When severe diarrhea struck the child, her mother, Sarah Lewis, soaked the diapers and emptied the pails into the cesspool at the front of their house, only three feet from the well. Unfortunately, the cesspool walls had decayed and the effluent flowed directly into the pump well. Thomas Lewis, the baby’s father and a local constable, suffered a fatal attack of the disease on September 8, the same day that the pump handle was removed.

## The Transcendent Effect of Water

Farr was certainly meticulous in evaluating the impact of potential causes on mortality from cholera. But he lacked an effective method for doing so, even for one potential cause, and the idea of accounting for the combination of several causes stretched him to the limit. His general method was to prepare tables of cholera mortality in the districts of London, broken down and averaged over classes of a possible explanatory variable.

For example, Farr divided the 38 districts into the 19 highest and 19 lowest values on other variables and calculated the ratio of cholera mortality for each; elevation had the largest ratio (3:1), while all other variables showed smaller ratios (e.g., 2.1:1 for house values). Having hit on elevation above the Thames as his principal cause, he prepared many other tables showing mortality by districts also in relation to density of the population, value of houses and shops, relief to the poor, and geographical features.
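Farr's split-and-compare calculation can be sketched as below: sort the districts on one explanatory variable, divide them into a lower and an upper half, and compare mean cholera mortality between the halves. The function name `split_ratio`, the field names, and the six districts are invented for illustration and are not Farr's figures; the ratio is reported with the larger mean first, which is an assumption about how such ratios were expressed.

```python
from statistics import mean

def split_ratio(districts, key):
    """Sort districts by `key`, split them into lower and upper halves, and
    return the ratio of mean mortality between halves (larger over smaller)."""
    ordered = sorted(districts, key=lambda d: d[key])
    half = len(ordered) // 2
    low = mean(d["mortality"] for d in ordered[:half])
    high = mean(d["mortality"] for d in ordered[-half:])
    return max(low, high) / min(low, high)

# Hypothetical districts: elevation in feet, mortality per 10,000,
# and an arbitrary house-value index. Invented for illustration only.
districts = [
    {"elevation": 2,   "mortality": 120, "house_value": 20},
    {"elevation": 5,   "mortality": 90,  "house_value": 50},
    {"elevation": 10,  "mortality": 90,  "house_value": 30},
    {"elevation": 40,  "mortality": 40,  "house_value": 60},
    {"elevation": 60,  "mortality": 30,  "house_value": 40},
    {"elevation": 100, "mortality": 30,  "house_value": 70},
]

elevation_ratio = split_ratio(districts, "elevation")
house_ratio = split_ratio(districts, "house_value")
print(f"elevation: {elevation_ratio:.1f}:1, house value: {house_ratio:.1f}:1")
```

A comparison like this ranks candidate variables by the size of their ratio, but it examines one variable at a time and so cannot separate causes that vary together across districts.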

Figure 4.5 illustrates the depth of this inquiry, in an ingenious semi-graphic combination of small tables for each district overlaid on a schematic map of their spatial arrangement along the Thames. The tables show the numbers for elevation, cholera deaths, deaths from all causes, and population density, and identify the water companies supplying each district. Unfortunately, this lovely diagram concealed more than it revealed: the signal was there, but the wealth of detail provided too much noise.

It would later turn out that the direct cause of cholera could be traced to contamination of the water supply from which people drew. It was probably confusing that water was provided by nine water companies, as Farr shows in Figure 4.5, so he divided the registration districts into three groups based on the source of their water supply: districts supplied from the Thames between Kew and Hammersmith bridges (western London), districts supplied from the Thames between Battersea and Waterloo bridges (central London), and districts that obtained their water from tributaries of the Thames (New River, Lea River, and Ravensbourne River).

## John Snow on Cholera

Another terrible wave of cholera struck London toward the end of summer 1854, concentrated in the parish of St. James, Westminster (the present-day district of Soho). This time, a correct explanation of the cause would eventually be found with the aid of meticulous data collection, a map of disease incidence, keen medical detective work, and logical reasoning to rule out alternative explanations. It is useful to understand why John Snow succeeded while William Farr did not.

The physician John Snow [1813-1858] lived in the Soho district at the time of this new outbreak. He had been an eighteen-year-old medical assistant in Newcastle upon Tyne in 1831 when cholera first struck there with great loss of life. At the time of the second great epidemic, in 1848-1849, Snow observed the severity of the disease in his district. In 1849, in a two-part paper in the Medical Gazette and Times$^{11}$ and a longer monograph,$^{12}$ he proposed that cholera was transmitted by water rather than through the air and passed from person to person through the intestinal discharges of the sick, either transmitted directly or entering the water supply.

Snow’s reasoning was entirely that of a clinician based on the form of pathology of the disease, rather than that of a statistician seeking associations with potential causal factors. Had cholera been an airborne disease, one would expect to see its effects in the lungs and then perhaps spread to others by respiratory discharge. But the disease clearly acted mainly in the gut, causing vomiting, intense diarrhea, and the massive dehydration that led to death. Whatever causal agent was responsible, it must have been something ingested rather than something inhaled.

William Farr was well aware of Snow’s theory when he wrote his 1852 report.$^{13}$ He described it quite politely but rejected Snow’s theory of the pathology of cholera. He could not understand any mechanism whereby something ingested by one individual could be passed to a larger community. To Farr, who was then considered the foremost authority on the outbreak and contagion of cholera, Snow’s contention of a single causal agent (some unknown poisonous matter, materies morbi) and a limited vector of transmission (water) was too circumscribed, too restrictive. Snow presented his argument and the evidence to support it as if ingestion and waterborne transmission could be the only causes; he also lacked the crucial data, either from a natural experiment or from direct knowledge of the water that cholera victims drank.

## Vital Statistics

In the previous chapter we explained how concerns in France about crime led to the systematic collection of social data. This combination of important social issues and available data led Guerry to new developments involving data display in graphs, maps, and tables.

A short time later, an analogous effort began in the United Kingdom, in the context of social welfare, poverty, public health, and sanitation. These efforts produced two new heroes of data visualization, William Farr and John Snow, who were influential in the attempt to understand the causes of several epidemics of cholera and how the disease could be mitigated.

In the United Kingdom the Age of Data can be said to have begun with the creation of the General Register Office (GRO) by an Act of Parliament in 1836.$^1$ The initial intent was simply to track births and deaths in England and Wales as the means of ensuring the lawful transfer of property rights between generations of the landed gentry.

But the 1836 act did much more. It required that every single child of an English parent, even those born at sea, have the particulars reported to a local registrar on standard forms within fifteen days. It also required that every marriage and death be reported and that no dead body could be buried without a certificate of registration, and it imposed substantial fines (£10-50) for failure in this reporting duty. The effect was to create a complete database of the entire population of England, which is still maintained by the GRO today.

The following year, William Farr [1807-1883], a 30-year-old physician, was hired, initially to handle the vital registration of live births, deaths, marriages, and divorces for the upcoming Census of 1841. After he wrote a chapter$^2$ on “Vital statistics; or, the statistics of health, sickness, diseases, and death,” he was given a new post as the “compiler of scientific abstracts,” becoming the first official statistician of the UK.

Like Guerry at the Ministry of Justice in France, Farr had access to, and had to make sense of, a huge mountain of data. Farr quickly realized that these data could serve a far greater purpose: saving lives. Life expectancy could be broken down and compared over geographic regions, down to the county level. Information about the occupations of deceased persons was also recorded, so Farr could also begin to tabulate life expectancy according to economic and social station. Information about the cause of death was lacking, and Farr probably exceeded his initial authority by adding instructions to list the cause(s) of death on the standard form. This simple addition opened a vast new world of medical statistics and public health that would eventually be called epidemiology, involving the study of patterns of incidence, causes, and control of disease conditions in a population.

## Farr’s Diagrams

Figure 4.1 is one of five lithographed plates (three in color) that appear in Farr’s report. Farr takes many liberties with the vertical scales (we would now call these graphical sins) to try to show any relation between the daily numbers of deaths from cholera and diarrhea and the meteorological data on those days. Most apparent are the spikes of cholera deaths in August and September. Temperature was also elevated, but perhaps no more than in the adjacent months. The weather didn’t seem to be a sufficient causal factor in 1849. Or was it?

Plate 2 takes a longer view, showing the possible relationship between temperature and mortality for every week over the eleven years from 1840 to 1850. This is a remarkable chart, a new invention in the language of statistical graphs. This graphical form, now called a radial diagram (or windrose), is ideally suited to showing and comparing several related series of events having a cyclical structure, such as weeks or months of the year or compass directions. The radial lines in Plate 2 serve as axes for the fifty-two weeks of each year. The outer circles show the average weekly number of deaths (corrected for increase in population) in relation to the mean number of deaths over all years. When these exceed the average, the area is shaded black (excess mortality); they are shaded yellow when they are below the average (salubrity).

Similarly, the inner circles show average weekly temperature against a baseline of the mean temperature ($48^{\circ}\mathrm{F}$) of the seventy-nine years from 1771 to 1849. Weeks exceeding this average are outside the baseline circle and shaded red, while those weeks that were colder than average are said to be shaded blue (but appear as gray).

In this graph we can immediately see that something very bad happened in London in summer 1849 (row 3, column 2), leading to a huge spike in deaths from July through September, and the winter months in 1847 (row 2, column 3) also stand out. This larger view, using the idea later called “small multiples” by Tufte, ${ }^6$ does something more, which might not be noticed in a series of separate charts: it shows a general pattern across years of fewer deaths on average in the warmer months of April (at 9:00) through September (at 3:00), but the dramatic spikes point to something huge that cannot be explained by temperature.

## Reconstruct Test Data

We try to reconstruct the test data point from its lower dimensional representation $y$ (of the point $x$) such that
$$x^{\prime}=U y$$
We know that $y=U^{T} x$. Substituting $U^{T} x$ for $y$ in (3.14), we get
$$x^{\prime}=U U^{T} x$$
We also know that $U=X V D^{-1}$, and therefore $U^{T}=D^{-1} V^{T} X^{T}$ (since $D$ is diagonal). Substituting both gives
$$\begin{gathered} x^{\prime}=X V D^{-1} D^{-1} V^{T} X^{T} x \\ x^{\prime}=X V D^{-2} V^{T} X^{T} x \end{gathered}$$
Hence, the test data point $x^{\prime}$ can be reconstructed from the lower dimensional representation $y$ using only $X$, $V$, and $D$.

Dual PCA is a variant of PCA used when the number of features is greater than the number of data points. Since it is just a variant of PCA, it shares all the advantages and limitations of PCA.
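
The substitution above can be checked numerically. The following is a minimal NumPy sketch, not code from the source: the random data, the sizes `d`, `n`, `p`, and all variable names are assumptions made for illustration. It keeps the top $p$ components of a thin SVD and compares the reconstruction $U U^{T} x$ with the dual form $X V D^{-2} V^{T} X^{T} x$.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, p = 50, 10, 3                      # hypothetical sizes with d > n
X = rng.normal(size=(d, n))              # columns are data points
X = X - X.mean(axis=1, keepdims=True)    # center each feature (row)

# Thin SVD X = U D V^T, truncated to the leading p components.
U, D, Vt = np.linalg.svd(X, full_matrices=False)
U_p, D_p, V_p = U[:, :p], D[:p], Vt[:p, :].T

x = rng.normal(size=d)                   # an out-of-sample test point

# x' = U U^T x
x_rec_primal = U_p @ (U_p.T @ x)

# x' = X V D^{-2} V^T X^T x  (uses only X, V, and D)
x_rec_dual = X @ V_p @ np.diag(D_p ** -2.0) @ V_p.T @ X.T @ x

print(np.allclose(x_rec_primal, x_rec_dual))
print(np.linalg.norm(x - x_rec_dual))    # reconstruction error of the p-dim approximation
```

The reconstruction is an approximation: with $p<d$, the component of $x$ outside the retained subspace is lost, which is why the second printed value is generally nonzero.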

## Explanation and Working

Having covered the concept of dimensionality reduction and a few algorithms for it, let us now examine the plots in Figure 4.1. What is the true dimensionality of these plots?

All dimensionality reduction techniques rest on the implicit assumption that the data lie along some low dimensional manifold. This is the case for the first three examples in Figure 4.1, which lie along a one-dimensional manifold even though they are plotted in a two-dimensional plane. In the fourth example in Figure 4.1, the data points are scattered randomly over a two-dimensional plane, so dimensionality reduction without losing information is not possible.

For the first two examples, we can use Principal Component Analysis (PCA) to find the approximate lower dimensional linear subspace. However, PCA does not help with the third and fourth examples: PCA only looks for a linear subspace, and the structure in the third example is nonlinear, while the fourth has no lower dimensional structure to find. There are, however, ways to find nonlinear lower dimensional manifolds.

Any linear projection of this nonlinear data onto one dimension yields linear principal components, and we might lose information about the original dataset. To obtain the manifold on which the data points lie, we need a nonlinear projection to one dimension. So, how do we modify the PCA algorithm to solve for the nonlinear subspace in which the data points lie? In short, how do we make PCA nonlinear?

This is done using an idea similar to the one behind Support Vector Machines. Instead of projecting the original two-dimensional data points onto one dimension with a linear projection, we first write the data points as points in a higher dimensional space. For example, say we map every two-dimensional point $x_{t}=\left(X_{t}, Y_{t}\right)$ to a three-dimensional point using the mapping $\Phi$ given by
$$\Phi\left(x_{t}\right)=\left(X_{t}, Y_{t}, X_{t}^{2}+Y_{t}^{2}\right)$$
After this, instead of doing PCA on the original dataset, we perform PCA on $\Phi\left(x_{1}\right), \Phi\left(x_{2}\right), \ldots, \Phi\left(x_{n}\right)$. This process is known as Kernel PCA. So, the basic idea of Kernel PCA is to take the original dataset and implicitly map it to a higher dimensional space using the mapping $\Phi$. We then perform PCA in this space; this is a linear projection in the higher dimensional space, which already captures the nonlinearities of the original dataset [1, 2, 3].
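
To make the mapping concrete, here is a small NumPy sketch under stated assumptions. It applies the map $\Phi$ from the text explicitly and then runs ordinary PCA in the three-dimensional space. Practical kernel PCA implementations avoid forming $\Phi$ and work with a kernel function instead, so this is an illustration of the idea rather than a full kernel method. The two-circle dataset and the helper names `feature_map` and `first_pc_scores` are hypothetical.

```python
import numpy as np

def feature_map(X):
    """Map each 2-D column (X_t, Y_t) to (X_t, Y_t, X_t^2 + Y_t^2)."""
    return np.vstack([X[0], X[1], X[0] ** 2 + X[1] ** 2])

def first_pc_scores(X):
    """Scores of the columns of X on the first principal component."""
    Xc = X - X.mean(axis=1, keepdims=True)          # center each feature
    U, D, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0] @ Xc

# Two concentric circles (radii 1 and 3): nonlinear structure in 2-D.
theta = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
circle = np.vstack([np.cos(theta), np.sin(theta)])
X = np.hstack([1.0 * circle, 3.0 * circle])         # shape (2, 100)

linear_scores = first_pc_scores(X)                  # PCA on the raw points
mapped_scores = first_pc_scores(feature_map(X))     # PCA after applying Phi

# Linear PCA: the inner circle's scores fall inside the outer circle's range.
print(linear_scores[:50].min(), linear_scores[:50].max())
print(linear_scores[50:].min(), linear_scores[50:].max())

# After the map: the two circles occupy disjoint ranges on the first component.
print(mapped_scores[:50].min(), mapped_scores[:50].max())
print(mapped_scores[50:].min(), mapped_scores[50:].max())
```

In the mapped space the third coordinate $X_t^2+Y_t^2$ is the squared radius, so a single linear component is enough to tell the two circles apart, which no one-dimensional linear projection of the original points can do.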

## Project Data in P-Dimensional Space

For any linear projection-based technique, given $X$, we obtain a low dimensional representation $Y$ such that
$$Y=U^{T} X$$
We know that $X=U D V^{T}$. Multiplying both sides on the left by $U^{T}$, we get
$$U^{T} X=U^{T} U D V^{T}$$
Since $U$ has orthonormal columns, $U^{T} U=I$, so (3.6) becomes
$$U^{T} X=D V^{T}$$
Since $Y=U^{T} X$, we get
$$Y=D V^{T}$$
Hence, the matrix $X$ can be mapped to a lower dimensional representation $Y$ using only $D$ and $V$.
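
As a quick numerical check of this identity, the sketch below (an illustration with assumed random data, sizes, and variable names, using NumPy's thin SVD) computes $Y$ both as $U^{T} X$ and as $D V^{T}$ after truncating to the leading $p$ components.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, p = 5, 40, 2                       # hypothetical sizes for illustration
X = rng.normal(size=(d, n))              # columns are data points
X = X - X.mean(axis=1, keepdims=True)    # center each feature (row)

# Thin SVD: X = U D V^T, then keep the leading p components.
U, D, Vt = np.linalg.svd(X, full_matrices=False)
U_p, D_p, Vt_p = U[:, :p], D[:p], Vt[:p, :]

Y_from_U = U_p.T @ X                     # Y = U^T X
Y_from_DV = np.diag(D_p) @ Vt_p          # Y = D V^T

print(Y_from_U.shape)
print(np.allclose(Y_from_U, Y_from_DV))
```

The second form never touches $U$, which is what makes it useful when $U$ is expensive to obtain directly.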

## Reconstruct Training Data

We try to reconstruct the input matrix from the lower dimensional representation $Y$ such that
$$X^{\prime}=U Y$$

We know that $U=X V D^{-1}$ and $Y=D V^{T}$. Substituting $X V D^{-1}$ for $U$ and $D V^{T}$ for $Y$ in (3.9):
$$\begin{aligned} &X^{\prime}=X V D^{-1} D V^{T} \\ &X^{\prime}=X V V^{T} \end{aligned}$$
Hence the matrix $X^{\prime}$ can be reconstructed from the lower dimensional data $Y$.

Let $x$ be a $d$-dimensional test data point (out of sample). We try to find its lower dimensional ($p$-dimensional) representation $y$:
$$y=U^{T} x$$
We know that $U=X V D^{-1}$, so $U^{T}=D^{-1} V^{T} X^{T}$. Substituting for $U^{T}$:
$$y=D^{-1} V^{T} X^{T} x$$
Hence it is possible to project a $d$-dimensional test data point (out of sample) into a $p$-dimensional (lower dimensional) space.
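
Both results in this section can be verified on synthetic data. The sketch below is an illustration only: the random matrix, the sizes, and the names `x_new`, `X_rec_UY`, and so on are assumptions. It checks that $U Y$ equals $X V V^{T}$ and that the out-of-sample projection $D^{-1} V^{T} X^{T} x$ agrees with $U^{T} x$.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, p = 6, 30, 3                       # hypothetical sizes for illustration
X = rng.normal(size=(d, n))              # columns are data points
X = X - X.mean(axis=1, keepdims=True)    # center each feature (row)

U, D, Vt = np.linalg.svd(X, full_matrices=False)
U_p, D_p, V_p = U[:, :p], D[:p], Vt[:p, :].T   # V_p has shape (n, p)

# Training reconstruction: X' = U Y = X V V^T
Y = np.diag(D_p) @ V_p.T
X_rec_UY = U_p @ Y
X_rec_XVV = X @ V_p @ V_p.T

# Out-of-sample projection: y = U^T x = D^{-1} V^T X^T x
x_new = rng.normal(size=d)
y_from_U = U_p.T @ x_new
y_from_XV = np.diag(1.0 / D_p) @ V_p.T @ X.T @ x_new

print(np.allclose(X_rec_UY, X_rec_XVV))
print(np.allclose(y_from_U, y_from_XV))
```

Note that the transpose on $X$ matters for the shapes: $X^{T} x$ is an $n$-vector, whereas $X x$ would not be a valid product for a $d \times n$ matrix and a $d$-vector.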

## Explanation and Working

Principal Component Analysis (PCA), a feature extraction method, is one of the most popular dimensionality reduction techniques. We want to reduce the number of features of the dataset (its dimensionality) while preserving as much information from the original dataset as possible. PCA solves this problem by combining the input variables into a smaller set of orthogonal (uncorrelated) variables that capture most of the dataset's variability [1].

Let the dataset contain a set of $n$ data points denoted by $x_{1}, x_{2}, \ldots, x_{n}$ where each $x_{i}$ is a $d$-dimensional vector. PCA finds a $p$-dimensional linear subspace (where $p<d$, and often $p \ll d$) such that the original data points lie mainly on this $p$-dimensional linear subspace. In practice, we do not usually find a reduced subspace in which all the points lie exactly. Instead, we try to find the approximate subspace that retains most of the variability of the data. Thus, PCA tries to find the linear subspace in which the data approximately lie.

The linear subspace can be defined by $p$ orthogonal vectors, say $U_{1}, U_{2}, \ldots, U_{p}$. This linear subspace forms a new coordinate system, and the orthogonal vectors that define it are called the “principal components” [2]. The principal components are a linear transformation of the original features, so there can be no more than $d$ of them. Also, the principal components are perpendicular to each other. However, the hope is that only $p$ ($p<d$) principal components are needed to approximate the $d$-dimensional original space. In the case where $p=d$, the number of dimensions remains the same and there is no reduction.

Let there be $n$ data points denoted by $x_{1}, x_{2}, \ldots, x_{n}$ where each $x_{i}$ is a $d$-dimensional vector. The goal is to reduce these points and find a mapping $y_{1}, y_{2}, \ldots, y_{n}$ where each $y_{i}$ is a $p$-dimensional vector (where $p<d$, and often $p \ll d$). That is, the data points $x_{1}, x_{2}, \ldots, x_{n}$ with $x_{i} \in \mathbb{R}^{d}$ are mapped to $y_{1}, y_{2}, \ldots, y_{n}$ with $y_{i} \in \mathbb{R}^{p}$.

Let $X$ be a $d \times n$ matrix that contains all the data points in the original space. It has to be mapped to a $p \times n$ matrix $Y$ that retains the maximum variability of the data points while reducing the number of features used to represent each data point.

Note: In PCA or any variant of PCA, a standardized input matrix is used. So, $X$ represents the standardized input data matrix, unless otherwise specified.
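
The setup above can be sketched in a few lines of NumPy. This is one possible illustration, assuming the $d \times n$ column convention from the text and computing the principal components through a thin SVD of the standardized matrix; the function name `pca`, the synthetic data, and the "explained variance" summary are assumptions added for the example.

```python
import numpy as np

def pca(X_raw, p):
    """PCA on a d x n matrix whose columns are data points.

    Returns (U_p, Y, explained): U_p is d x p (principal components),
    Y is p x n (reduced data), and explained is the fraction of total
    variance retained by the p components.
    """
    # Standardize each feature (row): zero mean, unit variance.
    mean = X_raw.mean(axis=1, keepdims=True)
    std = X_raw.std(axis=1, keepdims=True)
    X = (X_raw - mean) / std

    U, D, Vt = np.linalg.svd(X, full_matrices=False)
    U_p = U[:, :p]
    Y = U_p.T @ X
    explained = np.sum(D[:p] ** 2) / np.sum(D ** 2)
    return U_p, Y, explained

# Synthetic example: three features that are noisy multiples of one latent variable.
rng = np.random.default_rng(4)
n = 200
t = rng.normal(size=n)
X_raw = np.vstack([t, 2.0 * t, -t]) + 0.05 * rng.normal(size=(3, n))

U_p, Y, explained = pca(X_raw, p=1)
print(Y.shape, float(explained))
```

Because the three features are almost perfectly correlated, a single component ($p=1$, $d=3$) retains nearly all of the variability, which is the situation PCA is designed to exploit.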

## Explanation and Working

Dual Principal Component Analysis (PCA) is a variant of classical PCA. The goal is to reduce the dimensionality of the dataset but at the same time preserve maximum possible information from the original dataset.

• Problem Statement: Let the dataset contain a set of $n$ data points denoted by $x_{1}, x_{2}, \ldots, x_{n}$ where each $x_{i}$ is a $d$-dimensional vector. We try to find a $p$-dimensional linear subspace (where $p<d$, and often $p \ll d$) such that the data points lie mainly on this linear subspace. The linear subspace can be defined by $p$ orthogonal vectors, say $U_{1}, U_{2}, \ldots, U_{p}$. This linear subspace forms a new coordinate system.
• Let $X$ be a $d \times n$ matrix that contains all the data points in the original space. It has to be mapped to a $p \times n$ matrix $Y$ that retains the maximum variability of the data points while reducing the number of features used to represent each data point.

We saw that applying Singular Value Decomposition (SVD) to the matrix $X$ ($X=U D V^{T}$) solves the eigendecomposition problem for PCA. This method works well when there are more data points than dimensions (i.e., $d<n$). However, when $d>n$, the eigendecomposition of the $d \times d$ matrix would be computationally expensive compared to the eigendecomposition of the $n \times n$ matrix. Figure 3.1 represents the case when $d>n$ [1, 2].

Hence, when $d>n$, it is beneficial to decompose $X^{T} X$, which is an $n \times n$ matrix, instead of $X X^{T}$, which is a $d \times d$ matrix. This is where dual PCA comes into the picture.
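
The following sketch illustrates this idea with NumPy, assuming random data with $d \gg n$; the sizes and variable names are hypothetical. It eigendecomposes the small $n \times n$ matrix $X^{T} X = V D^{2} V^{T}$, recovers $U = X V D^{-1}$ without ever forming the $d \times d$ matrix $X X^{T}$, and compares the result with a direct SVD.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n, p = 200, 12, 3                     # hypothetical sizes with d > n
X = rng.normal(size=(d, n))              # columns are data points
X = X - X.mean(axis=1, keepdims=True)    # center each feature (row)

# Eigendecomposition of the small n x n matrix X^T X = V D^2 V^T.
gram = X.T @ X
eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:p]    # indices of the p largest
V_p = eigvecs[:, order]                  # shape (n, p)
D_p = np.sqrt(eigvals[order])            # singular values of X

# Principal directions and reduced data, using only X, V, and D.
U_p = X @ V_p / D_p                      # U = X V D^{-1} (column j divided by D_p[j])
Y = D_p[:, None] * V_p.T                 # Y = D V^T

# Cross-check against the singular values from a direct SVD of X.
S = np.linalg.svd(X, compute_uv=False)
print(np.allclose(D_p, S[:p]))
print(np.allclose(U_p.T @ U_p, np.eye(p)))
print(np.allclose(Y, U_p.T @ X))
```

Only a $12 \times 12$ matrix is decomposed here, compared with a $200 \times 200$ matrix in the primal formulation, which is where the saving comes from when $d > n$.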

## The Rise of the Graphic Method and Visual Thinking

Playfair’s graphical inventions (the line chart, bar chart, and pie chart) are the most commonly used graphical forms today. The bar chart was something of an anomaly: lacking the time series data required to draw a timeline showing the trade with Scotland, he used bars to symbolize the cross-sectional character of the data that he did have. Playfair acknowledged the priority of Priestley (1765, 1769) in this form; Priestley had used thin horizontal bars to symbolize the life spans of historical figures in a timeline (Figure 1.6). What attracted Playfair’s interest was the possibility of visualizing a history over a long period and showing a classification (statesmen versus men of learning), all in a single view.

Playfair’s role was crucial for several reasons. It was not for his development of the graphic recording of data; others preceded him in that. Indeed, in 1805 he pointed out that as a child his brother John had him keep a graphic record of temperature readings. But Playfair was in a remarkable position. Because of his close relationship with his brother and his connections with Watt, he was on the periphery of science. He was close enough to know the value of the graphical method but sufficiently detached in his own interests to apply it in a very different arena, that of economics and finance. These areas, then as now, tend to attract a larger audience than matters of science, and Playfair was adept at self-promotion. ${ }^{15}$

In a review of his 1786 Atlas, which appeared in The Political Herald, the Scottish historian Dr. Gilbert Stuart wrote,
The new method, in which accounts are stated in this work, has attracted very general notice. The propriety and expediency of all men, who have any interest in the nation, being acquainted with the general outlines, and the great facts relating to our commerce are unquestionable; and this is the most commodious, as well as accurate mode of effecting this object, that has hitherto been thought of…. To each of his charts the author has added observations (which) … in general are just and shrewd; and sometimes profound…. Very considerable applause is certainly due to his invention; as a new, distinct, and easy mode of conveying information to statesmen and merchants.

## A Golden Age

Something even more remarkable occurred in the latter part of the nineteenth century, as many forces combined to produce the perfect storm for data graphics, in what we call the Golden Age of Graphics (Chapter 7). By the mid-1800s, vast quantities of data on important social issues (commerce, disease, literacy, crime) had become available in Europe and the United States, so much that one historian called this an “avalanche of numbers.” ${ }^{16}$ In the second half of this century some statistical theory was developed to allow their essence to be summarized and sensible comparisons to be made. Technological advances in printing and reproduction now allowed for the broad dissemination of graphic works in color and with a graphic style that was unavailable previously. Excitement and enthusiasm for graphics were in the air. ${ }^{17}$ The audience was international, but they shared a common visual language and visual thinking.

Another of the key developers of graphic vision was Charles Joseph Minard [1781-1870], a civil engineer in France, who produced what is now applauded as the greatest data-based graphic of all time: a flow map depicting Napoleon’s disastrous Russian campaign of 1812 (see Figure 10.3). Minard used the graphic method to design beautiful thematic maps and diagrams showing all manner of topics of interest to the modern French state during the dawn of national concern for trade, commerce, and transportation: Where to build railroads? How did the US Civil War affect the British mills’ importation of cotton? In these and other graphs, he told graphic stories of immediate visual impact; the message hit the viewer between the eyes. Minard too was driven by an inner vision.

By the end of the nineteenth century, scientists from the United States (Francis Walker in the Census Bureau), France (Émile Cheysson in the Ministry of Public Works), and others in Germany (Herman Schwabe, August F. W. Crome), Sweden, and elsewhere, began to produce and widely disseminate elaborate and detailed statistical albums tracing and celebrating their nations’ achievements and aspirations. These contain some of the most exquisite graphs ever produced, even to this day. They were resplendent in color and style and revealed a vision of inventive graphic design that serves as a model to emulate and has become part of the language of graphics today. They inspire awe, just as do the cave paintings in Lascaux.

In this chapter we have taken a long-range view of visualizations spanning more than 17,000 years, from the Lascaux cave paintings of long-extinct auroch bulls to Minard’s equally exquisite depictions of the horrors of the Napoleonic Wars. In both these cases, and in many in between, visualizations have painlessly provided memorable understanding for those who look at them. Over the course of centuries, rising visual thinking was expressed in diagrams, maps, and graphs. A universal language of visualizations was used to communicate both quantitative and qualitative information, to uncover complex phenomena, and to support or refute scientific claims. In the rest of this book we will elaborate and illustrate the wonders of visual communication and welcome you along on the journey.
