## Vital Statistics

In the previous chapter we explained how concerns in France about crime led to the systematic collection of social data. This combination of important social issues and available data led Guerry to new developments involving data display in graphs, maps, and tables.

A short time later, an analogous effort began in the United Kingdom, in the context of social welfare, poverty, public health, and sanitation. These efforts produced two new heroes of data visualization, William Farr and John Snow, who were influential in the attempt to understand the causes of several epidemics of cholera and how the disease could be mitigated.

In the United Kingdom the Age of Data can be said to have begun with the creation of the General Register Office (GRO) by an Act of Parliament in 1836.$^1$ The initial intent was simply to track births and deaths in England and Wales as the means of ensuring the lawful transfer of property rights between generations of the landed gentry.

But the 1836 act did much more. It required that the particulars of every single child of an English parent, even those born at sea, be reported to a local registrar on standard forms within fifteen days. It also required that every marriage and death be reported and that no dead body be buried without a certificate of registration, and it imposed substantial fines (£10 to £50) for failure in this reporting duty. The effect was to create a complete database of the entire population of England, which is still maintained by the GRO today.

The following year, William Farr [1807-1883], a 30-year-old physician, was hired, initially to handle the vital registration of live births, deaths, marriages, and divorces for the upcoming Census of 1841. After he wrote a chapter$^2$ on “Vital statistics; or, the statistics of health, sickness, diseases, and death,” he was given a new post as the “compiler of scientific abstracts,” becoming the first official statistician of the UK.

Like Guerry at the Ministry of Justice in France, Farr had access to, and had to make sense of, a huge mountain of data. Farr quickly realized that these data could serve a far greater purpose: saving lives. Life expectancy could be broken down and compared over geographic regions, down to the county level. The occupations of deceased persons were also recorded, so Farr could begin to tabulate life expectancy according to economic and social station. Information about the cause of death was lacking, and Farr probably exceeded his initial authority by adding instructions to list the cause(s) of death on the standard form. This simple addition opened a vast new world of medical statistics and public health that would eventually be called epidemiology: the study of patterns of incidence, causes, and control of disease conditions in a population.

## Farr’s Diagrams

Figure 4.1 is one of five lithographed plates (three in color) that appear in Farr’s report. Farr takes many liberties with the vertical scales (we would now call these graphical sins) to try to show any relation between the daily numbers of deaths from cholera and diarrhea and the meteorological data on those days. Most apparent are the spikes of cholera deaths in August and September. Temperature was also elevated, but perhaps no more than in the adjacent months. The weather didn’t seem to be a sufficient causal factor in 1849. Or was it?

Plate 2 takes a longer view, showing the possible relationship between temperature and mortality for every week over the eleven years from 1840 to 1850. This is a remarkable chart, a new invention in the language of statistical graphs. This graphical form, now called a radial diagram (or windrose), is ideally suited to showing and comparing several related series of events having a cyclical structure, such as weeks or months of the year or compass directions. The radial lines in Plate 2 serve as axes for the fifty-two weeks of each year. The outer circles show the average weekly number of deaths (corrected for increase in population) in relation to the mean number of deaths over all years. When the weekly deaths exceed this average, the area is shaded black (excess mortality); when they fall below it, the area is shaded yellow (salubrity). Similarly, the inner circles show average weekly temperature against a baseline of the mean temperature ($48^{\circ}\mathrm{F}$) of the seventy-nine years from 1771 to 1849. Weeks exceeding this average are outside the baseline circle and shaded red, while those weeks that were colder than average are said to be shaded blue (but appear as gray).
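
The geometry of such a radial diagram can be sketched in a few lines of Python. This is a minimal illustration, not Farr’s own procedure. The function name `radial_series` is hypothetical, and the 6 o’clock starting position with clockwise ordering is an assumption, chosen so that early April lands near 9:00 as described for Plate 2. The weekly counts are invented for illustration and are not taken from Farr’s tables. The baseline here is simply the mean of the one synthetic year, whereas Farr used the mean over all years.

```python
import math

def radial_series(weekly_values, baseline, start_angle=-math.pi / 2):
    """Place one year of weekly values around a circle.

    Each week gets its own radial axis, and the radius is the value itself,
    so points outside the baseline circle exceed the reference mean.
    start_angle=-pi/2 puts week 0 at 6 o'clock (an assumption).
    """
    n = len(weekly_values)
    rows = []
    for week, value in enumerate(weekly_values):
        # Subtracting the step moves clockwise in the usual math convention.
        angle = start_angle - 2 * math.pi * week / n
        x = value * math.cos(angle)
        y = value * math.sin(angle)
        # "excess" would be shaded black; "below" means at or below baseline.
        status = "excess" if value > baseline else "below"
        rows.append({"week": week, "x": x, "y": y, "status": status})
    return rows

# Hypothetical weekly death counts (NOT Farr's data):
# a flat year with an eight-week late-summer spike.
deaths = [1000] * 52
for week in range(30, 38):
    deaths[week] = 2500
baseline = sum(deaths) / len(deaths)

rows = radial_series(deaths, baseline)
excess_weeks = [r["week"] for r in rows if r["status"] == "excess"]
print(excess_weeks)  # [30, 31, 32, 33, 34, 35, 36, 37]
```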

In this graph we can immediately see that something very bad happened in London in summer 1849 (row 3, column 2), leading to a huge spike in deaths from July through September, and the winter months in 1847 (row 2, column 3) also stand out. This larger view, using the idea later called “small multiples” by Tufte,$^6$ does something more, which might not be noticed in a series of separate charts: it shows a general pattern across years of fewer deaths on average in the warmer months of April (at 9:00) through September (at 3:00), but the dramatic spikes point to something huge that cannot be explained by temperature.

