
More Damned Lies and Statistics: How Numbers Confuse Public Issues

Chapter 1

Missing Numbers

CBS News anchor Dan Rather began his evening newscast on March 5, 2001, by declaring: "School shootings in this country have become an epidemic." That day, a student in Santee, California, had killed two other students and wounded thirteen more, and media coverage linked this episode to a disturbing trend. Between December 1997 and May 1998, there had been three heavily publicized school shooting incidents: in West Paducah, Kentucky (three dead, five wounded); Jonesboro, Arkansas (five dead, ten wounded); and Springfield, Oregon (two dead and twenty-one wounded at the school, after the shooter had killed his parents at home). The following spring brought the rampage at Columbine High School in Littleton, Colorado, in which two students killed twelve fellow students and a teacher, before shooting themselves.1 Who could doubt Rather's claim about an epidemic?

And yet the word epidemic suggests a widespread, growing phenomenon. Were school shootings indeed on the rise? Surprisingly, a great deal of evidence indicated that they were not:

• Since school shootings are violent crimes, we might begin by examining trends in criminality documented by the Federal Bureau of Investigation. The Uniform Crime Reports, the FBI's tally of crimes reported to the police, showed that the overall crime rate, as well as the rates for such major violent crimes as homicide, robbery, and aggravated assault, fell during the 1990s.

• Similarly, the National Crime Victimization Survey (which asks respondents whether anyone in their household has been a crime victim) revealed that victimization rates fell during the 1990s; in particular, reports of teenagers being victimized by violent crimes at school dropped.

• Other indicators of school violence also showed decreases. The Youth Risk Behavior Survey conducted by the U.S. Centers for Disease Control and Prevention found steadily declining percentages of high school students who reported fighting or carrying weapons on school property during the 1990s.

• Finally, when researchers at the National School Safety Center combed media reports from the school years 1992-1993 through 2000-2001, they identified 321 violent deaths that had occurred at schools. Not all of these incidents involved student-on-student violence; they included, for example, 16 accidental deaths and 56 suicides, as well as incidents involving nonstudents, such as a teacher killed by her estranged husband (who then shot himself) and a nonstudent killed on a school playground during a weekend. Even if we include all 321 of these deaths, however, the average fell from 48 violent deaths per year during the school years 1992-1993 through 1996-1997 to 32 per year from 1997-1998 through 2000-2001. If we eliminate accidental deaths and suicides, the decline remains, with the average falling from 31 deaths per year in the earlier period to 24 per year in the later period (which included all of the heavily publicized incidents mentioned earlier). While violent deaths are tragedies, they are also rare. Tens of millions of children attend school; for every million students, fewer than one violent death per year occurs in school.
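The per-student rarity claim in the final bullet can be checked with simple arithmetic. The sketch below is illustrative only: the enrollment figure of roughly 53 million K-12 students is an assumption, not a number from the text.

```python
# Figures from the text: 321 violent deaths over the nine school years
# 1992-1993 through 2000-2001, averaging 32 per year in the later period.
total_deaths, school_years = 321, 9
later_period_average = 32

# Assumed (not from the text): roughly 53 million U.S. K-12 students.
students_millions = 53

overall_average = total_deaths / school_years          # about 35.7 per year
rate_per_million = later_period_average / students_millions

print(round(overall_average, 1), round(rate_per_million, 2))  # 35.7 0.6
```

Even using the higher, all-causes count, the rate stays well under one death per million students per year, consistent with the text's characterization.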

In other words, a great deal of statistical evidence was available to challenge claims that the country was experiencing a sudden epidemic of school shootings. The FBI's Uniform Crime Reports and the National Crime Victimization Survey in particular are standard sources for reporters who examine crime trends; the media's failure to incorporate findings from these sources in their coverage of school shootings is striking.2

Although it might seem that statistics appear in every discussion of every social issue, in some cases—such as the media's coverage of school shootings—relevant, readily available statistics are ignored. We might think of these as missing numbers. This chapter examines several reasons for missing numbers, including overwhelming examples, incalculable concepts, uncounted phenomena, forgotten figures, and legendary numbers. It asks why potentially relevant statistics don't figure in certain public debates and tries to assess the consequences of their absence.

The Power of Examples

Why are numbers missing from some debates over social problems and social policies? One answer is that a powerful example can overwhelm discussion of an issue. The 1999 shootings at Columbine High School are a case in point. The high death toll ensured that Columbine would be a major news story. Moreover, the school's location in a suburb of a major city made it easy for reporters to reach the scene. As it took some hours to evacuate the students and secure the building, the press had time to arrive and capture dramatic video footage that could be replayed to illustrate related stories in the weeks that followed. The juxtaposition of a terrible crime in a prosperous suburban community made the story especially frightening—if this school shooting could happen at Columbine, surely such crimes could happen anywhere. In addition, the Columbine tragedy occurred in the era of competing twenty-four-hour cable news channels; their decisions to run live coverage of several funeral and memorial services and to devote broadcast time to extended discussions of the event and its implications helped to keep the story alive for weeks.

For today's media, a dramatic event can become more than simply a news story in its own right; reporters have become attuned to searching for the larger significance of an event so that they can portray newsworthy incidents as instances of a widespread pattern or problem. Thus, Columbine, when coupled with the earlier, heavily publicized school shooting stories of 1997-1998, came to exemplify the problem of school violence. And, commentators reasoned, if a larger problem existed, it must reflect underlying societal conditions; that is, school shootings needed to be understood as a trend, wave, or epidemic with identifiable causes. Journalists have been identifying such crime waves since at least the nineteenth century—and, for nearly as long, criminologists have understood that crime waves are not so much patterns in criminal behavior as they are patterns in media coverage. All of the available statistical evidence suggested that school violence had declined from the early 1990s to the late 1990s; there was no actual wave of school shootings. But the powerful images from Columbine made that evidence irrelevant. One terrible example was "proof" that school shootings were epidemic.

Compelling examples need not even be true. The stories that folklorists call contemporary legends (or the more familiar term urban legends) also shape our thinking about social problems. Contemporary legends usually spread through informal channels, which once meant word of mouth but now also includes the more modern means of faxes and e-mail messages. A legend's key quality remains unchanged, however: it must be a good story, good enough for people to remember it and want to pass it along. Legends thrive because they arouse fear, disgust, or other powerful emotions that make the tales memorable and repeatable.3 Very often, contemporary legends are topical: when child abductions are in the news, we tell stories about kidnappings in shopping malls; when gangs are receiving attention, we warn each other about lethal gang initiation rites. Such stories shape our thinking about social problems in much the same way dramatic news stories do.

The power of examples is widely recognized. A reporter preparing a story about any broad social condition—say, homelessness—is likely to begin by illustrating the problem with an example, perhaps a particular homeless person. Journalists (and their editors) prefer interesting, compelling examples that will intrigue their audience. And advocates who are trying to promote particular social policies learn to help journalists by guiding them to examples that can be used to make specific points. Thus, activists calling for increased services for the homeless might showcase a homeless family, perhaps a mother of young children whose husband has been laid off by a factory closing and who cannot find affordable housing. In contrast, politicians seeking new powers to institutionalize the homeless mentally ill might point to a deranged, violent individual who seems to endanger passersby.4 The choice of examples conveys a sense of a social problem's nature.

The problem with examples—whether they derive from dramatic events, contemporary legends, or the strategic choices of journalists or advocates—is that they probably aren't especially typical. Examples compel when they have emotional power, when they frighten or disturb us. But atypical examples usually distort our understanding of a social problem; when we concentrate on the dramatic exception, we tend to overlook the more common, more typical—but more mundane—cases. Thus, Democrats used to complain about Republican President Ronald Reagan's fondness for repeating the story of a "welfare queen" who had supposedly collected dozens of welfare checks using false identities.5 Using such colorful examples to typify welfare fraud implies that welfare recipients are undeserving or don't really need public assistance. Defenders of welfare often countered Reagan's anecdotes with statistics showing that recipients were deserving (as evidenced by the small number of able-bodied adults without dependent children who received benefits) or that criminal convictions for fraud were relatively few.6 The danger is that the powerful but atypical example—the homeless intact family, the welfare queen—will warp our vision of a social problem, thereby reducing a complicated social condition to a simple, melodramatic fable.

Statistics, then, offer a way of checking our examples. If studies of the homeless find few intact families (or individuals who pose threats of violence), or if studies of welfare recipients find that fraud involving multiple false identities is rare, then we should recognize the distorting effects of atypical examples and realize that the absence of numbers can damage our ability to grasp the actual dimensions of our problems.

The Incalculable

Sometimes numbers are missing because phenomena are very hard to count. Consider another crime wave. During the summer of 2002, public concern turned to kidnapped children. Attention first focused on the case of an adolescent girl abducted from her bedroom one night—a classic melodramatic example of a terrible crime that seemingly could happen to anyone. As weeks passed without a sign of the girl, both the search and the accompanying news coverage continued. Reports of other cases of kidnapped or murdered children began linking these presumably unrelated crimes to the earlier kidnapping, leading the media to begin talking about an epidemic of abductions.

This issue had a history, however. Twenty years earlier, activists had aroused national concern about the problem of missing children by coupling frightening examples to large statistical estimates. One widespread claim alleged that nearly two million children went missing each year, including fifty thousand kidnapped by strangers. Later, journalists and social scientists exposed these early estimates as being unreasonably high. As a result, in 2002, some reporters questioned the claims of a new abduction epidemic; in fact, they argued, the FBI had investigated fewer kidnappings that year than in the year before, which suggested that these crimes were actually becoming less common.7

Both sets of claims—that kidnappings were epidemic and that they were declining—were based on weak evidence. Missing-children statistics can never be precise because missing children are so difficult to count. We encounter problems of definition:

• What is a child—that is, what is the upper age limit for being counted?

• What do we mean by missing? How long must a child be missing to be counted—a few minutes, one day, seventy-two hours?

• What sorts of absences should be counted? Wandering off and getting lost? Running away? Being taken by a relative during a family dispute? Is a child who is with a noncustodial parent at a known location considered missing?

People need to agree about what to count before they can start counting, but not everyone agrees about the answers to these questions. Obviously, the answers chosen will affect the numbers counted; using a broad definition means that more missing children will be counted.

A second set of problems concerns reporting. Parents of missing children presumably call their local law enforcement agency—usually a police or sheriff's department. But those authorities may respond in different ways. Some states require them to forward all missing-children reports to a statewide clearinghouse, which is supposed to contact all law enforcement agencies in the state in order to facilitate the search. The clearinghouses—and some departments—may notify the National Crime Information Center, a branch of the FBI that compiles missing-persons reports. Some reports also reach the National Center for Missing and Exploited Children (the federally funded group best known for circulating pictures of missing children) or FBI investigators (who claim jurisdiction over a few, but by no means most, kidnappings). Authorities in the same jurisdiction do not necessarily handle all missing-children reports the same way; the case of a six-year-old seen being dragged into a strange car is likely to be treated differently than a report of a sixteen-year-old who has run away. We can suspect that the policies of different agencies will vary significantly. The point is that the jurisdiction from which a child disappears and the particulars of the case probably affect whether a particular missing-child report finds its way into various agencies' records.

It is thus very difficult to make convincing comparisons of the numbers of missing children from either time to time or place to place. Reporters who noted that fewer child-kidnapping reports were filed with the FBI in 2002 than in 2001, and who therefore concluded that the problem was declining, mistakenly assumed that the FBI's records were more complete and authoritative than they actually were. Some things—like missing children—are very difficult to count, which should make us skeptical about the accuracy of statistics that claim to describe the situation.

Such difficulties can create special problems when people try to weigh things that are relatively easy to measure against things that are less calculable. Consider the method of cost-benefit analysis as a basis for decision-making.8 In principle, it seems straightforward: calculate the expected costs and the value of the expected benefits for different courses of action, and choose the option that promises the best outcome. One problem, however, is that some costs and benefits are easier to compute than others. A teenager trying to decide whether to go to a movie or spend an evening babysitting can probably assign reasonably accurate dollar values to these options—the cost of the movie ticket and refreshments versus the expected earnings from babysitting—but even then the decision will probably hinge on additional assumptions about happiness: would I be happier spending the evening with my friends at a movie, or would I prefer to earn money that can be spent for some greater benefit down the line?

When applied to questions of social policy, such calculations only become more complex. Should we build more highways or support mass transit? Mass transit is rarely self-supporting: if the cost per trip seems too high, riders abandon mass transit; in order to keep them riding, ticket prices usually must be kept low by subsidizing the system. Critics of mass transit sometimes argue that such subsidies are wrong, that mass transit is inefficient, expensive, and therefore not competitive. Advocates respond that this critique ignores many of the relevant costs and benefits. Whereas riders directly bear the costs of using mass transit each time they buy a ticket, the ways we pay for the costs of highway travel are less obvious (for example, through gasoline taxes). Moreover, highways carry hidden quality-of-life costs, such as greater air pollution, more traffic fatalities, and cities that discourage foot traffic by devoting huge areas to roads and parking lots. But such costs are hard to calculate. Even if we can agree on the likely health costs from air pollution and traffic accidents, how can we hope to assign a dollar value to being able to comfortably walk from one destination to another? And, of course, the critics have a rebuttal: costs are also incurred in building and maintaining mass transit systems. And what about the freedom cars offer—the ability to choose your own route and schedule? Shouldn't these considerations be incorporated in any calculations?

There are basically two solutions to the problems that intangible factors pose to cost-benefit analyses, but neither solution is completely satisfactory. The first is to leave these factors out of the equation, to simply ignore what seems impossible to quantify. But should factors such as quality of life be treated as irrelevant simply because they are hard to measure? The second solution is to estimate the values of costs and benefits, to assign dollar values to them. This approach keeps these factors in view, but the process is obviously arbitrary—what dollar value should be assigned to comfort or freedom? It is easy to skew the results of any cost-benefit analysis by pegging values as either very high or very low.
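The arbitrariness of the second solution is easy to demonstrate. In this hypothetical sketch, every dollar figure is invented for illustration; the point is only that pricing a single intangible low or high flips the verdict of the very same analysis.

```python
def net_benefit(benefits, costs):
    """Simple cost-benefit total: sum of benefits minus sum of costs."""
    return sum(benefits.values()) - sum(costs.values())

# Hypothetical annual figures (millions of dollars) for a transit proposal.
costs = {"construction": 40, "operating subsidy": 15}
benefits = {"fares": 10, "reduced pollution": 20}

# Pricing the intangible "walkability" benefit low vs. high changes the verdict.
low = net_benefit({**benefits, "walkability": 5}, costs)    # -> -20
high = net_benefit({**benefits, "walkability": 40}, costs)  # -> 15

print(low, high)  # same project, opposite conclusions
```

Nothing about the project changed between the two runs; only the analyst's choice of how to price an unmeasurable benefit did.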

Our culture has a particularly difficult time assigning values to certain types of factors. Periodically, for example, the press expresses shock that a cost-benefit analysis has assigned some specific value to individual lives.9 Such revelations produce predictably outraged challenges: how can anyone place a dollar value on a human life—aren't people's lives priceless? The answer to that question depends on when and where it is asked. Americans' notion that human life is priceless has a surprisingly short history. Only a century ago, the parents of a child killed by a streetcar could sue the streetcar company for damages equal to the child's economic value to the family (basically, the child's expected earnings until adulthood); today, of course, the parents would sue for the (vastly greater) value of their pain and suffering. Even the dollar value of a child's life varies across time and space.10

But the larger point is that trade-offs are inevitable. Building a bridge or implementing a childhood vaccination program has both risks and costs—as do the alternatives of not building the bridge or not vaccinating children. Our culture seems to have a lot of difficulty debating whether, say, vaccinations should proceed if they will cause some number of children to sicken and die. Advocates on both sides try to circumvent this debate by creating melodramatically simple alternatives: vaccine proponents can be counted on to declare that harm from vaccines is virtually nonexistent but that failure to vaccinate will have terrible, widespread consequences; whereas opponents predictably insist that vaccines harm many and that they don't do all that much good. Obviously, such debates could use some good data. But, beyond that, we need to recognize that every choice carries costs and that we can weigh and choose only among imperfect options. Even if we can agree that a vaccine will kill a small number of children but will save a great many, how are we to incorporate into our decision-making the notion that every human life is beyond price? How should we weigh the value of a few priceless lives that might be lost if vaccinations proceed against the value of many priceless lives that might be lost if vaccinations are curtailed? (Chapter 3 extends this discussion of trade-offs.)

In short, some numbers are missing from discussions of social issues because certain phenomena are hard to quantify, and any effort to assign numeric values to them is subject to debate. But refusing to somehow incorporate these factors into our calculations creates its own hazards. The best solution is to acknowledge the difficulties we encounter in measuring these phenomena, debate openly, and weigh the options as best we can.

The Uncounted

A third category of missing numbers involves what is deliberately uncounted, records that go unkept. Consider the U.S. Bureau of the Census's tabulations of religious affiliation: there are none. In fact, the census asks no questions about religion. Arguments about the constitutionally mandated separation of church and state, as well as a general sense that religion is a touchy subject, have led the Census Bureau to omit any questions about religion when it surveys the citizenry (in contrast to most European countries, where such questions are asked).11

Thus, anyone trying to estimate the level of religious activity in the United States must rely on less accurate numbers, such as church membership rolls or individuals' reports of their attendance at worship services. The membership rolls of different denominations vary in what they count: Are infants counted once baptized, or does one become an enrolled member only in childhood or even adulthood? Are individuals culled from the rolls if they stop attending or actively participating in religious activities? Such variation makes it difficult to compare the sizes of different faiths (as discussed further in chapter 6). Surveys other than the census sometimes ask people how often they attend religious services, but we have good reason to suspect that respondents overreport attendance (possibly to make a good impression on the interviewers).12 The result is that, for the United States, at least, it is difficult to accurately measure the population's religious preferences or level of involvement. The policy of not asking questions about religion through the census means that such information simply does not exist.

The way choices are phrased also creates uncounted categories. Since 1790, each census has asked about race or ethnicity, but the wording of the questions—and the array of possible answers—has changed. The 2000 census, for example, was the first to offer respondents the chance to identify themselves as multiracial. Proponents of this change had argued that many Americans have family trees that include ancestors of different races and that it was unreasonable to force people to place themselves within a single racial category.

But some advocates had another reason for promoting this change. When forced to choose only one category, people who knew that their family backgrounds included people of different ethnicities had to oversimplify; most probably picked the option that fit the largest share of their ancestors. For example, an individual whose grandparents included three whites and one Native American was likely to choose "white." In a society in which a group's political influence depends partly on its size, such choices could depress the numbers of people of American Indian ancestry (or any other relatively small, heavily intermarried group) identified by the census. Native American activists favored letting people list themselves as being of more than one race because they believed that this would help identify a larger Native American population and presumably increase that group's political clout. In contrast, African American activists tended to be less enthusiastic about allowing people to identify themselves as multiracial. Based in part on the legacy of segregation, which sometimes held that having a single black ancestor was sufficient to warrant being considered nonwhite, people with mixed black and white ancestry (who account for a majority of those usually classified as African Americans) had tended to list themselves as "black." If large numbers of these individuals began listing more than one racial group, black people might risk losing political influence.

As is so often the case, attitudes toward altering the census categories depended on whether one expected to win or lose by the change. The reclassification had the expected effect, even though only 2.4 percent of respondents to the 2000 census opted to describe themselves as multiracial. The new classification boosted the numbers of people classified as Native Americans: although only 2.5 million respondents listed themselves under the traditional one-ethnicity category, adding those who identified themselves as part-Indian raised the total to 4.1 million—a 110 percent increase since 1990. However, relatively small numbers of people (fewer than eight hundred thousand) listed their race as both white and black, compared to almost thirty-four million identified as black.13
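The Native American figures above imply a 1990 baseline that the text never states directly. A minimal arithmetic check (the 1990 figure below is derived from the reported 110 percent increase, not taken from the census):

```python
# From the text: 2.5 million respondents listed only American Indian in 2000;
# counting part-Indian responses raises the total to 4.1 million,
# described as a 110 percent increase since 1990.
multirace_total_millions = 4.1

# A 110 percent increase means the 2000 total is 2.10 times the 1990 count,
# implying a 1990 figure of roughly 1.95 million.
implied_1990_millions = multirace_total_millions / 2.10

print(round(implied_1990_millions, 2))  # 1.95
```

So the reclassification alone accounts for most of the apparent doubling of the Native American population between the two censuses.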

Sometimes only certain cases go uncounted. Critics argue that the official unemployment rate, which counts only those without full-time work who have actively looked for a job during the previous four weeks, is too low. They insist that a more accurate count would include those who want to work but have given up looking as well as those who want full-time work but have had to settle for part-time jobs—two groups that, taken together, actually outnumber the officially unemployed.14 Of course, every definition draws such distinctions between what does—and doesn't—count.

The lesson is simple. Statistics depend on collecting information. If questions go unasked, or if they are asked in ways that limit responses, or if measures count some cases but exclude others, information goes ungathered, and missing numbers result. Nevertheless, choices regarding which data to collect and how to go about collecting the information are inevitable. If we want to describe America's racial composition in a way that can be understood, we need to distill incredible diversity into a few categories. The cost of classifying anything into a particular set of categories is that some information is inevitably lost: distinctions seem sharper; what may have been arbitrary cut-offs are treated as meaningful; and, in particular, we tend to lose sight of the choices and uncertainties that went into creating our categories.

In some cases, critics argue that a failure to gather information is intentional, a method of avoiding the release of damaging information. For example, it has proven very difficult to collect information about the circumstances under which police shoot civilians. We might imagine that police shootings can be divided into two categories: those that are justified by the circumstances, and those that are not. In fact, many police departments conduct reviews of shootings to designate them as justifiable or not. Yet efforts to collect national data on these findings have foundered. Not all departments share their records (which, critics say, implies that they have something to hide); and the proportion of shootings labeled "justified" varies wildly from department to department (suggesting either that police behave very differently in different departments or that the process of reviewing shootings varies a great deal).15

There are a variety of ways to ensure that things remain uncounted. The simplest is to not collect the information (for instance, don't ask census respondents any questions about religion). But, even when the data exist, it is possible to avoid compiling information (by simply not doing the calculations necessary to produce certain statistics), to refuse to publish the information, or even to block access to it.16 More subtly, both data collection and analysis can be time-consuming and expensive; in a society where researchers depend on others for funding, decisions not to fund certain research can have the effect of relegating those topics to the ranks of the uncounted.

This works both ways. Inevitably, we also hear arguments that people should stop gathering some sorts of numbers. For example, a popular guide to colleges for prospective students offers a ranking of "party schools." A Matter of Degree—a program sponsored by the American Medical Association to fight alcohol abuse on college campuses—claims that this ranking makes light of and perhaps contributes to campus drinking problems and has called for the guidebook to stop publishing the list.17 While it is probably uncommon for critics to worry that statistics might be a harmful moral influence, all sorts of data, some will contend, might be better left uncollected—and therefore missing.

The Forgotten

Another form of missing numbers is easy to overlook—these are figures, once public and even familiar, that we no longer remember or don't bother to consider. Consider the number of deaths from measles. In 1900, the death rate from measles was 13.3 per 100,000 in the population; measles ranked among the top ten diseases causing death in the United States. Over the course of a century, however, measles lost its power to kill; first more effective treatments and then vaccination eliminated measles as a major medical threat. Nor was this an exceptional case. At the beginning of the twentieth century, many of the leading causes of death were infectious diseases; influenza/pneumonia, tuberculosis, diphtheria, and typhoid/typhoid fever also ranked in the top ten.18 Most of those formerly devastating diseases have been brought under something approaching complete control in the United States through the advent of vaccinations and antibiotics. The array of medical threats has changed.

Forgotten numbers have the potential to help us put things in perspective, if only we can bring ourselves to remember them. When we lose sight of the past, we have more trouble assessing our current situation. However, people who are trying to draw attention to social problems are often reluctant to make comparisons with the past. After all, such comparisons may reveal considerable progress. During the twentieth century, for example, Americans' life expectancies increased dramatically. In 1900, a newborn male could expect to live forty-six years; a century later, male life expectancy had risen to seventy-three. The increase for females was even greater—from age forty-eight to eighty. During the same period, the proportion of Americans completing high school rose from about 6 percent to about 85 percent. Many advocates seem to fear that talking about long-term progress invites complacency about contemporary society, and they prefer to focus on short-run trends—especially if the numbers seem more compelling because they show things getting worse.19

Similarly, comparing our society to others can help us get a better sense of the size and shape of our problems. Again, in discussions of social issues, such comparisons tend to be made selectively, in ways that emphasize the magnitude of our contemporary problems. Where data suggest that the United States lags behind other nations, comparative statistics are commonplace, but we might suspect that those trying to promote social action will be less likely to present evidence showing America to advantage. (Of course, those resisting change may favor just such numbers.) Comparisons across time and space are recalled when they help advocates make their points, but otherwise they tend to be ignored, if not forgotten.

Legendary Numbers

One final category deserves mention. It does not involve potentially relevant numbers that are missing, but rather includes irrelevant or erroneous figures that somehow find their way into discussions of social issues. Recently, for example, it became fairly common for journalists to compare various risks against a peculiar standard: the number of people killed worldwide each year by falling coconuts (the annual coconut-death figure usually cited was 150). Do 150 people actually die in this way? It might seem possible—coconuts are hard and heavy, and they fall a great distance, so being bonked on the head presumably might be fatal. But who keeps track of coconut fatalities? The answer: no one. Although it turns out that the medical literature includes a few reports of injuries—not deaths—inflicted by falling coconuts, the figure of 150 deaths is the journalistic equivalent of a contemporary legend.20 It gets passed along as a "true fact," repeated as something that "everybody knows."

Other legendary statistics are attributed to presumably authoritative sources. A claim that a World Health Organization (WHO) study had determined that blondness was caused by a recessive gene and that blonds would be extinct within two hundred years was carried by a number of prominent news outlets, which presumably ran the story on the basis of one another's coverage, without bothering to check with the WHO (which denied the story).21

Legendary numbers can become surprisingly well established. Take the claim that fifty-six is the average age at which a woman becomes widowed. In spite of its obvious improbability (after all, the average male lives into his seventies, married men live longer than those who are unmarried, and husbands are only a few years older on average than their wives), this statistic has circulated for more than twenty years. It appeared in a television commercial for financial services, in materials distributed to women's studies students, and in countless newspaper and magazine articles; its origins are long lost. Perhaps it has endured because no official agency collects data on age at widowhood, making it difficult to challenge such a frequently repeated figure. Nevertheless, demographers—using complicated equations that incorporate age-specific death rates, the percentage of married people in various age cohorts, and age differences between husbands and wives—have concluded that the average age at which women become widows has, to no one's surprise, been rising steadily, from sixty-five in 1970 to about sixty-nine in 1988.22

Even figures that actually originate in scientists' statements can take on legendary qualities. In part, this reflects the difficulties of translating complex scientific ideas into what are intended to be easy-to-understand statements. For example, the widely repeated claim that individuals need to drink eight glasses of water each day had its origin in an analysis that did in fact recommend that level of water intake. But the analysis also noted that most of this water would ordinarily come from food (bread, for example, is 35 percent water, and meats and vegetables contain even higher proportions of water). However, the notion that food contained most of the water needed for good health was soon forgotten, in favor of urging people to consume the entire amount through drinking.23 Similarly, the oft-repeated statements that humans and chimpanzees have DNA that is 98 percent similar—or, variously, 98.4, 99, or 99.44 percent similar—may seem precise, but they ignore the complex assumptions involved in making such calculations and imply that this measure is more meaningful than it actually is.24

Widely circulated numbers are not necessarily valid or even meaningful. In the modern world, with ready access to the Internet and all manner of electronic databases, even figures that have been thoroughly debunked can remain in circulation; they are easy to retrieve and disseminate but almost impossible to eradicate. The problem is not one of missing numbers—in such cases, the numbers are all too present. What is absent is the sort of evidence needed to give the statistics any credibility.

The attraction of legendary numbers is that they seem to give weight or authority to a claim. It is far less convincing to argue, "That's not such an important cause of death! Why, I'll bet more people are killed each year by falling coconuts!" than to flatly compare 150 coconut deaths to whatever is at issue. Numbers are presumed to be factual; numbers imply that someone has actually counted something. Of course, if that is true, it should be possible to document the claim—which cannot be done for legendary numbers.

A related phenomenon is that some numbers, if not themselves fanciful, come to be considered more meaningful than they are. (Chapter 5 also addresses this theme.) We see this particularly in the efforts of bureaucrats to measure the unmeasurable. A school district, for example, might want to reward good teaching. But what makes a good teacher? Most of us can look back on our teachers and identify some as better than others. But what made them better? Maybe they helped us when we were having trouble, encouraged us, or set high standards. My reasons for singling out some of my teachers as especially good might be very different from the reasons you would cite. Teachers can be excellent in many ways, and there's probably no reliable method of translating degree of excellence into a number. How can we measure good teaching or artistic genius? Even baseball fans—those compulsive recordkeepers and lovers of statistics—can argue about the relative merits of different athletes, and baseball has remarkably complete records of players' performances.

But that sort of soft appeal to the immeasurability of performance is unlikely to appease politicians or an angry public demanding better schools. So educational bureaucrats—school districts and state education departments—insist on measuring "performance." In recent years, the favored measure has been students' scores on standardized tests. This is not completely unreasonable—one could argue that, overall, better teaching should lead to students learning more and, in turn, to higher test scores. But test scores are affected by many things besides teachers' performance, including students' home lives. And our own memories of our "best teachers" probably don't depend on how they shaped our performances on standardized tests.

However imperfect test scores might be as an indicator of the quality of teaching, they do offer a nice quantitative measure—this student got so many right, the students in this class scored this well, and so on. No wonder bureaucrats gravitate toward such measures—they are precise (and it is relatively inexpensive to get the information), even if it isn't clear just what they mean. The same thing happens in many settings. Universities want their professors to do high-quality research and be good teachers, but everyone recognizes that these qualities are hard to measure. Thus, there is a tremendous temptation to focus on things that are easy to count: How many books or articles has a faculty member published? (Some departments even weight articles differently, depending on some measure of each journal's influence.) Are a professor's teaching evaluation scores better than average?

The problem with such bureaucratic measures is that we lose sight of their limitations. We begin by telling ourselves that we need some way of measuring teaching quality and that this method—whatever its flaws—is better than nothing. Even if some resist adopting the measure at first, over time inertia sets in, and people come to accept its use. Before long, the measure is taken for granted, and its flaws tend to be forgotten. The criticism of being an imperfect measure can be leveled at many of the numbers discussed in the chapters that follow. If pressed, a statistic's defenders will often acknowledge that the criticism is valid, that the measure is flawed. But, they ask, what choice do we have? How else can we measure—quickly, cheaply, and more or less objectively—good teaching (or whatever else concerns us)? Isn't an imperfect statistic better than none at all? They have a point. But we should never blind ourselves to a statistic's shortcomings; once we forget a number's limitations, we give it far more power and influence than it deserves. We need to remember that a clear and direct measure would be preferable and that our imperfect measure is—once again—a type of missing number.

What's Missing?

When people use statistics, they assume—or, at least, they want their listeners to assume—that the numbers are meaningful. This means, at a minimum, that someone has actually counted something and has done the counting in a way that makes sense. Statistical information is one of the best ways we have of making sense of the world's complexities, of identifying patterns amid the confusion. But bad statistics give us bad information.

This chapter argues that some statistics are bad not so much because the information they contain is bad but because of what is missing—what has not been counted. Numbers can be missing in several senses: a powerful example can make us forget to look for statistics; things can go uncounted because they are considered difficult or impossible to count or because we decide not to count them. In other cases, we count, but something gets lost in the process: things once counted are forgotten, or we brandish numbers that lack substance.

In all of these cases, something is missing. Understanding that helps us recognize what counts as a good statistic. Good statistics are not only products of people counting; the quality of statistics also depends on people's willingness and ability to count thoughtfully and on their decisions about what, exactly, ought to be counted so that the resulting numbers will be both accurate and meaningful.

This process is never perfect. Every number has its limitations; every number is a product of choices that inevitably involve compromise. Statistics are intended to help us summarize, to get an overview of part of the world's complexity. But some information is always sacrificed in the process of choosing what will be counted and how. Something is, in short, always missing. In evaluating statistics, we should not forget what has been lost, if only because this helps us understand what we still have.