On November 22, 2013, the U.S. Food and Drug Administration sent a letter to 23andMe, a company that offers genetic testing directly to customers. The FDA asserted that the company had failed to provide evidence that its tests, which it marketed as providing predictive “health reports on 254 diseases and conditions,” were in fact accurate. The agency ordered the company to stop marketing its personal genome service. Five days later, a 23andMe customer named Lisa Casey filed a class-action lawsuit against the company, alleging that customers had been deceived by the company’s advertising about the health benefits of its service. The following week, the company announced that it would comply with the FDA’s orders, ceasing to offer health-related genetic results while continuing to offer information on ancestry and “raw genetic data without interpretation.”
The shuttering of 23andMe’s diagnostic services can probably be attributed, at least in part, to mismanagement: the company apparently ignored the government’s inquiries for six months, a move that one reporter said may be “the single dumbest regulatory strategy I have seen.” The company’s chief executive has admitted that “we failed to communicate proactively” with the FDA, calling the agency “a very important partner.” But setting aside the specific causes that precipitated the FDA’s decision to stop the company’s personal genome service, the case raises broader questions about the field of direct-to-consumer genetic testing, how we should understand the information it provides, and how it should be regulated in the future.
Can knowledge of our own genetic sequence give us accurate information about ourselves, our vulnerabilities to various diseases, our dispositions to think and to act, and a myriad of other traits? Or is the proliferation of these purveyors of genetic knowledge closer to a new pseudoscience? Is there something problematic about personalized genetic knowledge given directly to individuals, without the mediation of doctors and clinical geneticists, the usual gatekeepers of medical information?
As a population geneticist with some interest in the ethical implications of modern science, I decided to explore these issues, not only from my perspective as a scientist, but also as a participant, a customer. And so, in July 2013, in what may well have been the waning days of the Wild West of direct-to-consumer genetic testing, I joined the ranks of the “spitterati,” sending 23andMe a sample of my saliva containing my DNA.
The field of direct-to-consumer genetics began in the mid-2000s, with a handful of startups offering relatively simple genetic tests for ancestry, disease risks, and a variety of other traits. Some of the original companies in this field, like deCODE Genetics and Navigenics, have since been absorbed by larger biotech firms. Today the most significant company is certainly 23andMe. Founded in 2006, the Silicon Valley-based company has provided genetic tests for nearly half a million customers.
When 23andMe began marketing its genome test to consumers in 2007, it offered reports on just fourteen genetic traits at a cost of $999. This rather hefty price tag made the service a luxury of the wealthy and the tech-savvy. The company tried to drum up enthusiasm by hosting “spit parties,” in which guests could buy the genetic testing kits and spit into the sample-collection tubes. One of these spit parties, held in Manhattan in 2008, brought together so many wealthy and celebrity attendees that it received prominent coverage in the New York Times; the occasion is less noteworthy for being at the cutting edge of democratized science than for its combination of opulence and decadence.
Since then, and in part because the company was able to raise a great deal of capital — including from Google, to which the chief executive of 23andMe is connected through her husband, one of Google’s founders, and her sister, another Google executive — the company gradually increased the number of traits it tested from fourteen to 254, while reducing the price of its service to $99. The push to lower the price was likely encouraged by the company’s desire to build a large bank of genetic data for use by its research arm; the company’s scientists have so far published eighteen peer-reviewed papers based on both genetic and survey-response data from customers who agreed to participate. Thanks in part to the price drop, the company’s customer base grew from around 180,000 in December 2012 to around 500,000 in December 2013.
With more customers purchasing genetic tests comes the increasing likelihood that the tests will play a role in actual medical decisions. The possibility that genetic tests will affect people’s medical decisions is also made more likely by high-profile cases publicized in the media, like that of actress Angelina Jolie. After a genetic test indicated she had an 87 percent chance of developing breast cancer, Jolie elected to undergo a preventive double mastectomy, announcing her decision in an op-ed piece in the New York Times. “It is my hope,” Jolie wrote, that other women will “be able to get gene tested,” and that those tests will shape their medical choices.
But should genetic information be shared directly with consumers? Clinical geneticists, whose job it is to help patients and medical professionals understand the medical significance of genetic information, are generally wary of direct-to-consumer genetic testing services like the one offered by 23andMe. They worry that customers who receive claims about genetic risk factors — quite possibly unreliable claims, at that — might make rash decisions, especially in the absence of the kind of careful medical advice and support that clinical geneticists provide. This is also one of the FDA’s stated concerns. The agency’s November 2013 letter imagines that a false positive for cancer-related genes might “lead a patient to undergo prophylactic surgery, chemoprevention, intensive screening, or other morbidity-inducing actions, while a false negative could result in a failure to recognize an actual risk that may exist.”
This is not simply a hypothetical possibility. One case study described a woman who, upon learning from a direct-to-consumer genetic testing company that she had a mutation in a gene associated with a higher risk of Alzheimer’s disease, made plans to commit suicide at the onset of symptoms. Researcher Donna A. Messner, the study’s author, concludes that “when groups of health-related genetic tests are offered as packages by [direct-to-consumer] companies, informed consumer choice is rendered impossible.” More generally, other researchers interested in the public understanding of science have often noted that most Americans, and even many doctors, have a poor understanding of the complex relationship between genes and health, and are therefore ill-equipped to comprehend the genetic information they receive from direct-to-consumer genetic testing companies. Still, other than a handful of anecdotes collected by professional genetic counselors (who might well have an interest in preserving their gatekeeper status), there is little evidence of individuals harming themselves by acting rashly on information provided by genetic testing companies.
My own results, when they arrived, were neither as interesting as I might have hoped nor as alarming as I might have feared. In addition to alleged potential health risks, 23andMe offers information on ancestry and a variety of genetically influenced traits. I did not learn about any major genetic risk factors from the test that I wasn’t already aware of from family history. Some of the reports on non-health-related traits did not inspire a great deal of confidence. I was told that my eyes are “likely brown”; my wife swears they are hazel. I was told that I have a high probability of having straight hair, but pictures of me from the Seventies tell a different story. To be fair, the test did correctly peg my ability to detect the “asparagus metabolite” — the sulfurous smell associated with the urine of some people after they’ve eaten asparagus.
For commentators like the libertarian science journalist Ronald Bailey, it is “outrageous” that “FDA bureaucrats think that they know better than you how to handle” your genetic information. Others have also criticized the FDA for undue paternalism. In a post at The Volokh Conspiracy law blog, Duke law professor Nita Farahany criticized the FDA for “overreach,” sarcastically musing that the FDA’s action against 23andMe was justified because “maybe we just can’t handle the truth.”
But of course, whether or not the information that direct-to-consumer genetic testing companies provide is the truth is a central part of the controversy about the field. Though the recent conflict between the FDA and 23andMe raises a number of important legal questions regarding how consumer genetics will be marketed in the future, beneath the problem of how to regulate the field are questions about the accuracy and meaning of the genetic science that is supposed to give these tests legitimacy. In a recent New York Times article, Columbia University graduate student Kira Peikoff recounts how she paid for genetic tests from three companies, only to find that they gave conflicting reports for her risks of conditions like psoriasis, rheumatoid arthritis, and coronary heart disease. Conflicting results between different genetic testing companies indicate that the field lacks reliable standards for interpreting genetic information, as a damning 2010 report by the Government Accountability Office concluded, finding serious inconsistencies between the interpretations of genetic information provided by major testing companies. The forceful criticism made by anthropologist Margaret Lock in a 2005 article, where she said that using genetic testing to predict Alzheimer’s disease is “no more accurate than fortune-telling,” may indeed still be relevant today.
Evaluating the information that genomics companies provide requires a basic grasp of modern genetics and its methods, and of the uncertainty inherent in this type of data. In brief: Our hereditary characteristics are mostly encoded in the 23 pairs of chromosomes in our body’s cells. (Hence the name of the company 23andMe.) Our chromosomes are made of DNA, a long molecule shaped, famously, like a double helix and made up of individual units called nucleotides. There are four kinds of nucleotides in DNA, usually symbolized by the letters A, C, G, and T. The term “gene” generally refers to a stretch of DNA on a chromosome that serves, through the sequence of nucleotides, to provide the information for the synthesis of a large biological molecule, usually a protein (or sometimes RNA, a molecule related to DNA). The synthesis of a protein (or an RNA molecule) encoded in a gene is referred to as the “expression” of that gene. For example, the APOE gene (pronounced by saying each letter, “A-P-O-E”) encodes the cholesterol-carrying protein apolipoprotein E; some mutations of APOE are associated with Alzheimer’s disease.
Proteins in cells interact with one another in a variety of ways. Some proteins, for example, serve as transcription factors, which bind to the DNA and initiate expression of other genes. The observable traits of the organism — what biologists call phenotypes — result from a combination of gene expression and the interactions of proteins. There is a further influence from environmental factors, such as diet, smoking, and medications, which can affect the emergence of phenotypic traits, sometimes by changing the pattern of gene expression, but also by simply affecting the way the body grows or functions, as a high-fat diet affects the cardiovascular system by causing fatty plaques to develop in arteries.
Some diseases and disorders have a genetic basis: as a result of mutation, an individual expresses defective copies of a gene or genes, thereby interfering with certain biological processes and causing illness. Usually such diseases are classified as either Mendelian or complex. A Mendelian genetic disease is one that is caused by a mutation of a single gene, and is thus inherited according to the simple laws discovered through Gregor Mendel’s experiments with pea plants. Most Mendelian disease genes are recessive, meaning that one must receive the defective version of the gene from both parents in order to develop the disease. A familiar example of a Mendelian disease is sickle-cell anemia. The allele (the term for a specific version of a gene) that causes sickle-cell anemia contains a change in a single nucleotide in the DNA sequence that encodes beta globin, one of the two protein molecules that make up hemoglobin — a protein in the blood that is necessary for carrying oxygen throughout the body. If an individual receives the mutant allele from just one parent, he becomes a “carrier” of sickle-cell anemia, but because he received a normal allele from his other parent, he is able to produce enough normal hemoglobin that he will not experience symptoms of the disease. But if the individual receives the mutant allele from both parents, the result is sickle-cell anemia, a serious illness.
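The recessive pattern just described can be checked by listing the four equally likely combinations of alleles that two carrier parents can pass on. The Python sketch below does only that; the labels "A" (normal allele) and "S" (sickle-cell allele) and the function name are illustrative choices, not standard notation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(parent_1, parent_2):
    """Probability of each child genotype under simple Mendelian inheritance.

    Each parent is a two-letter genotype string and passes on one of its
    two alleles with equal probability.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent_1, parent_2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Illustrative labels: "A" = normal allele, "S" = sickle-cell allele.
carrier_children = offspring_distribution("AS", "AS")
for genotype, probability in sorted(carrier_children.items()):
    print(genotype, probability)
```

Under these assumptions, a child of two carriers has a one-in-four chance of inheriting the mutant allele from both parents, a one-in-two chance of being a carrier, and a one-in-four chance of inheriting two normal alleles.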
Complex genetic diseases include heart disease, stroke, and many forms of cancer. In these cases, there is evidence of a heritable component, but no single Mendelian gene can be identified. In addition, there is typically evidence that environmental factors play a role, often in combination with the effects of mutations in a number of different genes. So far, there is no complex disease for which all the causal factors have been determined, although in some cases certain associated genes and environmental factors have been identified. And because complex diseases involve multiple factors, genetic tests will at most only be able to provide an estimate of probability or risk. Unlike for Mendelian diseases, it does not make sense to say that there is an allele “for” breast cancer or heart disease; at best (or worst), one can say that an allele increases the risk of developing those complex diseases. The increased risk might be high — as in the case of the widely discussed gene mutations associated with breast cancer, which led Angelina Jolie to pursue her double mastectomy — but even a high risk does not imply certainty that the disease would ever happen.
Although the cost of DNA sequencing has consistently fallen for the last decade — and in particular has become markedly cheaper in the last five years as so-called “next-generation sequencing” technologies have begun to be widely adopted — the cost of sequencing an individual’s entire genome is still, as of now, beyond the reach of most consumers. So instead, the personal genomics companies often provide a service called “genotyping” rather than gene sequencing. In gene sequencing, the goal is to determine the sequence of nucleotides, sometimes even of an individual’s entire genome; in genotyping, the goal is to determine which nucleotides are present at a specific location in an individual’s genome, usually focusing on locations where there are a number of well-known variants. The genotyping technology used by 23andMe focuses on locations around the genome that are thought to be associated with a variety of complex diseases and traits. The term for these variations is single-nucleotide polymorphisms (SNPs, pronounced “snips”).
While Mendelian diseases can generally be traced back to a single mutation or SNP, complex genetic diseases cannot: there is a big difference between knowing whether a person has an A or a T at a specific place in his or her genome and showing he or she might be likely to develop Alzheimer’s disease or breast cancer. Although we can know the specific effect that a SNP will have on the amino acid sequence of a protein, it is still very difficult to know the biochemical and physiological consequences of even a small change in the structure of a protein. Most of our knowledge of the relationships between complex traits and genotypes comes not from detailed biochemical explanations but from large statistical studies that associate SNPs with putative genetic traits, such as complex diseases or even dispositions for behavior.
These genome-wide association studies, or GWA studies, are the source of many of the claims we hear in news reports about scientists having found a “gene for X.” They also serve as the basis for the disease-risk calculations used by 23andMe and other personal genomics services. A typical GWA study examines a number of genetic markers throughout the genome. Usually these are particular sites in the genome at which two different nucleotides can be found in the human population. The great majority of nucleotides in most genes are identical for every human being — and likewise for many of the genes that we share with other species — because mutations that alter the nucleotide sequence of a gene tend to be quite harmful.
To conduct a GWA study, nucleotide data is collected, sometimes including a million or more specific sites from the genomes of each individual studied. Comparisons are then made between two groups of subjects: individuals with some trait or complex disease, and a control group of those without. The researchers can then find which SNPs are statistically associated with the trait or disease.
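At a single site, the comparison between cases and controls comes down to a two-by-two table of allele counts. The Python sketch below shows one common way such a table might be tested, a chi-square test with one degree of freedom. The counts are invented for illustration, and the cutoff of 0.05 divided by one million tests is a conventional Bonferroni-style correction assumed here, not a figure taken from any study discussed in this essay.

```python
import math


def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # With one degree of freedom, the chi-square tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio


# Hypothetical counts: the variant allele makes up 30% of alleles in cases, 25% in controls.
chi2, p_value, odds = allele_association(300, 700, 250, 750)

# Assumed correction for testing about a million sites at once.
genome_wide_threshold = 0.05 / 1_000_000

print(f"chi2={chi2:.2f}  p={p_value:.4f}  odds ratio={odds:.2f}")
print("passes corrected threshold:", p_value < genome_wide_threshold)
```

The point of the second threshold is that a result which looks significant for one site in isolation can be unremarkable once a million sites have been tested at the same time.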
Although GWA studies are certainly an important tool for researching the genetic bases of diseases and traits, there are limits to what they can explain and to the reliability of their conclusions. These limitations arise from technical details that may not be well explained to consumers by personal genomics companies.
As with any study of groups of subjects, misleading results can occur by chance. Statistical methods can give us a sense of the uncertainty surrounding a set of results, but we will always be better off accepting some statistical association if it is based on large studies, and if multiple independent studies confirm it.
To its credit, 23andMe distinguishes between “preliminary research reports” and “established research reports,” using the number of studies reporting an association, along with the sample size of those studies, to specify the strength of the evidence. As an example of just how preliminary those preliminary studies can be, my report from 23andMe included information about a SNP involved in dopamine signaling. A 2007 study reported that individuals with a certain allele at this location have difficulty learning from their mistakes. But the study involved only twenty-six individuals. With that small a sample, just about any result is possible. So I was not particularly impressed that 23andMe described me as one who “effectively avoids errors.” (Nor should other customers have had much reason to worry if they received the alternate description of “much less efficient at learning to avoid errors.”)
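To get a feel for how little twenty-six subjects can pin down, consider the uncertainty around a simple proportion. The Python sketch below computes an approximate 95 percent Wilson score interval. The counts (13 of 26, and 1,300 of 2,600) are hypothetical and are not drawn from the 2007 study, which measured something more elaborate than a single proportion.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical data: the same observed proportion (50%) at two sample sizes.
small = wilson_interval(13, 26)
large = wilson_interval(1300, 2600)

print("n=26:   %.3f to %.3f" % small)
print("n=2600: %.3f to %.3f" % large)
```

With the same observed proportion, the interval for the small sample is many times wider than the interval for the sample one hundred times larger.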
An example of a report that 23andMe considers “established” involves an autoimmune disease called Limited Cutaneous Type Scleroderma, or Limited SSc. My results told me that, because I have one copy of the T allele at a single-nucleotide site in the STAT4 gene, I have a 0.08 percent chance of contracting this disease, as opposed to a 0.07 percent chance in the general population. One of the main studies supporting this association involved 896 patients with this disease and 3,113 healthy control subjects. This may seem like a large number — and it was the largest sample size of the three studies cited by 23andMe for this association — but it may be insufficient to avoid chance results in this case. In the major study supporting this association, the frequency of the T allele in healthy subjects ranged from 21 percent to 25 percent of the sampled populations; in individuals with Limited SSc, the frequency was 29.5 percent. This is a rather minor difference given that the frequency of Limited SSc is very small, less than 0.1 percent in males of European ancestry, regardless of whether they have the allele that increases the risk of the disease.
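The two risk figures in my report can be put side by side with a few lines of arithmetic. The Python sketch below uses only the 0.08 percent and 0.07 percent figures quoted above; the variable names are my own.

```python
from fractions import Fraction

risk_with_allele = Fraction(8, 10_000)  # 0.08 percent, as reported for my genotype
risk_population = Fraction(7, 10_000)   # 0.07 percent, the general-population figure

# Ratio of my reported risk to the population average.
risk_ratio = risk_with_allele / risk_population

# Difference in absolute terms, and how many people share one extra expected case.
absolute_increase = risk_with_allele - risk_population
people_per_extra_case = 1 / absolute_increase

print(f"ratio: {float(risk_ratio):.2f}")
print(f"absolute increase: {float(absolute_increase) * 100:.2f} percentage points")
print(f"one extra case per {people_per_extra_case} people")
```

A ratio of about 1.14 sounds like a 14 percent increase in risk, but in absolute terms it is a hundredth of a percentage point.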
Having a large sample size is important in GWA studies but does not guarantee a reliable result. The probabilities found by genetic studies are influenced by many complicating factors that are not present in other kinds of population surveys. Take, for instance, population substructures. Many human populations include a certain degree of genetic substructure that may not be readily apparent. Such substructure is the result of separate ancestral populations that have partially merged to form a current-day population. For example, the people of Madagascar derive from two distinct source populations: the African mainland and Southeast Asia. More subtle substructure may occur even in such an apparently homogeneous population as Americans of European ancestry, since European-Americans include a variety of incompletely admixed ethnic groups originating from different parts of Europe. Whenever there is population substructure, an association between a SNP and some disease may not actually mean that there is any real causal link between them.
Imagine that there is some SNP that occurs in one particular ethnic group at a higher frequency than in others, but has no health effects, harmful or otherwise. The ethnic group may have a higher frequency of some disease because of other factors, like diet. An association study would still find a significant association between the allele and the disease. The allele in this example serves simply as a fortuitous marker of ethnicity, and the disease is caused by cultural factors associated with the ethnic group, not by genetics. In this way, an association study can mark a completely harmless variant as associated with a disease and, by implication, a cause of it.
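This thought experiment can be made concrete with invented numbers. In the Python sketch below, two groups of a thousand people each differ both in how common a harmless allele is and in how common a disease is, but within each group the disease strikes carriers and non-carriers at exactly the same rate. All the figures and function names are hypothetical.

```python
from fractions import Fraction


def make_group(size, carrier_pct, disease_pct):
    """Counts for a group in which disease is independent of the allele.

    Returns (carrier_sick, carrier_well, noncarrier_sick, noncarrier_well).
    """
    carriers = size * carrier_pct // 100
    noncarriers = size - carriers
    carrier_sick = carriers * disease_pct // 100
    noncarrier_sick = noncarriers * disease_pct // 100
    return (carrier_sick, carriers - carrier_sick,
            noncarrier_sick, noncarriers - noncarrier_sick)


def odds_ratio(table):
    """Odds of disease among carriers divided by odds among non-carriers."""
    carrier_sick, carrier_well, noncarrier_sick, noncarrier_well = table
    return Fraction(carrier_sick * noncarrier_well, carrier_well * noncarrier_sick)


# Hypothetical groups: the allele and the disease are both more common in group A.
group_a = make_group(1000, carrier_pct=60, disease_pct=10)
group_b = make_group(1000, carrier_pct=10, disease_pct=2)
pooled = tuple(a + b for a, b in zip(group_a, group_b))

print("group A:", float(odds_ratio(group_a)))
print("group B:", float(odds_ratio(group_b)))
print("pooled: ", round(float(odds_ratio(pooled)), 2))
```

Within each group the odds ratio is exactly one, meaning no association at all. Pool the two groups, as a study that ignores substructure would, and carriers appear to have roughly twice the odds of disease.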
Though most GWA studies in the United States have been done on Americans of European ancestry, increasing numbers of studies are being done on African Americans. Even more than European Americans, African Americans are far from an ethnically homogeneous population. The populations examined by GWA studies thus have substantial genetic substructure. Although subjects are typically grouped by broad “racial” categories, substructure within those categories is generally ignored.
In some cases, one might imagine that even cultural responses toward people with certain traits might influence their behavior in ways that give rise to further associations with no biological basis. Consider, to use a rather facetious example, the “dumb blonde.” Perhaps some blondes become conditioned to behave according to stereotypes about them being scatterbrained, or perhaps they respond by being conscientiously serious and scholarly. A researcher studying a gene that in fact influences hair color might find a significant association between that gene and the behaviors that blondes exhibit in response to cultural stereotypes — especially if the researcher does not know that the gene influences hair color. Or, in a culture where men with athletic ability tend to engage in contact sports, a gene associated with athletic ability, such as one that affects endurance or muscle growth, might end up being reported to be associated with susceptibility to concussions and other common sports injuries. One can think of a million other such cross-associations. When studies rely only on statistical associations without delving into actual biological causality, it is difficult to distinguish spurious from meaningful relationships between genotypes and traits.
There are several other broad ways in which the data from GWA studies can be less straightforward than they seem. For instance, sometimes there are gene-by-gene interactions: a SNP that plays a causal role in some illness may do so only in the presence of some other SNP. Consider the many cases in which two different variants interact to affect the probability of contracting some complex disease. For example, there are two SNPs in the APOE gene that together are associated with the probability of developing Alzheimer’s disease. The genotype alleged to confer the greatest risk of Alzheimer’s disease has the nucleotide C at both of these sites. (It is designated the ε4 variant, pronounced “epsilon four” or just “ee four.”) By contrast, the most common form of the APOE gene (designated ε3) has C at one of these sites and T at the other, and is associated with a lower risk of Alzheimer’s disease, at least in populations of European ancestry. A much rarer genotype (designated ε2) has T at both sites and may be protective against Alzheimer’s disease.
Now imagine we knew only about one of these sites being associated with Alzheimer’s — the nucleotide in APOE that has C in ε3 and ε4 but T in ε2. In this case, a GWA study would likely show only that a C at that position confers an increased risk of Alzheimer’s disease in comparison to a T at the same position. Ignorance of the second SNP might lead individuals with the ε3 genotype on both chromosomes to believe that they are at increased risk for Alzheimer’s disease when in fact they are not. The APOE case is relatively well understood, but there are likely many other cases where such interactions between nucleotides are unknown. This may be particularly true when the sites are located in different genes, which may in turn be located in completely different parts of the genome. In these cases, individuals may be erroneously told that they are at increased risk for a given disease, because the available genetic information has focused on only one of the causative nucleotides.
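The APOE example can be written out as a small lookup table. In the Python sketch below, "site A" and "site B" are placeholder names of my own for the two positions described above, with site B being the one that is C in both ε3 and ε4 but T in ε2. The variants are spelled "e2", "e3", and "e4" for convenience.

```python
# Nucleotides at (site A, site B) for each APOE variant, as described in the text.
# The site names and their ordering are illustrative placeholders.
APOE_VARIANTS = {
    ("C", "C"): "e4",
    ("T", "C"): "e3",
    ("T", "T"): "e2",
}


def classify(site_a, site_b):
    """Identify the variant when both sites are known."""
    return APOE_VARIANTS.get((site_a, site_b), "unknown")


def variants_consistent_with(site_b):
    """List the variants that remain possible when only site B is known."""
    return sorted(name for (_, b), name in APOE_VARIANTS.items() if b == site_b)


print(classify("T", "C"))             # both sites known
print(variants_consistent_with("C"))  # only site B known
print(variants_consistent_with("T"))
```

With both sites in hand, the common ε3 variant is identified unambiguously. With only site B, a C cannot distinguish ε3 from the higher-risk ε4, which is exactly the confusion described above.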
The information from genetic studies is also complicated by the interactions of genes and the environment. Complex diseases are, by definition, those to which both environmental and genetic factors contribute. Exactly how genes interact with the environment in producing disease has not yet been completely unraveled for any complex disease. In some cases, environmental effects are known in a general way, such as the effects of diet on heart disease, but there are surely many gene-environment interactions that remain entirely unknown. Suppose that some allele causes a disease but only in the presence of a specific environmental factor, such as exposure to a certain toxin. If the role of the environmental factor is not known, the presence of the allele will be considered a risk factor for the disease even in individuals who are at no risk at all because they are not exposed to the environmental factor.
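A few lines of arithmetic show how a population-level figure can mislead in the toxin scenario. In the Python sketch below, every number is invented: one carrier in twenty is exposed to the toxin, exposed carriers have a one-in-five chance of disease, and unexposed carriers have none.

```python
from fractions import Fraction

# Hypothetical figures for carriers of the allele.
exposed_fraction = Fraction(1, 20)   # share of carriers exposed to the toxin
risk_if_exposed = Fraction(1, 5)     # disease risk for an exposed carrier
risk_if_unexposed = Fraction(0)      # disease risk for an unexposed carrier

# The single number a study would report if it knew nothing about exposure.
average_carrier_risk = (
    exposed_fraction * risk_if_exposed
    + (1 - exposed_fraction) * risk_if_unexposed
)

print(f"reported risk for all carriers: {float(average_carrier_risk):.0%}")
print(f"actual risk, exposed carrier:   {float(risk_if_exposed):.0%}")
print(f"actual risk, unexposed carrier: {float(risk_if_unexposed):.0%}")
```

Under these assumptions every carrier would be told the same one percent figure, which is too low for the few who are exposed and too high for everyone else.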
The information provided by 23andMe draws attention to the role of environmental, as well as genetic, factors. For example, in the case of Limited SSc, 23andMe states that the “relative contributions of genetic and non-genetic factors … are still unclear” and that “occupational exposure to chemical toxins” may “play a larger role than genetics in determining a person’s risk for the disease.” Yet in spite of such cautionary language, in reporting the results of disease-associated SNPs, the results provided to consumers by 23andMe include a column labeled “your risk.” This “risk” is based on a population-level estimate of the frequency of the disease in individuals having the same ethnic background as the consumer and the same genotype at the SNP site in question, but it does not take into account environmental differences among individuals — of which, in the case of its own customers, 23andMe has almost no knowledge.
In addition to information about disease risks and health, another aspect of the appeal of direct-to-consumer genomics is the claim that genomics can help us “find our roots.” Some companies specialize in ancestry services, while others, like 23andMe, have provided them as part of a larger package. Genetic reports about ancestry have received prominent media attention, including on reality-TV shows about genealogy. (For example, 23andMe provided genotypes for all twenty-five of the celebrities to appear on the 2012 public-television program Finding Your Roots.) Because of the FDA’s decision to halt the company’s health-related testing in late 2013, 23andMe is, as of this writing, offering new customers reports only about ancestry.
Having inaccurate information about one’s ancestry is not as serious as false information about one’s health. But there is still the potential for damaging revelations, or pseudo-revelations. Purported evidence that your ancestry was not what you believed it to be could raise suspicions that a putative parent or grandparent was not in fact a biological relative. This kind of discovery can be emotionally devastating, whether or not the information it is based on is accurate.
In the report it sent me, 23andMe provided a count of individuals in the company’s database who were designated as my “relatives.” I was told that I have 991 “DNA relatives” in the database, including one second or third cousin, 202 fourth cousins, and 788 “distant relatives.” A fourth cousin would be someone with whom I share a set of great-great-great-grandparents. Now, I would not be surprised to find that I have a lot of fourth cousins, since our last common ancestors would probably have been born around the early 1800s. After five generations, it is possible I really do have 202 of them — but it is extremely unlikely that there would be that many among 23andMe’s customers, who may number half a million but still represent a tiny fraction of the U.S. and world populations.
And what does 23andMe mean when it refers to my “distant relatives”? The company has a peculiar definition of relatedness: “If you have a large piece of identical DNA in common with someone, then you are related.” “Large” is not defined. But given an appropriate definition of “large,” I am related to every other human being, as well as to every other primate, and to every other mammal and so on; evolutionary theory also states that all life on earth shares some genealogical connection with that primordial being into which life was “originally breathed by the Creator,” as Darwin put it, though these genealogical connections are of course not easily traced. But that is not what most people mean by “relatives” in everyday speech; they mean people with whom they share a traceable genealogical connection.
And while it is true that, in general, I am more likely to share large DNA segments with genealogical relatives than with the population at large, it is by no means certain. Because of the random reassortment of chromosomal segments that occurs in sexual reproduction, it is perfectly possible someone can be my genealogical fourth cousin and yet share no segment of DNA with me any larger than I share with the average human being. In a large, entirely outbred population, fourth cousins are theoretically expected to share on average only about 0.2 percent of their genes. But real human populations are far from the idealized outbred populations of textbook theory. Historically, European populations, like those in the rest of the world, tended to be moderately inbred because most marriages took place within local communities. As a result, I would expect to share at least 0.2 percent of my genome with a substantial fraction of persons of Northern European ancestry, probably most of them. These people are not my relatives in any genealogically traceable sense, and certainly not my fourth cousins.
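The 0.2 percent figure follows from the standard textbook calculation for an outbred population. Each parent-to-child step halves the expected sharing, and cousins of a given degree are connected through each of two shared ancestors by a chain of such steps. The Python sketch below, with a function name of my own, works through it for full cousins.

```python
from fractions import Fraction


def expected_shared_fraction(cousin_degree):
    """Expected fraction of the genome shared by full n-th cousins (outbred population).

    The path between the cousins through one shared ancestor has
    2 * (cousin_degree + 1) parent-child links, each halving the expected
    sharing; there are two shared ancestors, so the result is doubled.
    """
    links = 2 * (cousin_degree + 1)
    return 2 * Fraction(1, 2) ** links


for degree in range(1, 5):
    share = expected_shared_fraction(degree)
    print(f"cousin degree {degree}: {share} = {float(share):.3%}")
```

For fourth cousins the result is 1/512, or a little under 0.2 percent. This is an average under idealized assumptions; as noted above, any particular pair of fourth cousins may share more than this or nothing at all.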
As a typical boring white guy, I was not surprised when 23andMe assured me that my ancestry was 99.8 percent European. The remaining 0.2 percent was unassigned to any known human population. (Maybe some of my distant cousins are alien life forms, or perhaps the Tylwyth Teg, as my Welsh-speaking grandmother called the fairies.)
The company also informed me that I have 2.8 percent Neanderthal ancestry, putting me in the 73rd percentile among 23andMe members of European descent. This degree of precision strikes me as suspect. Because of the degradation that DNA undergoes over long spans of time, we have only a handful of samples of genomic material in short fragments from Neanderthals dating back around 40,000 years. Though scientists from the Max Planck Institute claim to have stitched together a complete Neanderthal genome from the pieces of partial DNA, we have no knowledge of the extent of genetic variability within the Neanderthal population and no information about the genomes of the common ancestors of Neanderthals and modern humans. Thus, we have no way of knowing whether DNA segments resembling the known Neanderthal genome really reflect Neanderthal ancestry or merely our common ancestry with Neanderthals.
The supposed information on ancestry provided by 23andMe is, in general, not very informative, and sometimes it’s positively misleading. It is hard to see how much real harm could arise from my believing that a group of essentially unrelated individuals are my relatives. But personal genomics customers are paying for information that they expect to be true. That this information might be far from accurate is troubling, particularly since it is wrapped in the mantle of science, and so the average consumer, lacking the scientific training necessary to put it in its proper context, is all the more likely to simply trust it even as he or she actually misunderstands it.
A major concern of bioethicists and clinical geneticists regarding direct-to-consumer genomics has been the impact of “bad news” from genetic testing. The distinction between population averages and individual probabilities is one I can attest is often lost on undergraduate science students, and it may be difficult for customers to grasp as well. This misunderstanding may very well give rise to unnecessary anxiety.
To be sure, people are already accustomed to receiving and acting on statistical predictions in their everyday lives, from weather forecasts to lottery odds. (The popularity of lotteries and casinos goes to show that people do not always understand or act wisely on the statistics they hear.) A better comparison for the kinds of probabilities or “risks” reported in genetic studies can be found in another familiar example: What are a person’s chances of being struck by lightning? The National Weather Service estimates that, for the U.S. population, there is a one-in-500,000 chance of being struck by lightning in a given year, and a one-in-6,250 chance of being struck over an eighty-year lifespan. But the agency makes it clear that these estimates are based on the assumption that the chance of being struck by lightning is the same for everyone, though this is obviously not the case. In reality, one’s odds of being struck by lightning depend on where one lives (since lightning is more frequent in certain parts of the country) and how much time one spends outdoors. An avid golfer living in Central Florida has a much higher chance of being struck by lightning than an avid bowler living in Seattle. The oft-repeated figures are averages, which ignore such differences.
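The arithmetic linking the two lightning figures is simple enough to check. The sketch below uses only the annual odds and the eighty-year lifespan quoted above; the variable names are mine, and the calculation assumes, as the averaged estimate does, that every person faces the same independent risk each year.

```python
# Figures quoted in the text: one-in-500,000 per year, over an eighty-year lifespan.
ANNUAL_ODDS = 500_000
LIFESPAN_YEARS = 80

p_year = 1 / ANNUAL_ODDS

# Simple approximation: add up the yearly chances.
p_life_simple = p_year * LIFESPAN_YEARS

# Slightly more careful version: one minus the chance of never being struck,
# assuming each year is independent and equally risky.
p_life_exact = 1 - (1 - p_year) ** LIFESPAN_YEARS

print(f"Simple sum:    about one in {1 / p_life_simple:,.0f}")
print(f"Compounded:    about one in {1 / p_life_exact:,.0f}")
```

Both versions land at about one in 6,250, which shows that the lifetime figure is just the annual average carried forward. It contains no information about any particular person's habits or location.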
The risks reported in genetic studies raise similar issues of interpretation. The values labeled “your risk” by 23andMe are, in reality, estimated population averages, but they are treated as if they are individual probabilities. To treat these population averages as individual probabilities entails the same fallacy as assuming a single “probability of being struck by lightning” that applies to everyone.
For example, 23andMe estimates that around 12.6 percent of men of European ancestry with the ε3/ε4 genotype (that is, having the ε3 variant of the APOE gene on one chromosome and the ε4 variant in their other copy of the gene) will develop Alzheimer’s disease, which is almost double the average incidence of Alzheimer’s for all men of European ancestry. But we should remember that this does not mean that Bill Jones, who is of European ancestry and has the ε3/ε4 genotype at the APOE gene, has a 12.6 percent chance of developing Alzheimer’s disease. Statistical models that apply a constant probability to alleles associated with complex diseases are not applicable to each individual, because we know that many environmental and other genetic factors play a role in determining an individual’s risk, and these factors vary among individuals.
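A toy calculation may make the distinction concrete. In the sketch below, the only number taken from 23andMe is the 12.6 percent average; the two subgroups, their shares, and their risks are invented purely for illustration and do not describe any real population.

```python
# Hypothetical subgroups among carriers of the genotype: (share of carriers, lifetime risk).
# These shares and risks are invented for illustration; only the 12.6 percent
# average they are chosen to reproduce comes from the text.
subgroups = {
    "higher-risk environment and background": (0.20, 0.40),
    "lower-risk environment and background": (0.80, 0.0575),
}

# The population average is the share-weighted mean of the subgroup risks.
population_average = sum(share * risk for share, risk in subgroups.values())

print(f"Population average: {population_average:.1%}")
for name, (share, risk) in subgroups.items():
    print(f"  {name}: {share:.0%} of carriers, individual risk {risk:.1%}")
```

In this made-up population the reported average is 12.6 percent, yet no one in it actually has a 12.6 percent risk. Bill Jones could belong to either group, and the average alone cannot tell him which.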
Even if we ignore the differences among individuals and take 23andMe’s estimates of disease “risk” at face value, the news that one possesses a genotype associated with a complex disease is no cause for panic. By the very nature of complex diseases, predisposing alleles in themselves do not generally confer a high risk of disease. In the case of Alzheimer’s, the probability of developing the disease remains relatively low even for those with the disease-associated APOE genotype. If 12.6 percent of men of European ancestry with the disease-associated genotype will develop the disease, that means that 87.4 percent will not. In the case of Limited SSc, the probability that someone who, like me, has the disease-associated allele will actually develop the disease is still less than one in a thousand. Thus, even if we ignore the complexities introduced by environmental factors and other genes, the presence of an allegedly disease-associated genotype does not confer anything close to certainty regarding one’s future health.
By and large, 23andMe handles the potential for bad news reasonably well, with a website that implements various safeguards designed to minimize its impact. These include locking the results for alleles associated with diseases such as breast cancer and Alzheimer’s so that the consumer cannot access the results without first reading a warning text and checking a box indicating that they have done so. This process ensures at the very least a degree of psychological readiness when examining the results, so consumers are not caught off guard. In addition, the pages on the 23andMe website discussing the results relevant to these diseases include text explaining that the causative factors are not well understood and that the alleles identified are not the only possible factors. If read carefully, these texts serve as a caution against regarding the stated probability values as an indication of one’s own individual chance of disease. Consumers are also encouraged to discuss their results with a physician or a genetic counselor. According to one recent study, 28 percent of customers of direct-to-consumer genetic testing companies actually do go on to consult with health care professionals about the results of their genetic tests, though most customers did not make any health-related decisions on the basis of the information they received.
The molecular techniques by which hundreds of alleles in our genomes can be accurately typed are truly amazing; such a degree of detailed genomic knowledge was unimaginable as recently as when I was in graduate school in the Seventies. Yet we really do not yet know how to take advantage of all this information. Still, the clever people at genetic testing companies like 23andMe make it seem as if we can use our newfound knowledge to answer age-old questions about ourselves. In the case of complex diseases and behavioral traits, many of the associations reported in genetic tests are probably fortuitous and reflect no genuine causal relationship. Though some relationships between genes and traits may in fact be causal, causation has never been definitively proven in the case of any complex trait, even for those studied in what 23andMe describes as “established research.”
It is too soon to tell what will come of the recent regulatory and legal actions against 23andMe. Whether or not the company succeeds in obtaining FDA approval for its personal genome service, the future of direct-to-consumer genetic testing may be based not on genotyping but instead on the more extensive information provided in complete sequences of the human genome. Other companies are already moving into that field, and while prices are now still significantly higher than the $99 that 23andMe charges, they are likely to drop.
More extensive genetic data may help improve the accuracy of direct-to-consumer testing services, but the basic limitations of our understanding of the relationship between genes and complex traits remain. The fundamentally limited and sometimes misleading nature of the information provided by personal genomics raises ethical questions that go beyond the mere problem of consumer overreaction to bad news. Direct-to-consumer genomics might be applauded for helping to increase the public’s understanding of modern genetics, but the field may also contribute to some of the common misunderstandings of the causal role played by genes. Since these companies rely so heavily on the conclusions of GWA studies, which they tend to present as showing causal relationships between genetic mutations and traits, they promote an overly simplistic understanding of how genes really operate.
Genetic determinism is the idea that all or most traits are determined by genes, or that the differences between us are simply caused by differences in our genes. This is not the position of any serious geneticist or biologist — they all understand that genes interact in complex ways with the environment to produce traits — but it is an idea that has a lasting appeal among non-scientists, and is also implicitly or explicitly found in the work of many of today’s advocates of scientism, from IQ theorists to evolutionary psychologists.
By extending the purported domain of genetic influence to encompass such traits as smoking behavior, caffeine consumption, food preference, eating behavior, measures of intelligence, memory, and pain sensitivity, genetics companies like 23andMe threaten to reinforce the idea of genetic determinism in their customers. Especially for people unfamiliar with the science, this could lead to the belief that our entire lives are determined by our genetic inheritance — a crude but modern scientific form of fatalism that will not enhance but degrade our self-understanding. Fatalistic doctrines can only undermine individual initiative, making people more apathetic, more easily dominated by tyrants or manipulated by technocrats. It was in urging resistance to tyranny that Shakespeare’s Cassius said, “The fault, dear Brutus, is not in our stars, / But in ourselves, that we are underlings.” Today we look falsely for the fault in our genes; but while they are far more than the stars a part of who we are, they no more diminish our nature as free beings, responsible for ourselves, our fates ultimately unwritten.