Over the last decade, DNA sequencing technologies have become more accessible, and a number of companies offering to unlock the secrets of our genome have become increasingly popular. These companies market themselves as tools for learning about ancestry and potentially connecting with distant relatives by analyzing their customers’ genomes.
DNA extracted from saliva samples sent in by patrons is sequenced, and loci that are highly variable among different ethnic groups are compared to corresponding loci in existing libraries. The largest of these companies, Ancestry.com and 23&Me, have user libraries containing the genomes of over 15 million individuals (1). The software then determines the population to which the customer’s input most likely corresponds and compiles a genealogy report (2). For example, if 35% of the analyzed loci from a customer’s genome correspond to the standing library of Sub-Saharan African sequences, the report will tell the customer that they are 35% Sub-Saharan African. Of course, a person’s genome, and even more so a library of several million people’s genomes, contains much more interesting information than an elaborate family tree.
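To make the percentage calculation described above concrete, here is a minimal sketch in Python. It is a deliberately simplified toy, not the method any of these companies actually uses: the function name `ancestry_proportions`, the locus and population labels, and the allele frequencies are all hypothetical. Each locus is assigned to the reference population in which the customer’s allele is most common, and the report is the share of loci assigned to each population.

```python
from collections import Counter


def ancestry_proportions(customer, reference):
    """Return the fraction of loci assigned to each reference population.

    customer:  dict mapping locus name -> observed allele (hypothetical format)
    reference: dict mapping population -> locus -> allele -> frequency
    """
    counts = Counter()
    for locus, allele in customer.items():
        # Assign the locus to the population where this allele is most frequent.
        # Ties go to whichever population comes first in `reference`.
        best = max(
            reference,
            key=lambda pop: reference[pop].get(locus, {}).get(allele, 0.0),
        )
        counts[best] += 1

    total = sum(counts.values())
    if total == 0:
        # No loci analyzed: report zero for every population.
        return {pop: 0.0 for pop in reference}
    return {pop: counts[pop] / total for pop in reference}


# Toy data, invented purely for illustration.
toy_reference = {
    "population_a": {
        "locus_1": {"A": 0.9, "G": 0.1},
        "locus_2": {"T": 0.1, "C": 0.9},
        "locus_3": {"C": 0.3, "T": 0.7},
        "locus_4": {"G": 0.05, "A": 0.95},
    },
    "population_b": {
        "locus_1": {"A": 0.2, "G": 0.8},
        "locus_2": {"T": 0.7, "C": 0.3},
        "locus_3": {"C": 0.6, "T": 0.4},
        "locus_4": {"G": 0.5, "A": 0.5},
    },
}
toy_customer = {"locus_1": "A", "locus_2": "T", "locus_3": "C", "locus_4": "G"}

print(ancestry_proportions(toy_customer, toy_reference))
```

In this toy case one of the four loci is assigned to `population_a` and three to `population_b`, so the report would read 25% and 75%. Real ancestry estimates rely on far more markers and more careful statistical models than this per-locus vote.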
Genome analysis can also detect sequence variants that are associated with genetic disorders and provide other health-related information. 23&Me offers this service in one of its packages, which includes analysis of one’s predisposition to diseases in addition to ethnic background. Using information about its customers’ health also allows for discovery of new disease-associated variants; data from 23&Me’s library has been mined in over 100 publications since the company’s inception in 2006 (3). Furthermore, 23&Me recently announced a partnership with pharmaceutical giant GlaxoSmithKline (GSK). It is unclear what this collaboration may yield, but it seems that GSK will use disease-associated variants discovered by 23&Me to develop new therapies. GSK may also be interested in contacting 23&Me customers for later clinical trials in an attempt to expedite the normally slow development process for therapeutics (3).
Genealogy services have also made headlines recently for helping to close homicide investigations, the most notable of which is the case of the Golden State Killer. To identify the killer, police uploaded DNA gathered from multiple crime scenes to the open-source library of GEDmatch, another genealogy company. They found that the crime scene DNA partially matched the DNA of relatives of suspect Joseph James DeAngelo who had used GEDmatch’s services in the past. These partial matches eventually led police to DeAngelo, who was apprehended in 2018 after confirmation that DNA he left behind at a restaurant matched the crime scene isolates (4, 5).
Example of a DNA Relatives Map generated by 23&Me. This map shows an individual the locations of all 23&Me members whose submitted DNA closely matches their own.
The success of this methodology has led to a partnership between the FBI and the genealogy company FamilyTreeDNA (5). This collaboration will grant the FBI access to nearly 2 million genetic profiles, effectively doubling the amount of genomic data it previously had access to through open-source libraries (5). Allegedly, the FBI will not have totally unrestricted access to FamilyTreeDNA’s archives, but will be able to upload crime-scene samples to the company’s database and search for matches. This has engendered apprehension among the company’s customers, as there is now a possibility that some of their relatives may be caught up in a criminal investigation. In fact, FamilyTreeDNA has been stricken from a list of genealogy companies adhering to a set of voluntary privacy guidelines maintained by the Future of Privacy Forum, whose president called the deal “deeply flawed” and “out of line with industry best practices and with consumer expectations” (5). Obviously, the idea of sharing such intimate information stirs up controversy, and 23&Me’s collaboration with GSK is no exception.
The scientific community has a unique understanding of the potential applications of the databases that these companies have amassed. The majority of scientists would agree that generating a robust genomic database containing information from a representative sampling of the population is invaluable for biomedical research efforts. However, the marketing of this technology as a consumer product is a different story, as there are currently no robust regulatory measures to protect consumers from having their information shared with various third parties. As scientists, it is our responsibility to ensure that our friends outside of the lab have an appreciation for the power of the information they are handing over to these corporations. As such, we must help ensure that policy keeps pace with technology, and advocate for legislation protecting consumers from the appropriation of their genomic material and guaranteeing fair compensation should they consent to the use of their personal data for research purposes. These companies could potentially be instrumental to the general progress of biomedical research, but we will have to do our part to advocate for their fair and ethical handling of their customers’ genomic data.