Unhidden Traits: Genomic Data Privacy Debates Heat Up

Earlier this year Yaniv Erlich of the Massachusetts Institute of Technology sent bioethicists into a frenzy when he and his team uncovered the names of people whose anonymous genome profiles were published by the 1000 Genomes Project. Erlich and his co-workers found the identities entirely by connecting Y-chromosome data and other information from the database with publicly available records, including genealogy databases and lists of people living in particular locales.
Their aim is to mine the genomes and medical histories of study participants to pinpoint gene variants that contribute to such diseases as cancer and diabetes. A huge data set is needed to uncover genetic links that have so far proved elusive. Only after such links are established will society enter the promised era of personalized medicine, when physicians armed with patients’ unique genetic sketches can help prevent disease and customize therapies.
At the moment, however, the alliance and other genome researchers have no clear model to follow for how best to protect the privacy of genetic donors. Some researchers have simply accepted that privacy is impractical, even impossible. That is the premise of the Personal Genome Project (PGP), the brainchild of geneticist George Church of Harvard Medical School. This project intends to recruit 100,000 people to share their genomic, medical and demographic data with the world, and even gives them the option to volunteer their names and headshots in light of Erlich’s recent work.
Church has designed this enterprise and its consent policies to cater to people who, like him, think that guaranteeing privacy in today’s digital world is unfeasible. People hack databases and genomes, and tell secrets; every lock has a key. His participants are informed explicitly about the benefits and risks of participating, going as far as being warned that malicious individuals could potentially synthesize their DNA and plant it at a crime scene. They accept the risk and upload their data online. You can take a peek right now, if you wish. “The old idea ‘to add more security’ is looking less and less viable in the face of people like Julian Assange, Aaron Swartz and Edward Snowden,” Church says, referring to contemporary leakers of secrets who have made recent headlines. Instead of using a data model that prioritizes privacy, “we flipped the whole thing on its head,” he says. “Instead of saying how can we make the data more secure, I said, ‘Let’s make it more open and find people who are okay with that.’” Church calls this “open consent.”
“There are literally millions of people who participate in medical research, and probably over a million people whose genomes have been characterized in some way or another,” where the data is not freely available precisely because of privacy concerns, says David Altshuler, deputy director of the Broad Institute of M.I.T. and Harvard. He is also a leader in the global alliance, as well as in the 1000 Genomes Project and the International HapMap Project, which look for genetic variations involved in health and disease. Those projects upload genomes online to promote open science, but they collect much less personal information than the PGP does.
The drawback of a strict emphasis on subject privacy, of course, is that researchers lose out on an enormous pool of valuable data: information that could reveal how genes and environment together determine health and disease traits. With demographic data, for example, a scientist could compare zip codes with susceptibility to lung cancer to surmise the potential role of air quality in developing the disease.