DNA is just another way we can’t opt out of data sharing

Your family tree might contain a few bad apples.

Westend61 via Getty Images

When you grow up in California, serial killers are as much a fact of life as year-round citrus or having a bit of Spanish in your daily vocabulary. News of the Golden State Killer's arrest came as a surprise and a relief to most of us whose early lives were shaped by a generation of fear. The Golden State Killer raped at least 51 women and killed 12 people (that we know of). Our parents literally slept with guns and knives under the shadow of this killer and the many others like him.

Even after childhood, when I was a homeless teen among many on the streets here in San Francisco, the man who would be known today as the Golden State Killer was a real-life boogeyman who occupied our pantheon of constant, terrifying, and very real mortal threats. We knew no one noticed when we went missing, and therefore endeavored to protect each other from men like him.

Yet, the way the Golden State Killer was found — through other people's DNA — raises a new kind of specter. One of a data dystopia where we've lost control in new ways.

The combination of DNA and family tree databases, blooming in the commercial sector, is a new weapon for law enforcement in cold cases. The Golden State Killer case is a jumping-off point; police are now using the same techniques to look for the Zodiac Killer.

Suspected Golden State Killer Joseph James DeAngelo would not have been caught recently if someone, somewhere in his family tree hadn't done DNA testing and uploaded the results to a popular genealogy search website. As the Sacramento Bee first reported upon DeAngelo's arrest, "Paul Holes, a retired investigator with the Contra Costa County District Attorney's Office, confirmed Friday that he used 'open-source' site GEDmatch to help law enforcement make the DNA connection."

We don't yet know exactly how the detectives used GEDmatch. We do know that they had at least one false start in their DNA-fueled search. This comes as little surprise to those who've encountered botched results from the same popular DNA-family tree websites whose databases end up in GEDmatch.

GEDmatch "provides DNA and genealogical analysis tools for amateur and professional researchers and genealogists." The Florida-based site is a place where genetic profiles are voluntarily shared publicly by their owners; there are currently around 650,000 members.

In order to use it, you need to have spat in a tube and sent it to a DNA testing company like Ancestry or 23andMe. "These two companies don't allow law enforcement to access their customer databases unless they get a court order," Wired reported. "Neither 23andMe nor Ancestry was approached by investigators in this case, according to spokespeople for the companies."


After you surrender your spit and get the results, you upload your raw DNA data to GEDmatch, optionally alongside a family tree in GEDCOM format (short for "genealogical data communication"). Then you use it to find a long-lost cousin or uncle-daddy. Or you might apply it the way Contra Costa County detectives did, and use it to track down a suspected criminal.

The Golden State Killer left enough DNA at his rape-and-murder crime scenes for law enforcement to wrangle a modern-day DNA profile out of it, evidently in a form they could upload to GEDmatch. After finding DeAngelo's family tree, investigators narrowed their search by factors such as age and location. "Investigators traced his family tree and surveilled him," wrote the Los Angeles Times. "They watched him discard his DNA in a public place, allowing them to obtain a sample."

The LA Times explained:

DNA can be obtained from gum, a used cup, skin cells or fallen hair. Officials have not revealed what type of DNA sample they obtained from DeAngelo or how they got it — only that it matches the genetic evidence left behind after at least three rapes in Northern California and three killings in Orange County.

Right now there's a fight over keeping certain documents associated with the case sealed. According to VC Star, "The Associated Press and other news outlets have filed a motion to unseal the information related to the April arrest." DeAngelo's public defender argued that the search and arrest warrants shouldn't be made public, with press reporting she claimed "the affidavits do not include extensive information about the use of genealogical websites used to link DeAngelo to the case through DNA." Even so, I'm in the camp that thinks there's only one way to find out.

Either way, police use of a voluntary public DNA-genealogy database to find a suspect brings us to the edge of the data void, where we find the void staring right back at us. It's one thing that companies like Cambridge Analytica have practically outstripped the NSA in Hoovering up our data without our knowledge for nefarious purposes (and profit). It's another thing entirely that apps and social media companies have been tricking users (our friends and relatives, and sometimes us) into letting them scrape our address books, or otherwise get data on us indirectly.

And that's been pretty bad, hence the feeding frenzy by Google and Facebook on data sharing, voluntary or otherwise. Like, creating entire dossiers on us in the form of shadow profiles, or collecting data on us we don't want to provide. Companies have been doing this for a while; most people are just starting to understand what it all means, and everyone is fifty shades of pissed off and feeling totally powerless about it.

Add your DNA to the list of things you can't necessarily opt out of having shared, and no one will be blamed for wanting to flip every table in the universe.

News of the Golden State Killer's capture rang multiple chimes for someone like me. The aforementioned terror of adults in my (admittedly bizarre) California childhood is one. Another is the continual fight against my personal data being essentially stolen, or at least slipping through my fingers before I have time to realize data dealers have turned my privacy into sand.

A third is privacy and accuracy in the commercial DNA testing industry. I know that if you're reading this, you probably also have questions without satisfying answers when it comes to accuracy, data sharing practices, and law enforcement access to consumer DNA testing companies. And of course, there's the prospect of others in my family tree sharing ways for strangers to find me, or stalk me, or accuse me of a crime I didn't commit.

Like most people I know, I really want to have my DNA tested. The stakes are higher for me because I don't know my family, or my family health history, and the only family member I did know went missing before I became an adult. Like most people I know, I'm really worried about the future of 23andMe's privacy promises, Ancestry's assurances about not sharing data with law enforcement, and the like. This is the age of companies being bought and sold, self-serving changes to Terms and privacy policies, databases that get exposed, servers that get hacked, and companies that do awful things with our data while telling us everything's okay.

I'm glad that Contra Costa County detectives thought of a novel way to find the Golden State Killer, and I hope authorities have the same successes with other serial killers. The crime-solving part of me finds it exciting, while the female part of me hopes it helps shift society away from lending credence to men who hate and punish women.

But I absolutely believe we are dangerously lacking in both responsible stewardship of data and a sane conversation about imbalances of power. So, like most of us, I do what I can to hope for the best and prepare for the worst.

Images: Phil Noble / Reuters (DNA sample vials); Sacramento Bee via Getty Images (Suspect in court)