Researchers from the UW’s Molecular Information Systems Lab (MISL) have created one of the first systems that use DNA molecules to store digital images, and have successfully demonstrated the ability to retrieve the encoded images intact.
UW CSE professor Luis Ceze, joint CSE and EE professor Georg Seelig, CSE affiliate faculty members Doug Carmean and Karin Strauss of Microsoft Research, CSE Ph.D. student James Bornholt, and BioE Ph.D. student Randolph Lopez are the authors of an ASPLOS paper describing the effort to advance the state of the art in digital storage. Taking their cues from nature, the researchers aim to create a system that will be able to accommodate the growing volume of data being generated around the world — predicted to reach 44 trillion gigabytes by 2020.
From the UW media release:
“The team of computer scientists and electrical engineers has detailed one of the first complete systems to encode, store and retrieve digital data using DNA molecules, which can store information millions of times more compactly than current archival technologies.
“In one experiment…the team successfully encoded digital data from four image files into the nucleotide sequences of synthetic DNA snippets.
“More significantly, they were also able to reverse that process — retrieving the correct sequences from a larger pool of DNA and reconstructing the images without losing a single byte of information.”
According to Ceze, “Life has produced this fantastic molecule called DNA that efficiently stores all kinds of information about your genes and how a living system works — it’s very, very compact and very durable…We’re essentially repurposing it to store digital data — pictures, videos, documents — in a manageable way for hundreds or thousands of years.”
The MISL team became one of only two teams nationwide to have demonstrated “random access,” that is, the ability to retrieve the correct sequences of data from a large pool of random DNA molecules. They did so by encoding the equivalent of street addresses in the DNA sequences and then employing polymerase chain reaction (PCR), a technique commonly used in molecular biology, to identify and reorder the data. They also applied error correction techniques typically used in computer memory to the DNA to address errors in the encoding process.
“This is an example where we’re borrowing something from nature — DNA — to store information. But we’re using something we know from computers — how to correct memory errors — and applying that back to nature,” Ceze said.
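For readers who think in code, the addressing and error-correction ideas described above can be sketched as a toy Python model. This is only an illustration under loud assumptions, not the team's method: the two-bits-per-base mapping, the address strings, the one-byte chunk index, and every function name are hypothetical, string prefix matching stands in for PCR, and a simple XOR parity stands in for the memory-style error correction.

```python
import random

BASES = "ACGT"  # assumption: naive mapping, two bits per nucleotide


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four nucleotides, highest two bits first."""
    return "".join(BASES[(b >> shift) & 0b11] for b in data for shift in (6, 4, 2, 0))


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna: every four nucleotides become one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def store(pool: list, address: str, data: bytes, chunk_size: int = 4) -> None:
    """Split data into chunks and tag each strand with an address and a chunk index."""
    for start in range(0, len(data), chunk_size):
        header = bytes([start // chunk_size])  # one-byte index, so at most 256 chunks
        pool.append(address + bytes_to_dna(header + data[start:start + chunk_size]))


def retrieve(pool: list, address: str) -> bytes:
    """Select strands by address (a stand-in for PCR), then reorder them by index."""
    selected = [s[len(address):] for s in pool if s.startswith(address)]
    chunks = sorted(dna_to_bytes(s) for s in selected)  # sorts on the index byte
    return b"".join(chunk[1:] for chunk in chunks)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings; used here as a toy parity check."""
    return bytes(x ^ y for x, y in zip(a, b))


# Two "files" share one unordered pool, each under its own hypothetical address.
pool = []
store(pool, "ACGTACGT", b"cat image bytes")
store(pool, "TTGACCAG", b"dog image bytes")
random.shuffle(pool)  # the pool has no inherent order

cat = retrieve(pool, "ACGTACGT")

# Toy redundancy: if one of two chunks is lost, parity plus the other recovers it.
parity = xor_bytes(b"abcd", b"wxyz")
recovered = xor_bytes(parity, b"wxyz")
```

In this sketch the address plays the role of the street address: only strands that begin with it are selected, and the index stored in each strand lets the pieces be put back in order no matter how the pool was shuffled.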
Read the full UW media release here and check out our previous blog post here. The team presented its findings at the ASPLOS 2016 conference earlier this month — read the research paper here. Check out coverage of the project by Newsweek, Gizmodo, Discover Magazine, CNET, Motherboard, Crosscut, Geekwire and the Daily Mail.
Photos: Tara Brown Photography