UW and Microsoft researchers set new record in DNA data storage

CSE professor Luis Ceze and research scientist Lee Organick in the Molecular Information Systems Lab (Credit: Tara Brown Photography)

Researchers in the Molecular Information Systems Lab housed at the University of Washington have achieved a new milestone in their quest to develop the next generation of data storage by encoding a world record-setting 200 megabytes of data in strands of DNA. A team that includes UW CSE professor Luis Ceze and Microsoft researcher (and UW CSE affiliate professor) Karin Strauss announced today that it has successfully stored and retrieved an impressive list of multimedia and literary works, including a high-definition music video by the band OK Go, the complete Universal Declaration of Human Rights in over 100 languages, the novel War and Peace, and more—all contained in a space smaller than the tip of a pencil.

The Microsoft Next blog has the full story. Here’s an excerpt:

“Demand for data storage is growing exponentially, and the capacity of existing storage media is not keeping pace. That’s making it hard for organizations that need to store a lot of data – such as hospitals with vast databases of patient data or companies with lots of video footage – to keep up. And it means information is being lost, and the problem will only worsen without a new solution.

“DNA could be the answer.

“It has several advantages as a storage medium. It’s compact, durable – capable of lasting for a very long time if kept in good conditions (DNA from woolly mammoths was recovered several thousand years after they went extinct, for instance) – and will always be current, the researchers believe.

“‘As long as there is DNA-based life on the planet, we’ll be interested in reading it,’ said Karin Strauss, the principal Microsoft researcher on the project. ‘So it’s eternally relevant.’”

In a Q&A on UW Today, Ceze explained how the team combined concepts from molecular biology and computer science to achieve its latest milestone. He described how he and his colleagues use polymerase chain reactions—a technique commonly employed by microbiologists to amplify specific segments of DNA for research—to selectively access only the data they want to read. He also noted that, despite its reliability, “DNA writing and reading have errors, just like hard drives and electronic memories have errors, so we needed to develop error-correcting codes to reliably retrieve data.”
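
To make the two ideas Ceze describes more concrete, here is a toy Python sketch. It is not the team's actual encoding scheme. It assumes a simple mapping of two bits per DNA base, uses made-up primer sequences as per-file address tags to stand in for selective access via polymerase chain reaction, and uses a majority vote across redundant copies as a stand-in for the much more sophisticated error-correcting codes the researchers developed. All function names and sequences are hypothetical.

```python
from collections import Counter

BASES = "ACGT"  # assumed toy mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(BASES[(b >> shift) & 0b11] for b in data for shift in (6, 4, 2, 0))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand whose length is a multiple of four back into bytes."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def select_by_primer(pool, primer):
    """Keep only strands tagged with the given primer and strip the tag.

    This loosely imitates selective access: only matching strands are read.
    """
    return [s[len(primer):] for s in pool if s.startswith(primer)]


def majority_vote(reads):
    """Correct isolated substitution errors by voting position by position."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Hypothetical primers acting as addresses for two stored "files".
PRIMER_A, PRIMER_B = "ACGTAC", "TGCATG"

payload_a = bytes_to_dna(b"hi")
payload_b = bytes_to_dna(b"yo")

# Store three redundant copies of each file in one mixed pool.
pool = [PRIMER_A + payload_a] * 3 + [PRIMER_B + payload_b] * 3

# Simulate a single read/write error in one copy of file A.
wrong_base = "A" if payload_a[0] != "A" else "C"
pool[0] = PRIMER_A + wrong_base + payload_a[1:]

reads = select_by_primer(pool, PRIMER_A)
recovered = dna_to_bytes(majority_vote(reads))
print(recovered)
```

In this sketch the corrupted copy is outvoted by the two intact ones, so the original bytes are recovered. Real systems cannot rely on simple repetition alone, since actual DNA synthesis and sequencing also introduce insertions and deletions, which a position-by-position vote does not handle.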

Read the Microsoft Next blog post here and the UW Today Q&A with Ceze here, and watch a pair of videos (short version here, extended version here) produced by Microsoft that feature MISL team members and Microsoft leaders talking about the opportunity presented by DNA data storage.

Check out the latest coverage of the team’s record-setting achievement in the Seattle Times, GeekWire, The Verge, Mashable, U.S. News & World Report, CIO Today, Business Insider, and MIT Technology Review.

Earlier this year, Ceze and Strauss co-authored a paper on their efforts to develop a DNA-based storage system with CSE Ph.D. student James Bornholt, Bioengineering Ph.D. student Randolph Lopez, Microsoft researcher and CSE affiliate professor Douglas Carmean, and CSE and Electrical Engineering professor Georg Seelig. For more on this groundbreaking project, read the April 2016 UW News release here.