Scientists have successfully stored a computer operating system, a short movie and other data in DNA, an advance that may usher in the next generation of ultra-compact biological storage devices that could last hundreds of thousands of years. In a new study, researchers from Columbia University and the New York Genome Centre (NYGC) in the US showed that an algorithm designed for streaming video on a cellphone can unlock nearly the full storage potential of DNA by squeezing more information into its four base nucleotides. They also showed that the technology is extremely reliable. DNA is an ideal storage medium because it is ultra-compact and can last hundreds of thousands of years if kept in a cool, dry place, as demonstrated by the recent recovery of DNA from the bones of a 430,000-year-old human ancestor found in a cave in Spain.
“DNA won’t degrade over time like cassette tapes and CDs, and it won’t become obsolete – if it does, we have bigger problems,” said Yaniv Erlich from Columbia University. The researchers chose six files to encode, or write, into DNA: a full computer operating system; an 1895 French film, “Arrival of a Train at La Ciotat”; a $50 Amazon gift card; a computer virus; a Pioneer plaque; and a 1948 study by information theorist Claude Shannon. They compressed the files into a single master file, and then split the data into short strings of binary code made up of ones and zeros.
Using an erasure-correcting scheme known as fountain codes, they randomly packaged the strings into so-called droplets, and mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T. The algorithm deleted letter combinations known to create errors and added a barcode to each droplet to help reassemble the files later. The researchers showed that their coding strategy packs 215 petabytes of data into a single gram of DNA. “We believe this is the highest-density data-storage device ever created,” said Erlich. The study was published in the journal Science.
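
For readers who want to see the idea in code, the steps described above (split the data into short binary strings, combine random strings into droplets, add a barcode, map bits to bases, and discard error-prone sequences) can be sketched roughly as follows. This is a toy illustration, not the researchers' implementation. The two-bits-per-base mapping, the two-byte seed used as a barcode, the simple degree choice and the rule rejecting runs of more than three identical bases are all assumptions made for the example, and every function name is hypothetical.

```python
import random

BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Map every two bits of the input to one nucleotide letter."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_dna: four letters back into one byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for letter in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(letter)
        out.append(byte)
    return bytes(out)


def split_segments(data: bytes, size: int) -> list[bytes]:
    """Split the master file into equal-length strings, zero-padding the last."""
    padded = data + b"\x00" * (-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def pick_segments(seed: int, n: int) -> list[int]:
    """Choose which segments a droplet combines, reproducibly from its seed.

    Toy choice of 1 to 3 segments; real fountain codes draw the count
    from a carefully designed distribution.
    """
    rng = random.Random(seed)
    degree = rng.randint(1, min(3, n))
    return rng.sample(range(n), degree)


def make_droplet(seed: int, segments: list[bytes]) -> str:
    """XOR the chosen segments, prepend the seed as a barcode, map to DNA."""
    payload = bytearray(len(segments[0]))
    for idx in pick_segments(seed, len(segments)):
        for j, value in enumerate(segments[idx]):
            payload[j] ^= value
    return bytes_to_dna(seed.to_bytes(2, "big") + bytes(payload))


def is_acceptable(seq: str, max_run: int = 3) -> bool:
    """Assumed screen: reject long runs of one repeated base."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return False
    return True


def encode(data: bytes, size: int = 4, count: int = 10) -> list[str]:
    """Generate droplets, keeping only those that pass the screen."""
    segments = split_segments(data, size)
    droplets = []
    for seed in range(1 << 16):  # the barcode is two bytes in this sketch
        if len(droplets) == count:
            break
        seq = make_droplet(seed, segments)
        if is_acceptable(seq):
            droplets.append(seq)
    return droplets


if __name__ == "__main__":
    for droplet in encode(b"Hello, DNA!!"):
        print(droplet)
```

Because each droplet's barcode is the seed that selected its segments, a decoder could in principle regenerate the same selection from the barcode alone and untangle the XOR combinations once enough droplets have been read. Since more droplets can always be generated, throwing away the ones that fail the screen costs little.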