We have previously covered the potential and the development of DNA data storage here on HEXUS. Today we have learned of a significant advance that brings DNA data storage closer to realising its full potential in terms of capacity and durability. However, drawbacks remain, such as the high cost and relatively slow speeds of writing and reading DNA data.
Science Magazine reports that researchers based at Columbia University and the New York Genome Center have created a new DNA data storage algorithm which they call the 'DNA Fountain'. In tests the new algorithm enabled encoding of data into DNA strands at a density of 215 petabytes (215 million gigabytes) per gram of DNA. This density reaches about 85 per cent of the theoretical maximum data density of storage on DNA, easily beating rival approaches such as a 2012 Harvard University study which packed 'only' 1.28 petabytes per gram of DNA.
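To put those figures side by side, here is a small back-of-the-envelope calculation. It uses only the two densities quoted above and assumes decimal units (1 petabyte = 1,000,000 gigabytes); the variable and function names are our own.

```python
# Densities as quoted in the article, in petabytes per gram of DNA.
DNA_FOUNTAIN_PB_PER_GRAM = 215
HARVARD_2012_PB_PER_GRAM = 1.28

GB_PER_PB = 1_000_000  # assumption: decimal (SI) units


def pb_to_gb(petabytes: float) -> float:
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GB_PER_PB


# How many times denser the new result is than the 2012 figure.
improvement = DNA_FOUNTAIN_PB_PER_GRAM / HARVARD_2012_PB_PER_GRAM

print(f"{pb_to_gb(DNA_FOUNTAIN_PB_PER_GRAM):,.0f} GB per gram")
print(f"roughly {improvement:.0f}x the 2012 density")
```

On these numbers the new approach stores roughly 168 times as much data per gram as the 2012 study.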
Looking at the need for such massive data storage capacity, Science Mag says that humans have generated as much data in the last two years as was created in the whole of our preceding history. It notes that using the new 'DNA Fountain' technology all of humanity's data could be stored on a medium weighing and measuring about the same as two pickup trucks.
To test the new 'DNA Fountain' algorithm and technology the scientists encoded six files, including a full computer operating system, a computer virus, an 1895 French film called 'Arrival of a Train at La Ciotat', and a 1948 study by information theorist Claude Shannon. The files were compressed into one binary file and then split into short code strings, says Science Mag. Then DNA Fountain came into play to "randomly package the strings into so-called droplets, to which they added extra tags to help reassemble them in the proper order later".
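For readers who want a feel for how 'droplets' with 'tags' can work, below is a toy fountain-code sketch. It is not the researchers' implementation. The segment size, the degree choices, the two-byte seed used as the tag, and the two-bits-per-nucleotide mapping are all our own assumptions for illustration. Each droplet is the XOR of a random handful of segments, and the tag is the seed that lets the decoder work out which segments were combined.

```python
import random

BASES = "ACGT"      # assumed mapping: 00->A, 01->C, 10->G, 11->T
SEGMENT_SIZE = 4    # bytes per segment, an arbitrary illustrative value


def to_dna(data: bytes) -> str:
    """Map every 2 bits to one nucleotide, most significant bits first."""
    return "".join(BASES[(b >> s) & 3] for b in data for s in (6, 4, 2, 0))


def from_dna(strand: str) -> bytes:
    """Inverse of to_dna: pack every 4 nucleotides back into one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def split_segments(data: bytes) -> list[bytes]:
    """Zero-pad the data and cut it into fixed-size segments."""
    padded = data + b"\x00" * (-len(data) % SEGMENT_SIZE)
    return [padded[i:i + SEGMENT_SIZE] for i in range(0, len(padded), SEGMENT_SIZE)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def choose_indices(seed: int, n_segments: int) -> set[int]:
    """Derive the segment subset from the seed (toy degree distribution)."""
    rng = random.Random(seed)
    degree = min(rng.choice([1, 1, 2, 2, 3, 4]), n_segments)
    return set(rng.sample(range(n_segments), degree))


def make_strand(seed: int, segments: list[bytes]) -> str:
    """Build one droplet and write it as DNA: 2-byte seed tag, then payload."""
    payload = bytes(SEGMENT_SIZE)
    for i in choose_indices(seed, len(segments)):
        payload = xor_bytes(payload, segments[i])
    return to_dna(seed.to_bytes(2, "big") + payload)


def decode(strands: list[str], n_segments: int):
    """Peeling decoder: returns the segments, or None if more droplets are needed."""
    pending = []
    for strand in strands:
        raw = from_dna(strand)
        seed = int.from_bytes(raw[:2], "big")
        pending.append((choose_indices(seed, n_segments), raw[2:]))
    recovered = {}
    progress = True
    while progress and len(recovered) < n_segments:
        progress = False
        remaining = []
        for indices, payload in pending:
            # Strip out segments we already know.
            for i in [i for i in indices if i in recovered]:
                payload = xor_bytes(payload, recovered[i])
                indices = indices - {i}
            if len(indices) == 1:
                (i,) = indices
                recovered[i] = payload
                progress = True
            elif indices:
                remaining.append((indices, payload))
        pending = remaining
    if len(recovered) < n_segments:
        return None
    return [recovered[i] for i in range(n_segments)]


def round_trip(data: bytes) -> bytes:
    """Keep producing droplets until the decoder can rebuild the data."""
    segments = split_segments(data)
    strands = []
    for seed in range(65536):
        strands.append(make_strand(seed, segments))
        decoded = decode(strands, len(segments))
        if decoded is not None:
            return b"".join(decoded)[:len(data)]
    raise RuntimeError("ran out of seeds before decoding succeeded")


print(round_trip(b"HEXUS DNA Fountain"))
```

The point of the exercise is that no single droplet is essential: the decoder simply keeps collecting strands until it has enough to peel every segment back out. The sketch leaves out error correction and any constraints on which DNA sequences are practical to synthesize.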
Twist Bioscience, a well-known leader in synthesized DNA, produced the DNA strands from the information provided by the scientists. A fortnight later the Columbia University and New York Genome Center scientists, Yaniv Erlich and Dina Zielinski, sequenced the DNA and decoded the data, translating the genetic code back into binary. The data recovery was error free, and a virtually unlimited number of copies of their DNA files could be made using the polymerase chain reaction, which is "a standard DNA copying technique".
The story of this data storage density achievement is tempered by the current costs of DNA data storage and retrieval. Apparently synthesizing 2 megabytes of data into DNA cost $7,000, and reading the data back cost a further $2,000. Another drawback right now is the relatively slow speed of DNA data writing and retrieval.
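To see what those figures mean per megabyte, here is a quick calculation using only the numbers quoted above. The per-gigabyte figure at the end assumes costs scale linearly with data size and that 1 gigabyte is 1,000 megabytes, which is our own simplification rather than anything the researchers claim.

```python
# Figures as quoted in the article.
DATA_MB = 2
SYNTHESIS_COST_USD = 7_000
READ_COST_USD = 2_000

write_per_mb = SYNTHESIS_COST_USD / DATA_MB
read_per_mb = READ_COST_USD / DATA_MB
total_per_mb = write_per_mb + read_per_mb

# Assumption: linear scaling, 1 GB = 1,000 MB. Real costs may not scale this way.
total_per_gb = total_per_mb * 1_000

print(f"write: ${write_per_mb:,.0f}/MB, read: ${read_per_mb:,.0f}/MB")
print(f"combined: ${total_per_mb:,.0f}/MB, about ${total_per_gb:,.0f}/GB if linear")
```

That works out to $3,500 per megabyte to write and $1,000 per megabyte to read at the prices reported for this experiment.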