Abstract
Demand for data storage is growing exponentially, but the capacity of existing storage media is not keeping up. Using DNA to archive data is an attractive possibility because it is extremely dense, with a raw limit of 1 exabyte/mm^3 (10^9 GB/mm^3), and long-lasting, with observed half-life of over 500 years. This paper presents an architecture for a DNA-based archival storage system. It is structured as a key-value store, and leverages common biochemical techniques to provide random access. We also propose a new encoding scheme that offers controllable redundancy, trading off reliability for density. We demonstrate feasibility, random access, and robustness of the proposed encoding with wet lab experiments involving 151 kB of synthesized DNA and a 42 kB random-access subset, and simulation experiments of larger sets calibrated to the wet lab experiments. Finally, we highlight trends in biotechnology that indicate the impending practicality of DNA storage for much larger datasets.
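The two headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal (SI) byte units and a simple exponential-decay model with exactly the 500-year half-life quoted as a lower bound. The function names are hypothetical and do not come from the paper.

```python
# Illustrative arithmetic only; not the paper's model or code.
# Assumption: decimal (SI) units, so 1 GB = 10^9 bytes and 1 EB = 10^18 bytes.
GB = 10**9
EB = 10**18


def exabytes_to_gigabytes(eb: float) -> float:
    """Convert exabytes to gigabytes using SI prefixes."""
    return eb * EB / GB


def surviving_fraction(years: float, half_life_years: float = 500.0) -> float:
    """Fraction of material still intact after `years`.

    Assumption: plain exponential decay with a fixed half-life. The abstract
    only states an observed half-life of "over 500 years", so 500 is a
    conservative placeholder.
    """
    return 0.5 ** (years / half_life_years)


if __name__ == "__main__":
    # 1 EB/mm^3 expressed in GB/mm^3.
    print(f"1 EB/mm^3 = {exabytes_to_gigabytes(1):.0e} GB/mm^3")
    # Surviving fraction at a few archival time scales.
    for years in (100, 500, 1000, 2000):
        print(f"after {years:>4} years: {surviving_fraction(years):.4f} intact")
```

Under these assumptions, half of the material remains after one half-life and a quarter after two, which is why a redundancy scheme matters for any archival system built on a decaying medium.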
A DNA-Based Archival Storage System
ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems