DOI: 10.1145/2872362.2872397

A DNA-Based Archival Storage System

Published: 25 March 2016

    Abstract

    Demand for data storage is growing exponentially, but the capacity of existing storage media is not keeping up. Using DNA to archive data is an attractive possibility because it is extremely dense, with a raw limit of 1 exabyte/mm³ (10⁹ GB/mm³), and long-lasting, with observed half-life of over 500 years. This paper presents an architecture for a DNA-based archival storage system. It is structured as a key-value store, and leverages common biochemical techniques to provide random access. We also propose a new encoding scheme that offers controllable redundancy, trading off reliability for density. We demonstrate feasibility, random access, and robustness of the proposed encoding with wet lab experiments involving 151 kB of synthesized DNA and a 42 kB random-access subset, and simulation experiments of larger sets calibrated to the wet lab experiments. Finally, we highlight trends in biotechnology that indicate the impending practicality of DNA storage for much larger datasets.
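
The two headline figures in the abstract, the raw density limit and the half-life, can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration and is not taken from the paper. It assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes) and a simple exponential-decay model. The function names `exabytes_to_gigabytes` and `fraction_intact` are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumptions (not from the paper): decimal units and a plain
# exponential-decay model for the quoted half-life.

BYTES_PER_GB = 10**9   # decimal gigabyte (assumed)
BYTES_PER_EB = 10**18  # decimal exabyte (assumed)


def exabytes_to_gigabytes(exabytes: int) -> int:
    """Convert a whole number of exabytes to gigabytes."""
    return exabytes * BYTES_PER_EB // BYTES_PER_GB


def fraction_intact(years: float, half_life_years: float = 500.0) -> float:
    """Fraction of material remaining after `years` under exponential decay.

    The default of 500 years is the lower bound quoted in the abstract
    ("over 500 years"), so the result is a conservative estimate.
    """
    return 0.5 ** (years / half_life_years)


if __name__ == "__main__":
    print("GB per mm^3 at 1 EB/mm^3:", exabytes_to_gigabytes(1))
    for t in (0, 500, 1000):
        print(f"fraction intact after {t} years:", fraction_intact(t))
```

Under these assumptions, 1 exabyte equals 10^9 gigabytes, which matches the parenthetical in the abstract. After two half-lives (1,000 years) a quarter of the original material would remain.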


    Published In

    ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
    March 2016
    824 pages
    ISBN:9781450340915
    DOI:10.1145/2872362
    General Chair: Tom Conte
    Program Chair: Yuanyuan Zhou

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. DNA
    2. archival storage
    3. molecular computing

    Qualifiers

    • Research-article

    Conference

    ASPLOS '16

    Acceptance Rates

    ASPLOS '16 Paper Acceptance Rate 53 of 232 submissions, 23%
    Overall Acceptance Rate 535 of 2,713 submissions, 20%
