research-article

How to Enable Index Scheme for Reducing the Writing Cost of DNA Storage on Insertion and Deletion

Published: 28 May 2022

Abstract

The demand for digital data storage is growing rapidly, and conventional storage media cannot keep pace with it. Advances in biotechnology have recently made it possible to store digital data in deoxyribonucleic acid (DNA). Because of its attractive properties (e.g., high storage density, long-term durability, and stability), DNA is regarded as a potential alternative medium for storing massive amounts of digital data in the future. Nevertheless, reading and writing digital data in DNA require a series of extremely time-consuming processes (i.e., DNA sequencing and DNA synthesis, respectively). Of the two, writing is the predominant cost of a DNA data storage system. Therefore, to enable efficient DNA storage, this article proposes an index management scheme that reduces the number of accesses to DNA storage. It also introduces a new DNA data encoding format with VERA (Version Editing Recovery Approach) that reduces the total number of bits written when data are inserted or deleted. To the best of our knowledge, this is the first work to provide a complete data management solution for DNA storage. According to the experimental results, the proposed design with VERA reduces the cost by 77% and improves the performance by 71% compared with append-only methods.


Published in

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 3
May 2022
365 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3530307
Editor: Tulika Mitra

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 28 May 2022
• Online AM: 2 March 2022
• Accepted: 1 January 2022
• Received: 1 October 2021


Qualifiers

• research-article
• Refereed