
Ares: Adaptive, Reconfigurable, Erasure coded, Atomic Storage

Published: 12 November 2022

Abstract

Emulating a shared atomic read/write storage system is a fundamental problem in distributed computing. Traditional implementations (e.g., [11]) replicated atomic objects among a set of data hosts in order to guarantee the availability and accessibility of the data despite host failures. As replication is highly storage-demanding, recent approaches suggested the use of erasure codes to offer the same fault tolerance while optimizing storage usage at the hosts. Initial works focused on a fixed set of data hosts. To guarantee longevity and scalability, a storage service should be able to dynamically mask host failures by allowing new hosts to join, and failed hosts to be removed, without service interruptions. This work presents the first erasure-code-based atomic algorithm, called Ares, which allows the set of hosts to be modified in the course of an execution. Ares is composed of three main components: (i) a reconfiguration protocol, (ii) a read/write protocol, and (iii) a set of data access primitives (DAPs). The design of Ares is modular and accommodates different erasure-code parameters on a per-configuration basis. We provide bounds on the latency of read/write operations and analyze the storage and communication costs of the Ares algorithm.
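As a rough, editor-added illustration of the replication-versus-erasure-coding tradeoff the abstract describes (not code from the paper; the function and parameter names are ours): with n-way replication every host stores a full copy of the object, while with an (n, k) MDS erasure code each of the n hosts stores a fragment of roughly 1/k of the object, and any k fragments suffice to decode.

```python
# Illustrative storage-cost comparison, assuming an (n, k) MDS erasure code
# (e.g., Reed-Solomon). Names and parameters are illustrative, not Ares's.

def replication_cost(object_size: int, n: int) -> int:
    """n full copies of the object; tolerates up to n - 1 host failures."""
    return n * object_size

def erasure_code_cost(object_size: int, n: int, k: int) -> int:
    """n fragments, each ceil(object_size / k) bytes; any k fragments
    decode the object, so up to n - k host failures are tolerated."""
    fragment_size = -(-object_size // k)  # ceiling division
    return n * fragment_size

size = 1_000_000  # a 1 MB object
print(replication_cost(size, n=5))        # 5000000 bytes stored in total
print(erasure_code_cost(size, n=5, k=3))  # 1666670 bytes stored in total
```

With n = 5 hosts, coding with k = 3 cuts total storage to about a third of 5-way replication, at the price of tolerating 2 rather than 4 host failures; this is the storage-versus-fault-tolerance dial that Ares can re-tune on a per-configuration basis.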

REFERENCES

  [1] Ansible. 2022. Retrieved 31 October 2022 from https://www.ansible.com/overview/how-ansible-works.
  [2] Emulab network testbed. 2022. Retrieved 31 October 2022 from https://www.emulab.net/.
  [3] Intel Storage Acceleration Library (Open Source Version). 2022. Retrieved 31 October 2022 from https://goo.gl/zkVl4N.
  [4] PyEClib. 2022. Retrieved 31 October 2022 from https://github.com/openstack/pyeclib.
  [5] PySyncObj. 2022. Retrieved 31 October 2022 from https://github.com/bakwc/PySyncObj.
  [6] ZeroMQ. 2022. Retrieved 31 October 2022 from https://zeromq.org.
  [7] Abebe M., Daudjee K., Glasbergen B., and Tian Y.. 2018. EC-Store: Bridging the gap between storage and latency in distributed erasure coded systems. In Proceedings of the 2018 IEEE 38th International Conference on Distributed Computing Systems. 255–266.
  [8] Aguilera Marcos Kawazoe, Keidar Idit, Malkhi Dahlia, and Shraer Alexander. 2009. Dynamic atomic storage without consensus. In Proceedings of the 28th ACM Symposium on Principles of Distributed Computing. ACM, 17–25.
  [9] Aguilera Marcos K., Keidar Idit, Malkhi Dahlia, Martin Jean-Philippe, and Shraer Alexander. 2010. Reconfiguring replicated atomic storage: A tutorial. Bulletin of the EATCS 102 (2010), 84–108.
  [10] Anta Antonio Fernández, Nicolaou Nicolas, and Popa Alexandru. 2015. Making "fast" atomic operations computationally tractable. In Proceedings of the International Conference on Principles of Distributed Systems.
  [11] Attiya H., Bar-Noy A., and Dolev D.. 1995. Sharing memory robustly in message passing systems. Journal of the ACM 42, 1 (1995), 124–142.
  [12] Burihabwa Dorian, Felber Pascal, Mercier Hugues, and Schiavoni Valerio. 2016. A performance evaluation of erasure coding libraries for cloud-based data stores. In Proceedings of Distributed Applications and Interoperable Systems. Springer, 160–173.
  [13] Cachin Christian and Tessaro Stefano. 2006. Optimal resilience for erasure-coded Byzantine distributed storage. In Proceedings of the International Conference on Dependable Systems and Networks. IEEE Computer Society, 115–124.
  [14] Cadambe V. R., Lynch N., Médard M., and Musial P.. 2014. A coded shared atomic memory algorithm for message passing architectures. In Proceedings of the 2014 IEEE 13th International Symposium on Network Computing and Applications. 253–260.
  [15] Cadambe Viveck R., Lynch Nancy A., Médard Muriel, and Musial Peter M.. 2017. A coded shared atomic memory algorithm for message passing architectures. Distributed Computing 30, 1 (2017), 49–73.
  [16] Chen Yu Lin, Mu Shuai, and Li Jinyang. 2017. Giza: Erasure coding objects across global data centers. In Proceedings of the 2017 USENIX Annual Technical Conference. 539–551.
  [17] Chockler Gregory, Gilbert Seth, Gramoli Vincent, Musial Peter M., and Shvartsman Alexander A.. 2009. Reconfigurable distributed storage for dynamic networks. Journal of Parallel and Distributed Computing 69, 1 (2009), 100–116.
  [18] Chockler Gregory and Malkhi Dahlia. 2005. Active disk paxos with infinitely many processes. Distributed Computing 18, 1 (2005), 73–84.
  [19] Dobre Dan, Karame Ghassan O., Li Wenting, Majuntke Matthias, Suri Neeraj, and Vukolić Marko. 2019. Proofs of writing for robust storage. IEEE Transactions on Parallel and Distributed Systems 30, 11 (2019), 2547–2566.
  [20] Dutta Partha, Guerraoui Rachid, and Levy Ron R.. 2008. Optimistic erasure-coded distributed storage. In DISC'08: Proceedings of the 22nd International Symposium on Distributed Computing. Springer-Verlag, 182–196.
  [21] Dutta Partha, Guerraoui Rachid, Levy Ron R., and Chakraborty Arindam. 2004. How fast can a distributed atomic read be? In Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing. 236–245.
  [22] Fan Rui and Lynch Nancy. 2003. Efficient replication of large data objects. In Distributed Algorithms, Fich Faith Ellen (Ed.), Lecture Notes in Computer Science, Vol. 2848. 75–91.
  [23] Anta Antonio Fernández, Hadjistasi Theophanis, and Nicolaou Nicolas. 2016. Computationally light "multi-speed" atomic memory. In Proceedings of the International Conference on Principles of Distributed Systems.
  [24] Gafni Eli and Malkhi Dahlia. 2015. Elastic configuration maintenance via a parsimonious speculating snapshot solution. In Proceedings of the International Symposium on Distributed Computing. Springer, 140–153.
  [25] Georgiou Chryssis, Nicolaou Nicolas C., and Shvartsman Alexander A.. 2008. On the robustness of (semi)fast quorum-based implementations of atomic shared memory. In DISC'08: Proceedings of the 22nd International Symposium on Distributed Computing. Springer-Verlag, 289–304.
  [26] Georgiou Chryssis, Nicolaou Nicolas C., and Shvartsman Alexander A.. 2009. Fault-tolerant semifast implementations of atomic read/write registers. Journal of Parallel and Distributed Computing 69, 1 (2009), 62–79.
  [27] Gilbert Seth. 2003. RAMBO II: Rapidly Reconfigurable Atomic Memory for Dynamic Networks. Master's thesis. MIT.
  [28] Gilbert S., Lynch N., and Shvartsman A. A.. 2003. RAMBO II: Rapidly reconfigurable atomic memory for dynamic networks. In Proceedings of the International Conference on Dependable Systems and Networks. 259–268.
  [29] Herlihy Maurice P. and Wing Jeannette M.. 1990. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems 12, 3 (1990), 463–492.
  [30] Huang Jianzhong, Liang Xianhai, Qin Xiao, Xie Ping, and Xie Changsheng. 2015. Scale-RS: An efficient scaling scheme for RS-coded storage clusters. IEEE Transactions on Parallel and Distributed Systems 26, 6 (2015), 1704–1717.
  [31] Huffman W. C. and Pless V.. 2003. Fundamentals of Error-Correcting Codes. Cambridge University Press.
  [32] Jehl Leander, Vitenberg Roman, and Meling Hein. 2015. SmartMerge: A new approach to reconfiguration for atomic storage. In Proceedings of the International Symposium on Distributed Computing. Springer, 154–169.
  [33] Joshi Gauri, Soljanin Emina, and Wornell Gregory. 2017. Efficient redundancy techniques for latency reduction in cloud systems. ACM Transactions on Modeling and Performance Evaluation of Computing Systems 2, 2 (2017), 12.
  [34] Konwar K. M., Prakash N., Kantor E., Lynch N., Médard M., and Schwarzmann A. A.. 2016. Storage-optimized data-atomic algorithms for handling erasures and errors in distributed storage systems. In Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium. 720–729.
  [35] Konwar Kishori M., Prakash N., Lynch Nancy, and Médard Muriel. 2016. RADON: Repairable atomic data object in networks. In Proceedings of the International Conference on Distributed Systems.
  [36] Lakshman Avinash and Malik Prashant. 2010. Cassandra: A decentralized structured storage system. ACM SIGOPS Operating Systems Review 44, 2 (2010), 35–40.
  [37] Lamport Leslie. 1998. The part-time parliament. ACM Transactions on Computer Systems 16, 2 (1998), 133–169.
  [38] Lynch N. A.. 1996. Distributed Algorithms. Morgan Kaufmann Publishers.
  [39] Lynch N. and Shvartsman A. A.. 2002. RAMBO: A reconfigurable atomic memory service for dynamic networks. In Proceedings of the 16th International Symposium on Distributed Computing. 173–190.
  [40] Lynch Nancy A. and Shvartsman Alexander A.. 1997. Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In Proceedings of the Symposium on Fault-Tolerant Computing. 272–281.
  [41] Michael Ellis, Ports Dan R. K., Sharma Naveen Kr., and Szekeres Adriana. 2017. Recovering shared objects without stable storage. In Proceedings of the 31st International Symposium on Distributed Computing, Richa Andréa W. (Ed.), Leibniz International Proceedings in Informatics, Vol. 91. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 36:1–36:16.
  [42] Nakamoto Satoshi. 2008. Bitcoin: A peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf.
  [43] Nicolaou Nicolas, Cadambe Viveck, Konwar Kishori, Prakash N., Lynch Nancy, and Médard Muriel. 2018. ARES: Adaptive, reconfigurable, erasure coded, atomic storage. arXiv:1805.03727. Retrieved from https://arxiv.org/abs/1805.03727.
  [44] Nicolaou Nicolas, Cadambe Viveck, Prakash N., Konwar Kishori, Medard Muriel, and Lynch Nancy. 2019. ARES: Adaptive, reconfigurable, erasure coded, atomic storage. In Proceedings of the 2019 IEEE 39th International Conference on Distributed Computing Systems. 2195–2205.
  [45] Ongaro Diego and Ousterhout John. 2014. In search of an understandable consensus algorithm. In Proceedings of the 2014 USENIX Annual Technical Conference. USENIX Association, 305–320.
  [46] Rashmi K. V., Chowdhury Mosharaf, Kosaian Jack, Stoica Ion, and Ramchandran Kannan. 2016. EC-Cache: Load-balanced, low-latency cluster caching with online erasure coding. In Proceedings of OSDI. 401–417.
  [47] Shraer Alexander, Martin Jean-Philippe, Malkhi Dahlia, and Keidar Idit. 2010. Data-centric reconfiguration with network-attached disks. In Proceedings of the 4th International Workshop on Large Scale Distributed Systems and Middleware. 22–26.
  [48] Spiegelman Alexander, Keidar Idit, and Malkhi Dahlia. 2017. Dynamic reconfiguration: Abstraction and optimal asynchronous solution. In Proceedings of the 31st International Symposium on Distributed Computing. 40:1–40:15.
  [49] Wang Shuang, Huang Jianzhong, Qin Xiao, Cao Qiang, and Xie Changsheng. 2017. WPS: A workload-aware placement scheme for erasure-coded in-memory stores. In Proceedings of the International Conference on Networking, Architecture, and Storage. IEEE, 1–10.
  [50] Wu Chentao and He Xubin. 2012. GSR: A global stripe-based redistribution approach to accelerate RAID-5 scaling. In Proceedings of the 2012 41st International Conference on Parallel Processing. 460–469.
  [51] Wu Chentao and He Xubin. 2013. A flexible framework to enhance RAID-6 scalability via exploiting the similarities among MDS codes. In Proceedings of the 2013 42nd International Conference on Parallel Processing. 542–551.
  [52] Xiang Yu, Lan Tian, Aggarwal Vaneet, and Chen Yih-Farn R.. 2015. Multi-tenant latency optimization in erasure-coded storage with differentiated services. In Proceedings of the 2015 IEEE 35th International Conference on Distributed Computing Systems. IEEE, 790–791.
  [53] Xiang Yu, Lan Tian, Aggarwal Vaneet, and Chen Yih-Farn R.. 2016. Joint latency and cost optimization for erasure-coded data center storage. IEEE/ACM Transactions on Networking 24, 4 (2016), 2443–2457.
  [54] Yu Yinghao, Huang Renfei, Wang Wei, Zhang Jun, and Letaief Khaled Ben. 2018. SP-Cache: Load-balanced, redundancy-free cluster caching with selective partition. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. IEEE, 1–13.
  [55] Zhang Guangyan, Li Keqin, Wang Jingzhe, and Zheng Weimin. 2015. Accelerate RDP RAID-6 scaling by reducing disk I/Os and XOR operations. IEEE Transactions on Computers 64, 1 (2015), 32–44.
  [56] Zhang Heng, Dong Mingkai, and Chen Haibo. 2016. Efficient and available in-memory KV-store with hybrid erasure coding and replication. In Proceedings of the 14th USENIX Conference on File and Storage Technologies. USENIX Association, Santa Clara, CA, 167–180.
  [57] Zhang Xiaoyang, Hu Yuchong, Lee Patrick P. C., and Zhou Pan. 2018. Toward optimal storage scaling via network coding: From theory to practice. In Proceedings of the IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. 1808–1816.
  [58] Zhou P., Huang J., Qin X., and Xie C.. 2019. PaRS: A popularity-aware redundancy scheme for in-memory stores. IEEE Transactions on Computers 68, 4 (2019), 556–569.


• Published in

  ACM Transactions on Storage, Volume 18, Issue 4 (November 2022), 279 pages
  ISSN: 1553-3077
  EISSN: 1553-3093
  DOI: 10.1145/3570642

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 12 November 2022
          • Online AM: 27 September 2022
          • Accepted: 6 January 2022
          • Revised: 19 November 2021
          • Received: 27 April 2021
Published in TOS Volume 18, Issue 4


          Qualifiers

          • research-article
          • Refereed
