Abstract
Emulating a shared atomic read/write storage system is a fundamental problem in distributed computing. Traditional implementations (e.g., [11]) replicated atomic objects across a set of data hosts to guarantee the availability and accessibility of the data despite host failures. Because replication is highly storage demanding, more recent approaches use erasure codes to offer the same fault tolerance while optimizing storage usage at the hosts. Initial works focused on a fixed set of data hosts. To guarantee longevity and scalability, a storage service should be able to dynamically mask host failures by allowing new hosts to join and failed hosts to be removed without service interruptions. This work presents the first erasure-code-based atomic algorithm, called Ares, that allows the set of hosts to be modified in the course of an execution. Ares is composed of three main components: (i) a reconfiguration protocol, (ii) a read/write protocol, and (iii) a set of data access primitives (DAPs). The design of Ares is modular and accommodates the use of different erasure-code parameters on a per-configuration basis. We provide bounds on the latency of read/write operations and analyze the storage and communication costs of the Ares algorithm.
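The storage argument for erasure codes over replication can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the Ares algorithm: it assumes an [n, k] MDS code in which each of n hosts stores one coded element of size 1/k of the object, and compares it with full replication on n hosts. The function names are hypothetical, and the erasure count is the raw decoding limit of the code, not the crash tolerance of any particular atomic storage protocol.

```python
from fractions import Fraction


def replication_cost(n: int) -> Fraction:
    """Total storage, in units of the object size, when each of n hosts keeps a full copy."""
    if n < 1:
        raise ValueError("need n >= 1")
    return Fraction(n)


def mds_cost(n: int, k: int) -> Fraction:
    """Total storage for an [n, k] MDS code: n coded elements, each 1/k of the object."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return Fraction(n, k)


def mds_erasures_tolerated(n: int, k: int) -> int:
    """Coded elements that may be lost while any k survivors still decode the object."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return n - k


if __name__ == "__main__":
    n, k = 5, 3  # assumed example parameters
    print("replication:", replication_cost(n), "x object size")
    print("[5, 3] MDS code:", mds_cost(n, k), "x object size")
    print("erasures tolerated by the code:", mds_erasures_tolerated(n, k))
```

With k = 1 the code degenerates to replication, which is one way to see why choosing k per configuration trades storage against redundancy.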
References
- [1] Ansible. 2022. Retrieved 31 October 2022 from https://www.ansible.com/overview/how-ansible-works.
- [2] Emulab network testbed. 2022. Retrieved 31 October 2022 from https://www.emulab.net/.
- [3] Intel Storage Acceleration Library (Open Source Version). 2022. Retrieved 31 October 2022 from https://goo.gl/zkVl4N.
- [4] PyEClib. 2022. Retrieved 31 October 2022 from https://github.com/openstack/pyeclib.
- [5] PySyncObj. 2022. Retrieved 31 October 2022 from https://github.com/bakwc/PySyncObj.
- [6] ZeroMQ. 2022. Retrieved 31 October 2022 from https://zeromq.org.
- [7] . 2018. EC-store: Bridging the gap between storage and latency in distributed erasure coded systems. In Proceedings of the 2018 IEEE 38th International Conference on Distributed Computing Systems. 255–266.
- [8] . 2009. Dynamic atomic storage without consensus. In Proceedings of the 28th ACM Symposium on Principles of Distributed Computing. ACM, 17–25.
- [9] . 2010. Reconfiguring replicated atomic storage: A tutorial. Bulletin of the EATCS 102 (2010), 84–108.
- [10] . 2015. Making “fast” atomic operations computationally tractable. In Proceedings of the International Conference on Principles of Distributed Systems.
- [11] . 1996. Sharing memory robustly in message passing systems. Journal of the ACM 42, 1 (1996), 124–142.
- [12] . 2016. A performance evaluation of erasure coding libraries for cloud-based data stores. In Proceedings of the Distributed Applications and Interoperable Systems. Springer, 160–173.
- [13] . 2006. Optimal resilience for erasure-coded byzantine distributed storage. In Proceedings of the International Conference on Dependable Systems and Networks. IEEE Computer Society, 115–124.
- [14] . 2014. A coded shared atomic memory algorithm for message passing architectures. In Proceedings of the 2014 IEEE 13th International Symposium on Network Computing and Applications. 253–260.
- [15] . 2017. A coded shared atomic memory algorithm for message passing architectures. Distributed Computing 30, 1 (2017), 49–73.
- [16] . 2017. Giza: Erasure coding objects across global data centers. In Proceedings of the 2017 USENIX Annual Technical Conference. 539–551.
- [17] . 2009. Reconfigurable distributed storage for dynamic networks. Journal of Parallel and Distributed Computing 69, 1 (2009), 100–116.
- [18] . 2005. Active disk paxos with infinitely many processes. Distributed Computing 18, 1 (2005), 73–84.
- [19] . 2019. Proofs of writing for robust storage. IEEE Transactions on Parallel and Distributed Systems 30, 11 (2019), 2547–2566.
- [20] . 2008. Optimistic erasure-coded distributed storage. In DISC ’08: Proceedings of the 22nd International Symposium on Distributed Computing. Springer-Verlag, 182–196.
- [21] . 2004. How fast can a distributed atomic read be? In Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing. 236–245.
- [22] . 2003. Efficient replication of large data objects. In Proceedings of the Distributed Algorithms. (Ed.), Lecture Notes in Computer Science, Vol. 2848. 75–91.
- [23] . 2016. Computationally light “multi-speed” atomic memory. In Proceedings of the International Conference on Principles of Distributed Systems.
- [24] . 2015. Elastic configuration maintenance via a parsimonious speculating snapshot solution. In Proceedings of the International Symposium on Distributed Computing. Springer, 140–153.
- [25] . 2008. On the robustness of (semi) fast quorum-based implementations of atomic shared memory. In DISC ’08: Proceedings of the 22nd International Symposium on Distributed Computing. Springer-Verlag, 289–304.
- [26] . 2009. Fault-tolerant semifast implementations of atomic read/write registers. Journal of Parallel and Distributed Computing 69, 1 (2009), 62–79.
- [27] . 2003. RAMBO II: Rapidly Reconfigurable Atomic Memory for Dynamic Networks. Master’s thesis. MIT.
- [28] . 2003. RAMBO II: Rapidly reconfigurable atomic memory for dynamic networks. In Proceedings of the International Conference on Dependable Systems and Networks. 259–268.
- [29] . 1990. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems 12, 3 (1990), 463–492.
- [30] . 2015. Scale-RS: An efficient scaling scheme for RS-coded storage clusters. IEEE Transactions on Parallel and Distributed Systems 26, 6 (2015), 1704–1717.
- [31] . 2003. Fundamentals of Error-correcting Codes. Cambridge University Press.
- [32] . 2015. Smartmerge: A new approach to reconfiguration for atomic storage. In Proceedings of the International Symposium on Distributed Computing. Springer, 154–169.
- [33] . 2017. Efficient redundancy techniques for latency reduction in cloud systems. ACM Transactions on Modeling and Performance Evaluation of Computing Systems 2, 2 (2017), 12.
- [34] . 2016. Storage-optimized data-atomic algorithms for handling erasures and errors in distributed storage systems. In Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium. 720–729.
- [35] . 2016. RADON: Repairable atomic data object in networks. In Proceedings of the International Conference on Distributed Systems.
- [36] . 2010. Cassandra: A decentralized structured storage system. ACM SIGOPS Operating Systems Review 44, 2 (2010), 35–40.
- [37] . 1998. The part-time parliament. ACM Transactions on Computer Systems 16, 2 (1998), 133–169.
- [38] . 1996. Distributed Algorithms. Morgan Kaufmann Publishers.
- [39] . 2002. RAMBO: A reconfigurable atomic memory service for dynamic networks. In Proceedings of the 16th International Symposium on Distributed Computing. 173–190.
- [40] . 1997. Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In Proceedings of the Symposium on Fault-Tolerant Computing. 272–281.
- [41] . 2017. Recovering shared objects without stable storage. In Proceedings of the 31st International Symposium on Distributed Computing. (Ed.), Leibniz International Proceedings in Informatics, Vol. 91. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 36:1–36:16.
- [42] . 2008. Bitcoin: A peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf.
- [43] . 2018. ARES: Adaptive, reconfigurable, erasure coded, atomic storage. arXiv:1805.03727. Retrieved from https://arxiv.org/abs/1805.03727.
- [44] . 2019. ARES: Adaptive, reconfigurable, erasure coded, atomic storage. In Proceedings of the 2019 IEEE 39th International Conference on Distributed Computing Systems. 2195–2205.
- [45] . 2014. In search of an understandable consensus algorithm. In Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference. USENIX Association, 305–320.
- [46] . 2016. EC-cache: Load-balanced, low-latency cluster caching with online erasure coding. In Proceedings of the OSDI. 401–417.
- [47] . 2010. Data-centric reconfiguration with network-attached disks. In Proceedings of the 4th International Workshop on Large Scale Distributed System and Middleware. 22–26.
- [48] . 2017. Dynamic reconfiguration: Abstraction and optimal asynchronous solution. In Proceedings of the 31st International Symposium on Distributed Computing. 40:1–40:15.
- [49] . 2017. WPS: A workload-aware placement scheme for erasure-coded in-memory stores. In Proceedings of the International Conference on Networking, Architecture, and Storage. IEEE, 1–10.
- [50] . 2012. GSR: A global stripe-based redistribution approach to accelerate RAID-5 scaling. In Proceedings of the 2012 41st International Conference on Parallel Processing. 460–469.
- [51] . 2013. A flexible framework to enhance RAID-6 scalability via exploiting the similarities among MDS codes. In Proceedings of the 2013 42nd International Conference on Parallel Processing. 542–551.
- [52] . 2015. Multi-tenant latency optimization in erasure-coded storage with differentiated services. In Proceedings of the 2015 IEEE 35th International Conference on Distributed Computing Systems. IEEE, 790–791.
- [53] . 2016. Joint latency and cost optimization for erasure-coded data center storage. IEEE/ACM Transactions on Networking 24, 4 (2016), 2443–2457.
- [54] . 2018. SP-cache: Load-balanced, redundancy-free cluster caching with selective partition. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. IEEE, 1–13.
- [55] . 2015. Accelerate RDP RAID-6 scaling by reducing disk I/Os and XOR operations. IEEE Transactions on Computers 64, 1 (2015), 32–44.
- [56] . 2016. Efficient and available in-memory KV-store with hybrid erasure coding and replication. In Proceedings of the 14th USENIX Conference on File and Storage Technologies. USENIX Association, Santa Clara, CA, 167–180.
- [57] . 2018. Toward optimal storage scaling via network coding: From theory to practice. In Proceedings of the IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. 1808–1816.
- [58] P. Zhou, J. Huang, X. Qin, and C. Xie. 2019. PaRS: A popularity-aware redundancy scheme for in-memory stores. IEEE Transactions on Computers 68, 4 (2019), 556–569. https://dblp.org/rec/journals/tc/ZhouHQX19.html?view=bibtex.
Index Terms
Ares: Adaptive, Reconfigurable, Erasure coded, Atomic Storage