ABSTRACT
Modern internet applications require scalability to millions of clients, response times in the tens of milliseconds, and availability in the presence of partitions, hardware faults and even disasters. To meet these requirements, applications are usually geo-replicated across several data centres (DCs) spread throughout the world, providing clients with fast access to nearby DCs and fault tolerance in case of a DC outage. Using multiple replicas also has disadvantages: not only does it incur extra storage, bandwidth and hardware costs, but programming these systems also becomes more difficult.
To address the additional hardware costs, data is often partially replicated, meaning that each DC keeps a copy of only part of the data; for example, in a key-value store a DC may store only the values corresponding to a portion of the keys. Additionally, to address the difficulty of programming these systems, consistency protocols are run on top, ensuring different guarantees for the data. As shown by the CAP theorem, however, strong consistency, availability, and partition tolerance cannot all be ensured at the same time. For many applications availability is paramount, so strong consistency is exchanged for weaker consistency models that allow concurrent writes, such as causal consistency. Unfortunately, these protocols are not designed with partial replication in mind and either do not support it or do so inefficiently. In this work we look at why this happens and propose a protocol designed to support partial replication under causal consistency more efficiently.
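As a rough illustration of partial replication in a key-value store, the following minimal sketch routes each update only to the DCs that replicate the key. It is not the protocol proposed in this work: the placement map `REPLICAS`, the `DataCentre` class, the `write` helper and the DC names are all hypothetical, and causal ordering of updates is deliberately left out.

```python
# Hypothetical placement map: which data centres (DCs) replicate which keys.
# The keys and DC names are made up for illustration.
REPLICAS = {
    "user:1": {"dc_eu", "dc_us"},
    "user:2": {"dc_us", "dc_asia"},
    "cart:1": {"dc_eu"},
}


class DataCentre:
    """A DC that stores values only for the keys it replicates."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def replicates(self, key):
        return self.name in REPLICAS.get(key, set())

    def apply(self, key, value):
        # Updates for keys this DC does not replicate are ignored.
        if self.replicates(key):
            self.store[key] = value


def write(dcs, key, value):
    """Send the update only to DCs replicating the key; return their names."""
    targets = [dc for dc in dcs if dc.replicates(key)]
    for dc in targets:
        dc.apply(key, value)
    return sorted(dc.name for dc in targets)


dcs = [DataCentre("dc_eu"), DataCentre("dc_us"), DataCentre("dc_asia")]
receivers = write(dcs, "cart:1", "book")
```

In this sketch only `dc_eu` receives and stores the update to `cart:1`, which saves storage and bandwidth at the other DCs. The difficulty the abstract points to arises once updates must also respect causal order, since a DC may never see updates to keys it does not replicate.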
Designing a causally consistent protocol for geo-distributed partial replication

Marc Shapiro
