Abstract
Programming geo-replicated distributed systems is challenging given the complexity of reasoning about different evolving states on different replicas. Existing approaches to this problem impose significant burden on application developers to consider the effect of how operations performed on one replica are witnessed and applied on others. To alleviate these challenges, we present a fundamentally different approach to programming in the presence of replicated state. Our insight is based on the use of invertible relational specifications of an inductively-defined data type as a mechanism to capture salient aspects of the data type relevant to how its different instances can be safely merged in a replicated environment. Importantly, because these specifications only address a data type's (static) structural properties, their formulation does not require exposing low-level system-level details concerning asynchrony, replication, visibility, etc. As a consequence, our framework enables the correct-by-construction synthesis of rich merge functions over arbitrarily complex (i.e., composable) data types. We show that the use of a rich relational specification language allows us to extract sufficient conditions to automatically derive merge functions that have meaningful non-trivial convergence properties. We incorporate these ideas in a tool called Quark, and demonstrate its utility via a detailed evaluation study on real-world benchmarks.
Supplemental Material
- Peter Alvaro, Neil Conway, Joe Hellerstein, and William R. Marczak. 2011. Consistency Analysis in Bloom: a CALM and Collected Approach. In CIDR 2011, Fifth Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 9-12, 2011, Online Proceedings. 249–260.Google Scholar
- Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. 2013. Highly Available Transactions: Virtues and Limitations. PVLDB 7, 3 (2013), 181–192.Google Scholar
Digital Library
- Peter Bailis, Alan Fekete, Michael J. Franklin, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. 2014. Coordination Avoidance in Database Systems. Proc. VLDB Endow. 8, 3 (Nov. 2014), 185–196. Google Scholar
Digital Library
- Peter Bailis, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. 2013. Bolt-on Causal Consistency. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD ’13). ACM, New York, NY, USA, 761–772. Google Scholar
Digital Library
- Valter Balegas, Nuno Preguiça, Rodrigo Rodrigues, Sérgio Duarte, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro. 2015. Putting the Consistency back into Eventual Consistency. In Proceedings of the Tenth European Conference on Computer System (EuroSys ’15). Bordeaux, France. http://lip6.fr/Marc.Shapiro/papers/putting- consistency- back- EuroSys- 2015.pdfGoogle Scholar
Digital Library
- Hans-J. Boehm, Russ Atkinson, and Michael Plass. 1995. Ropes: An Alternative to Strings. Softw. Pract. Exper. 25, 12 (Dec. 1995), 1315–1330. Google Scholar
Digital Library
- Kenneth A. Bowen. 1979. Prolog. In Proceedings of the 1979 Annual Conference (ACM ’79). ACM, New York, NY, USA, 14–23. Google Scholar
Digital Library
- Brewer 2013. (2013). http://highscalability.com/blog/2013/5/1/myth- eric- brewer- on- why- banks- are- base- not- acidavailability.html Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue.Google Scholar
- Eric Brewer. 2000. Towards Robust Distributed Systems (Invited Talk). (2000).Google Scholar
- Sebastian Burckhardt, Alexandro Baldassin, and Daan Leijen. 2010. Concurrent Programming with Revisions and Isolation Types. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA ’10). ACM, New York, NY, USA, 691–707. Google Scholar
Digital Library
- Sebastian Burckhardt, Manuel Fähndrich, Daan Leijen, and Benjamin P. Wood. 2012. Cloud Types for Eventual Consistency. In Proceedings of the 26th European Conference on Object-Oriented Programming (ECOOP’12). Springer-Verlag, Berlin, Heidelberg, 283–307. Google Scholar
Digital Library
- Sebastian Burckhardt, Alexey Gotsman, Hongseok Yang, and Marek Zawirski. 2014. Replicated Data Types: Specification, Verification, Optimality. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’14). ACM, New York, NY, USA, 271–284. Google Scholar
Digital Library
- Sebastian Burckhardt, Daan Leijen, Jonathan Protzenko, and Manuel Fähndrich. 2015. Global Sequence Protocol: A Robust Abstraction for Replicated Shared State. In Proceedings of the 29th European Conference on Object-Oriented Programming (ECOOP ’15). Prague, Czech Republic. http://research.microsoft.com/pubs/240462/gsp- tr- 2015- 2.pdfGoogle Scholar
- Bor-Yuh Evan Chang and Xavier Rival. 2008. Relational Inductive Shape Analysis. In Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’08). ACM, New York, NY, USA, 247–260. Google Scholar
Digital Library
- Natacha Crooks, Youer Pu, Nancy Estrada, Trinabh Gupta, Lorenzo Alvisi, and Allen Clement. 2016. TARDiS: A Branchand-Merge Approach To Weak Consistency. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD ’16). ACM, New York, NY, USA, 1615–1628. Google Scholar
Digital Library
- Martin Erwig. 2001. Inductive Graphs and Functional Graph Algorithms. J. Funct. Program. 11, 5 (Sept. 2001), 467–492. Google Scholar
Digital Library
- Functional Graph 2008. A Functional Graph Library. (2008). http://hackage.haskell.org/package/fglGoogle Scholar
- Alexey Gotsman, Hongseok Yang, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro. 2016. ’Cause I’m Strong Enough: Reasoning About Consistency Choices in Distributed Systems. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2016). ACM, New York, NY, USA, 371–384. Google Scholar
Digital Library
- Farzin Houshmand and Mohsen Lesani. 2019. Hamsaz: Replication Coordination Analysis and Synthesis. PACMPL 3, POPL (2019), 74:1–74:32. https://dl.acm.org/citation.cfm?id=3290387Google Scholar
- Bertrand Jeannet, Alexey Loginov, Thomas Reps, and Mooly Sagiv. 2010. A Relational Approach to Interprocedural Shape Analysis. ACM Trans. Program. Lang. Syst. 32, 2, Article 5 (Feb. 2010), 52 pages. Google Scholar
Digital Library
- Stefan Kahrs. 2001. Red-black Trees with Types. J. Funct. Program. 11, 4 (July 2001), 425–432. Google Scholar
Digital Library
- Gowtham Kaki and Suresh Jagannathan. 2014. A Relational Framework for Higher-order Shape Analysis. In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP ’14). ACM, New York, NY, USA, 311–324. Google Scholar
Digital Library
- Gowtham Kaki, Kartik Nagar, Mahsa Najafzadeh, and Suresh Jagannathan. 2017. Alone Together: Compositional Reasoning and Inference for Weak Isolation. Proc. ACM Program. Lang. 2, POPL, Article 27 (Dec. 2017), 34 pages. Google Scholar
Digital Library
- Gowtham Kaki, KC Sivaramakrishnan, and Suresh Jagannathan. 2019. Version Control Is for Your Data Too. In 3rd Summit on Advances in Programming Languages (SNAPL 2019) (Leibniz International Proceedings in Informatics (LIPIcs)), Benjamin S. Lerner, Rastislav Bodík, and Shriram Krishnamurthi (Eds.), Vol. 136. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 8:1–8:18. Google Scholar
Cross Ref
- Mohsen Lesani, Christian J. Bell, and Adam Chlipala. 2016. Chapar: Certified Causally Consistent Distributed Key-value Stores. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’16). ACM, New York, NY, USA, 357–370. Google Scholar
Digital Library
- Cheng Li, João Leitão, Allen Clement, Nuno Preguiça, Rodrigo Rodrigues, and Viktor Vafeiadis. 2014a. Automating the Choice of Consistency Levels in Replicated Systems. In Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC’14). USENIX Association, Berkeley, CA, USA, 281–292. http://dl.acm.org/citation. cfm?id=2643634.2643664Google Scholar
Digital Library
- Cheng Li, João Leitão, Allen Clement, Nuno Preguiça, Rodrigo Rodrigues, and Viktor Vafeiadis. 2014b. Automating the Choice of Consistency Levels in Replicated Systems. In Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC’14). USENIX Association, Berkeley, CA, USA, 281–292. http://dl.acm.org/citation. cfm?id=2643634.2643664Google Scholar
Digital Library
- Cheng Li, Daniel Porto, Allen Clement, Johannes Gehrke, Nuno Preguiça, and Rodrigo Rodrigues. 2012a. Making Georeplicated Systems Fast As Possible, Consistent when Necessary. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA, USA, 265–278. http: //dl.acm.org/citation.cfm?id=2387880.2387906Google Scholar
Digital Library
- Cheng Li, Daniel Porto, Allen Clement, Johannes Gehrke, Nuno Preguiça, and Rodrigo Rodrigues. 2012b. Making Georeplicated Systems Fast As Possible, Consistent when Necessary. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA, USA, 265–278. http: //dl.acm.org/citation.cfm?id=2387880.2387906Google Scholar
Digital Library
- David Maier, K. Tuncay Tekle, Michael Kifer, and David S. Warren. 2018. Declarative Logic Programming. Association for Computing Machinery and Morgan & Claypool, New York, NY, USA, Chapter Datalog: Concepts, History, and Outlook, 3–100. Google Scholar
Digital Library
- Chris Okasaki. 1998. Purely Functional Data Structures. Cambridge University Press, New York, NY, USA.Google Scholar
Digital Library
- PPX 2017. PPX extension points. (2017). Accessed: 2017-01-04 10:12:00.Google Scholar
- Nuno Preguica, Joan Manuel Marques, Marc Shapiro, and Mihai Letia. 2009. A Commutative Replicated Data Type for Cooperative Editing. In Proceedings of the 2009 29th IEEE International Conference on Distributed Computing Systems (ICDCS ’09). IEEE Computer Society, Washington, DC, USA, 395–403. Google Scholar
Digital Library
- RUBiS 2014. Rice University Bidding System. (2014). http://rubis.ow2.org/ Accessed: 2014-11-4 13:21:00.Google Scholar
- Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011a. Conflict-Free Replicated Data Types. In Stabilization, Safety, and Security of Distributed Systems, Xavier Défago, Franck Petit, and Vincent Villain (Eds.). Lecture Notes in Computer Science, Vol. 6976. Springer Berlin Heidelberg, 386–400. Google Scholar
Cross Ref
- Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011b. Conflict-Free Replicated Data Types. In Stabilization, Safety, and Security of Distributed Systems, Xavier Défago, Franck Petit, and Vincent Villain (Eds.). Lecture Notes in Computer Science, Vol. 6976. Springer Berlin Heidelberg, 386–400. Google Scholar
Cross Ref
- KC Sivaramakrishnan, Gowtham Kaki, and Suresh Jagannathan. 2015. Declarative Programming over Eventually Consistent Data Stores. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2015). ACM, New York, NY, USA, 413–424. Google Scholar
Digital Library
- TPC 2018. (2018). http://www.tpc.org/information/benchmarks.asp TPC Benchmarks.Google Scholar
- Twissandra 2014. Twitter clone on Cassandra. (2014). http://twissandra.com/ Accessed: 2014-11-4 13:21:00.Google Scholar
- James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson. 2015. Verdi: A Framework for Implementing and Formally Verifying Distributed Systems. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA, 357–368. Google Scholar
Digital Library
Index Terms
Mergeable replicated data types
Recommendations
Safe replication through bounded concurrency verification
High-level data types are often associated with semantic invariants that must be preserved by any correct implementation. While having implementations enforce strong guarantees such as linearizability or serializability can often be used to prevent ...
Replicated data types: specification, verification, optimality
POPL '14: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming LanguagesGeographically distributed systems often rely on replicated eventually consistent data stores to achieve availability and performance. To resolve conflicting updates at different replicas, researchers and practitioners have proposed specialized ...
Consistent and automatic replica regeneration
Reducing management costs and improving the availability of large-scale distributed systems require automatic replica regeneration, that is, creating new replicas in response to replica failures. A major challenge to regeneration is maintaining ...






Comments