Abstract
Despite recent advances, guaranteeing the correctness of large-scale distributed applications without compromising performance remains a challenging problem. Network and node failures are inevitable and, for some applications, careful control over how they are handled is essential. Unfortunately, existing approaches either completely hide these failures behind an atomic state machine replication (SMR) interface, or expose all of the network-level details, sacrificing atomicity. We propose a novel, compositional, atomic distributed object (ADO) model for strongly consistent distributed systems that combines the best of both options. The object-oriented API abstracts over protocol-specific details and decouples high-level correctness reasoning from implementation choices. At the same time, it intentionally exposes an abstract view of certain key distributed failure cases, thus allowing for more fine-grained control over them than SMR-like models. We demonstrate that proving properties even of composite distributed systems can be straightforward with our Coq verification framework, Advert, thanks to the ADO model. We also show that a variety of common protocols including multi-Paxos and Chain Replication refine the ADO semantics, which allows one to freely choose among them for an application's implementation without modifying ADO-level correctness proofs.
Supplemental Material
- Andrew W. Appel. 2011. Verified Software Toolchain. In Proc. of the 20th European Symposium on Programming, Gilles Barthe (Ed.) (ESOP ’11, Vol. 6602). Springer, Berlin, Heidelberg. 1–17. https://doi.org/10.1007/978-3-642-19718-5_1 Google Scholar
Cross Ref
- Mahesh Balakrishnan, Dahlia Malkhi, Vijayan Prabhakaran, Ted Wobber, Michael Wei, and John D. Davis. 2012. CORFU: A Shared Log Design for Flash Clusters. In Proc. of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI ’12). USENIX Association, Berkeley, CA, USA. 1–14. https://doi.org/10.1145/2535930 Google Scholar
Digital Library
- Romain Boichat, Partha Dutta, Svend Frølund, and Rachid Guerraoui. 2003. Deconstructing Paxos. SIGACT News, 34, 1 (2003), March, 47–67. issn:0163-5700 https://doi.org/10.1145/637437.637447 Google Scholar
Digital Library
- Mike Burrows. 2006. The Chubby Lock Service for Loosely-Coupled Distributed Systems. In Proc. of the 7th Symposium on Operating Systems Design and Implementation (OSDI ’06). USENIX Association, Berkeley, CA, USA. 335–350. https://dl.acm.org/doi/10.5555/1298455.1298487Google Scholar
Digital Library
- Paul Castro, Vatche Ishakian, Vinod Muthusamy, and Aleksander Slominski. 2019. The Rise of Serverless Computing. Commun. ACM, 62, 12 (2019), 44–54. https://doi.org/10.1145/3368454 Google Scholar
Digital Library
- Tej Chajed, Frans Kaashoek, Butler Lampson, and Nickolai Zeldovich. 2018. Verifying Concurrent Software Using Movers in CSPEC. In Proc. of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18). USENIX Association, Carlsbad, CA. 306–322. https://dl.acm.org/doi/10.5555/3291168.3291191Google Scholar
- Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. 2006. Bigtable: A Distributed Storage System for Structured Data. In Proc. of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’06). ACM, New York, NY, USA. 205–218. https://doi.org/10.1145/1365815.1365816 Google Scholar
Digital Library
- CorfuDB. 2017. CorfuDB. https://www.github.com/CorfuDB/CorfuDBGoogle Scholar
- Jeff Dean. 2009. Designs, Lessons and Advice from Building Large Distributed Systems. https://research.cs.cornell.edu/ladis2009/talks/dean-keynote-ladis2009.pdf Keynote from ACM SIGOPS International Workshop on Large Scale Distributed Systems and MiddlewareGoogle Scholar
- Pascal Felber, Ben Jai, Rajeev Rastogi, and Mark Smith. 2001. Using Semantic Knowledge of Distributed Objects to Increase Reliability and Availability. In Proc. of the 6th International Workshop on Object-Oriented Real-Time Dependable Systems (WORDS ’01). IEEE Computer Society, Washington, DC, USA. 153–160. https://doi.org/10.1109/WORDS.2001.945126 Google Scholar
Cross Ref
- Ivana Filipovic, Peter W. O’Hearn, Noam Rinetzky, and Hongseok Yang. 2010. Abstraction for Concurrent Objects. Theor. Comput. Sci., 411, 51-52 (2010), 4379–4398. https://doi.org/10.1007/978-3-642-00590-9_19 Google Scholar
Digital Library
- Pedro Fonseca, Kaiyuan Zhang, Xi Wang, and Arvind Krishnamurthy. 2017. An Empirical Study on the Correctness of Formally Verified Distributed Systems. In Proc. of the 12th European Conference on Computer Systems (EuroSys ’17). ACM, New York, NY, USA. 328–343. https://doi.org/10.1145/3064176.3064183 Google Scholar
Digital Library
- Eli Gafni and Leslie Lamport. 2003. Disk Paxos. Distributed Computing, 16, 1 (2003), 1–20. https://doi.org/10.1007/s00446-002-0070-8 Google Scholar
Digital Library
- Phillipa Gill, Navendu Jain, and Nachiappan Nagappan. 2011. Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications. In Proc. of the ACM SIGCOMM 2011 Conference. ACM, New York, NY, USA. 350–361. https://doi.org/10.1145/2043164.2018477 Google Scholar
Digital Library
- Jim Gray and Leslie Lamport. 2006. Consensus on Transaction Commit. ACM Transactions on Database Systems, 31, 1 (2006), 133–160. https://doi.org/10.1145/1132863.1132867 Google Scholar
Digital Library
- Ronghui Gu, Jérémie Koenig, Tahina Ramananandro, Zhong Shao, Xiongnan (Newman) Wu, Shu-Chun Weng, Haozhong Zhang, and Yu Guo. 2015. Deep Specifications and Certified Abstraction Layers. In Proc. of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’15). ACM, New York, NY, USA. 595–608. https://doi.org/10.1145/2676726.2676975 Google Scholar
Digital Library
- Ronghui Gu, Zhong Shao, Hao Chen, Xiongnan Wu, Jieung Kim, Vilhelm Sjöberg, and David Costanzo. 2016. CertiKOS: An Extensible Architecture for Building Certified Concurrent OS Kernels. In Proc. of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’16). USENIX Association, Berkeley, CA, USA. 653–669. https://dl.acm.org/doi/10.5555/3026877.3026928Google Scholar
- Ronghui Gu, Zhong Shao, Jieung Kim, Xiongnan Wu, Jérémie Koenig, Vilhelm Sjöberg, Hao Chen, David Costanzo, and Tahina Ramananandro. 2018. Certified Concurrent Abstraction Layers. In Proc. of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’18). ACM, New York, NY, USA. 646–661. https://doi.org/10.1145/3296979.3192381 Google Scholar
Digital Library
- Rachid Guerraoui and Michal Kapalka. 2008. On the Correctness of Transactional Memory. In Proc. of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP ’08). ACM, New York, NY, USA. 175–184. https://doi.org/10.1145/1345206.1345233 Google Scholar
Digital Library
- Haryadi S. Gunawi, Mingzhe Hao, Tanakorn Leesatapornwongsa, Tiratat Patana-anake, Thanh Do, Jeffry Adityatama, Kurnia J. Eliazar, Agung Laksono, Jeffrey F. Lukman, Vincentius Martin, and Anang D. Satria. 2014. What Bugs Live in the Cloud? A Study of 3000+ Issues in Cloud Systems. In Proc. of the ACM Symposium on Cloud Computing (SoCC ’14). ACM, New York, NY, USA. 1–14. https://doi.org/10.1145/2670979.2670986 Google Scholar
Digital Library
- Chris Hawblitzel, Jon Howell, Manos Kapritsos, Jacob R. Lorch, Bryan Parno, Michael L. Roberts, Srinath Setty, and Brian Zill. 2015. IronFleet: Proving Practical Distributed Systems Correct. In Proc. of the 25th Symposium on Operating Systems Principles (SOSP ’15). ACM, New York, NY, USA. 1–17. https://doi.org/10.1145/2815400.2815428 Google Scholar
Digital Library
- Chris Hawblitzel, Jon Howell, Jacob R. Lorch, Arjun Narayan, Bryan Parno, Danfeng Zhang, and Brian Zill. 2014. Ironclad Apps: End-to-end Security via Automated Full-system Verification. In Proc. of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’14). USENIX Association, Berkeley, CA, USA. 165–181. https://dl.acm.org/doi/10.5555/2685048.2685062Google Scholar
- Chris Hawblitzel, Erez Petrank, Shaz Qadeer, and Serar Tasiran. 2015. Automated and Modular Refinement Reasoning for Concurrent Programs. In Proc. of the 27th International Conference on Computer Aided Verification (CAV ’15). Springer-Verlag, Berlin, Heidelberg. 449–465. https://doi.org/10.1007/978-3-319-21668-3_26 Google Scholar
Cross Ref
- Wolf Honoré, Jieung Kim, Ji-Yong Shin, and Zhong Shao. 2021. Much ADO about Failures: A Fault-Aware Model for Compositional Verification of Strongly Consistent Distributed Systems. Yale Univ.. https://flint.cs.yale.edu/publications/ado.htmlGoogle Scholar
- Joseph Izraelevitz, Hammurabi Mendes, and Michael L. Scott. 2016. Linearizability of Persistent Memory Objects Under a Full-System-Crash Failure Model. In Proc. of the 30th International Symposium on Distributed Computing (DISC ’16). Springer-Verlag, Berlin, Heidelberg. 313–327. https://doi.org/10.1007/978-3-662-53426-7_23 Google Scholar
Cross Ref
- Tom Killalea. 2016. The Hidden Dividends of Microservices. Commun. ACM, 59, 8 (2016), July, 42–45. issn:0001-0782 https://doi.org/10.1145/2948985 Google Scholar
Digital Library
- Eric Koskinen and Matthew Parkinson. 2015. The Push/Pull Model of Transactions. In Proc. of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA. 186–195. https://doi.org/10.1145/2737924.2737995 Google Scholar
Digital Library
- Bernhard Kragl, Constantin Enea, Thomas A. Henzinger, Suha Orhun Mutluergil, and Shaz Qadeer. 2020. Inductive Sequentialization of Asynchronous Programs. In Proc. of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’20). ACM, New York, NY, USA. 227–242. https://doi.org/10.1145/3385412.3385980 Google Scholar
Digital Library
- Morten Krogh-Jespersen, Amin Timany, Marit Edna Ohlenbusch, Simon Oddershede Gregersen, and Lars Birkedal. 2020. Aneris: A Mechanised Logic for Modular Reasoning about Distributed Systems. In Proc. of the 27th European Symposium on Programming, Peter Müller (Ed.) (ESOP ’20). Springer-Verlag, Berlin, Heidelberg. 336–365. https://doi.org/10.1007/978-3-030-44914-8_13 Google Scholar
Cross Ref
- Leslie Lamport. 2001. Paxos Made Simple. SIGACT News, 32, 4 (2001), Dec., 51–58. issn:0163-5700 https://doi.org/10.1145/568425.568433 Google Scholar
Digital Library
- Leslie Lamport. 2005. Generalized Consensus and Paxos. Microsoft.Google Scholar
- Leslie Lamport. 2006. Fast Paxos. Distributed Computing, 19, 2 (2006), 79–103. https://doi.org/10.1007/s00446-006-0005-x Google Scholar
Digital Library
- Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. 2009. Vertical Paxos and Primary-Backup Replication. In Proc. of the 28th ACM Symposium on Principles of Distributed Computing (PODC ’09). ACM, New York, NY, USA. 312–313. https://doi.org/10.1145/1582716.1582783 Google Scholar
Digital Library
- Butler Lampson. 2001. The ABCD’s of Paxos. In Proc. of the 20th Annual ACM Symposium on Principles of Distributed Computing (PODC ’01). ACM, New York, NY, USA. 13. https://doi.org/10.1145/383962.383969 Google Scholar
Digital Library
- K. Rustan M. Leino. 2010. Dafny: An Automatic Program Verifier for Functional Correctness. In Proc. of the 16th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, Edmund M. Clarke and Andrei Voronkov (Eds.) (LPAR ’10). Springer-Verlag, Berlin, Heidelberg. 348–370. https://doi.org/10.1007/978-3-642-17511-4_20 Google Scholar
Cross Ref
- Xavier Leroy. 2005–2020. The CompCert verified compiler. http://compcert.inria.fr/Google Scholar
- Xavier Leroy. 2009. A Formally Verified Compiler Back-End. Journal of Automated Reasoning, 43, 4 (2009), 363–446. https://doi.org/10.1007/s10817-009-9155-4 Google Scholar
Digital Library
- Hongjin Liang, Jan Hoffmann, Xinyu Feng, and Zhong Shao. 2013. Characterizing Progress Properties of Concurrent Objects via Contextual Refinements. In Proc. of the 24th International Conference on Concurrency Theory (CONCUR ’13). Springer-Verlag, Berlin, Heidelberg. 227–241. https://doi.org/10.1007/978-3-642-40184-8_17 Google Scholar
Digital Library
- Haojun Ma, Aman Goel, Jean-Baptiste Jeannin, Manos Kapritsos, Baris Kasikci, and Karem A. Sakallah. 2019. I4: Incremental Inference of Inductive Invariants for Verification of Distributed Protocols. In Proc. of the 27th ACM Symposium on Operating Systems Principles (SOSP ’19). ACM, New York, NY, USA. 370–384. https://doi.org/10.1145/3341301.3359651 Google Scholar
Digital Library
- John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, and Lidong Zhou. 2004. Boxwood: Abstractions as the Foundation for Storage Infrastructure. In Proc. of the 6th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’04, Vol. 4). USENIX Association, Berkeley, CA, USA. 8–8. https://dl.acm.org/doi/10.5555/1251254.1251262Google Scholar
- Antoni Mazurkiewicz. 1995. Introduction to Trace Theory. World Scientific, Hackensack, NJ, USA. 3–41. https://doi.org/10.1142/9789814261456_0001 Google Scholar
Cross Ref
- Justin Meza, Tianyin Xu, Kaushik Veeraraghavan, and Onur Mutlu. 2018. A Large Scale Study of Data Center Network Reliability. In Proc. of the Internet Measurement Conference (IMC ’18). ACM, New York, NY, USA. 393–407. https://doi.org/10.1145/3278532.3278566 Google Scholar
Digital Library
- Iulian Moraru, David G. Andersen, and Michael Kaminsky. 2013. There is More Consensus in Egalitarian Parliaments. In Proc. of the 24th ACM Symposium on Operating Systems Principles (SOSP ’13). ACM, New York, NY, USA. 358–372. https://doi.org/10.1145/2517349.2517350 Google Scholar
Digital Library
- Edmund B. Nightingale, Peter M. Chen, and Jason Flinn. 2005. Speculative Execution in a Distributed File System. ACM SIGOPS Operating Systems Review, 39, 5 (2005), 191–205. https://doi.org/10.1145/1189256.1189258 Google Scholar
Digital Library
- Diego Ongaro and John K. Ousterhout. 2014. In Search of an Understandable Consensus Algorithm. In USENIX Annual Technical Conference. USENIX Association, Berkeley, CA, USA. 305–319. https://dl.acm.org/doi/10.5555/2643634.2643666Google Scholar
- Oded Padon, Giuliano Losa, Mooly Sagiv, and Sharon Shoham. 2017. Paxos Made EPR: Decidable Reasoning About Distributed Protocols. Proc. ACM Program. Lang., 1, OOPSLA (2017), Article 108, Oct., 31 pages. https://doi.org/10.1145/3140568 Google Scholar
Digital Library
- Oded Padon, Kenneth L. McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham. 2016. Ivy: Safety Verification by Interactive Generalization. In Proc. of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, Chandra Krintz and Emery Berger (Eds.) (PLDI ’16). ACM, New York, NY, USA. 614–630. https://doi.org/10.1145/2908080.2908118 Google Scholar
Digital Library
- Ganesan Ramalingam and Kapil Vaswani. 2013. Fault Tolerance via Idempotence. In Proc. of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’13). ACM, New York, NY, USA. 249–262. https://doi.org/10.1145/2429069.2429100 Google Scholar
Digital Library
- Robbert Van Renesse and Deniz Altinbuken. 2015. Paxos Made Moderately Complex. ACM Computing Surveys (CSUR), 47, 3 (2015), 42. https://doi.org/10.1145/2673577 Google Scholar
Digital Library
- Robbert Van Renesse and Fred B. Schneider. 2004. Chain Replication for Supporting High Throughput and Availability. In Proc. of the 6th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’04, Vol. 4). USENIX Association, Berkeley, CA, USA. 91–104. https://dl.acm.org/doi/10.5555/1251254.1251261Google Scholar
- Denis Rystsov. 2018. CASPaxos: Replicated State Machines without Logs. CoRR, abs/1802.07000 (2018), 15 pages. arXiv:1802.07000. arxiv:1802.07000Google Scholar
- Michael Sammler, Rodolphe Lepigre, Robbert Krebbers, Kayvan Memarian, Derek Dreyer, and Deepak Garg. 2021. RefinedC: Automating the Foundational Verification of C Code with Refined Ownership Types. In Proc. of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI ’21). ACM, New York, NY, USA. 158–174. https://doi.org/10.1145/3453483.3454036 Google Scholar
Digital Library
- Fred B. Schneider. 1990. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial. ACM Computing Surveys (CSUR), 22, 4 (1990), 299–319. https://doi.org/10.1145/98163.98167 Google Scholar
Digital Library
- Ilya Sergey, James R. Wilcox, and Zachary Tatlock. 2017. Programming and Proving with Distributed Protocols. Proc. ACM Program. Lang., 2, POPL (2017), Article 28, Dec., 30 pages. issn:2475-1421 https://doi.org/10.1145/3158116 Google Scholar
Digital Library
- Ji-Yong Shin, Jieung Kim, Wolf Honoré, Hernán Vanzetto, Srihari Radhakrishnan, Mahesh Balakrishnan, and Zhong Shao. 2019. WormSpace: A Modular Foundation for Simple, Verifiable Distributed Systems. In Proc. of the ACM Symposium on Cloud Computing (SoCC ’19). ACM, New York, NY, USA. 299–311. https://doi.org/10.1145/3357223.3362739 Google Scholar
Digital Library
- Vilhelm Sjöberg, Yuyang Sang, Shu chun Weng, and Zhong Shao. 2019. DeepSEA: A Language for Certified System Software. Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 136, Oct., 27 pages. issn:2475-1421 https://doi.org/10.1145/3360562 Google Scholar
Digital Library
- Andrew S. Tanenbaum and Maarten van Steen. 2006. Distributed Systems: Principles and Paradigms (2nd Edition). Prentice-Hall, Inc., Singapore. https://dl.acm.org/doi/10.5555/1202502Google Scholar
Digital Library
- Marcelo Taube, Giuliano Losa, Kenneth L. McMillan, Oded Padon, Mooly Sagiv, Sharon Shoham, James R. Wilcox, and Doug Woos. 2018. Modularity for Decidability: Implementing and Semi-Automatically Verifying Distributed Systems. In Proc. of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’18). ACM, New York, NY, USA. 662–677. https://doi.org/10.1145/3192366.3192414 Google Scholar
Digital Library
- Jeff Terrace and Michael J. Freedman. 2009. Object Storage on CRAQ: High-Throughput Chain Replication for Read-Mostly Workloads. In USENIX Annual Technical Conference. USENIX Association, Berkeley, CA, USA. 16 pages. https://dl.acm.org/doi/10.5555/1855807.1855818Google Scholar
- The AWS Team. 2011. Summary of the Amazon EC2 and Amazon RDS Service Disruption in the US East Region. https://aws.amazon.com/message/65648/Google Scholar
- The Coq Development Team. 1999–2018. The Coq Proof Assistant. http://coq.inria.frGoogle Scholar
- Ben Treynor. 2011. Today’s Outage for Several Google Services. https://googleblog.blogspot.com/2014/01/todays-outage-for-several-google.htmlGoogle Scholar
- Klaus v. Gleissenthall, Rami Gökhan Kici, Alexander Bakst, Deian Stefan, and Ranjit Jhala. 2019. Pretend Synchrony: Synchronous Verification of Asynchronous Distributed Programs. Proc. ACM Program. Lang., 3, POPL (2019), Article 59, Jan., 30 pages. issn:2475-1421 https://doi.org/10.1145/3290372 Google Scholar
Digital Library
- Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall. 1994. A Note on Distributed Computing. IEEE Micro. https://dl.acm.org/doi/10.5555/974938Google Scholar
- Chao Wang, Constantin Enea, Suha Orhun Mutluergil, and Gustavo Petri. 2019. Replication-Aware Linearizability. In Proc. of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, Kathryn S. McKinley and Kathleen Fisher (Eds.) (PLDI ’19). ACM, New York, NY, USA. 980–993. https://doi.org/10.1145/3314221.3314617 Google Scholar
Digital Library
- James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson. 2015. Verdi: A Framework for Implementing and Formally Verifying Distributed Systems. In Proc. of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA. 357–368. https://doi.org/10.1145/2737924.2737958 Google Scholar
Digital Library
- Ann Wollrath, Roger Riggs, and Jim Waldo. 1996. A Distributed Object Model for the Java System. Comput. Syst., 9 (1996), 265–290. https://dl.acm.org/doi/10.5555/1268049.1268066Google Scholar
- Doug Woos, James R. Wilcox, Steve Anton, Zachary Tatlock, Michael D. Ernst, and Thomas Anderson. 2016. Planning for Change in a Formal Verification of the Raft Consensus Protocol. In Proc. of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs (CPP ’16). ACM, New York, NY, USA. 154–165. https://doi.org/10.1145/2854065.2854081 Google Scholar
Digital Library
- Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan R. K. Ports. 2015. Building Consistent Transactions with Inconsistent Replication. In Proc. of the 25th Symposium on Operating Systems Principles (SOSP ’15). ACM, New York, NY, USA. 263–278. https://doi.org/10.1145/2815400.2815404 Google Scholar
Digital Library
Index Terms
Much ADO about failures: a fault-aware model for compositional verification of strongly consistent distributed systems
Recommendations
Verdi: a framework for implementing and formally verifying distributed systems
PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and ImplementationDistributed systems are difficult to implement correctly because they must handle both concurrency and failures: machines may crash at arbitrary points and networks may reorder, drop, or duplicate packets. Further, their behavior is often too complex ...
Verdi: a framework for implementing and formally verifying distributed systems
PLDI '15Distributed systems are difficult to implement correctly because they must handle both concurrency and failures: machines may crash at arbitrary points and networks may reorder, drop, or duplicate packets. Further, their behavior is often too complex ...
Planning for change in a formal verification of the raft consensus protocol
CPP 2016: Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and ProofsWe present the first formal verification of state machine safety for the Raft consensus protocol, a critical component of many distributed systems. We connected our proof to previous work to establish an end-to-end guarantee that our implementation ...






Comments