Abstract
A real-world distributed system is rarely implemented as a standalone monolithic system. Instead, it is composed of multiple independent interacting components that together ensure the desired system-level specification. One can scale systematic testing to large, industrial-scale implementations by decomposing the system-level testing problem into a collection of simpler component-level testing problems.
This paper proposes techniques for compositional programming and testing of distributed systems, with two central contributions: (1) we propose a module system based on the theory of compositional trace refinement for dynamic systems consisting of asynchronously communicating state machines, where state machines can be created dynamically and the communication topology of the existing state machines can change at runtime; (2) we present ModP, a programming system that implements our module system to enable compositional (assume-guarantee) reasoning about distributed systems.
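To make the notion of trace refinement concrete, the following is a minimal Python sketch, not ModP code. It assumes a much simpler setting than the paper's: components are synchronous and finite, there is no dynamic creation, and refinement is only checked up to a bounded depth. All names (`traces`, `refines`, `IMPL`, `SPEC`, and the toy participant) are hypothetical and chosen for illustration. The idea it shows is that a component can be replaced by a more permissive abstraction when every observable trace of the component is also a trace of the abstraction.

```python
from itertools import product

# Illustrative sketch only: these names are hypothetical and not part of ModP.

def traces(step, init, inputs, depth):
    """Enumerate the observable traces of a component for all input
    sequences of length `depth`.

    `step(state, event)` returns a set of (next_state, output) pairs, so a
    component may be nondeterministic. A trace records (event, output)
    pairs only; internal states are hidden.
    """
    result = set()
    for seq in product(inputs, repeat=depth):
        frontier = {(init, ())}
        for ev in seq:
            frontier = {
                (nxt, trace + ((ev, out),))
                for state, trace in frontier
                for nxt, out in step(state, ev)
            }
        result |= {trace for _, trace in frontier}
    return result

def refines(impl, spec, inputs, depth):
    """Bounded trace refinement: every impl trace is also a spec trace."""
    return traces(*impl, inputs, depth) <= traces(*spec, inputs, depth)

def impl_step(state, ev):
    # Toy participant: votes "yes" once per round, "no" if already prepared.
    if ev == "prepare":
        if state == "idle":
            return {("prepared", "yes")}
        return {(state, "no")}
    return {("idle", "ack")}  # ev == "reset"

def spec_step(state, ev):
    # Abstraction: may answer "yes" or "no" to any prepare request.
    if ev == "prepare":
        return {("any", "yes"), ("any", "no")}
    return {("any", "ack")}  # ev == "reset"

EVENTS = ["prepare", "reset"]
IMPL = (impl_step, "idle")
SPEC = (spec_step, "any")
```

In this sketch, `refines(IMPL, SPEC, EVENTS, 3)` holds, so a test of some other component could use the smaller `SPEC` in place of `IMPL`. The converse check fails, because the abstraction can answer "no" to the first request while the toy implementation cannot.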
We demonstrate the efficacy of our framework by building two practical fault-tolerant distributed systems, a transaction-commit service and a replicated hash-table. ModP helps implement these systems modularly and validate them via compositional testing. We empirically demonstrate that the abstraction-based compositional reasoning approach helps amplify the coverage during testing and scale it to real-world distributed systems. The distributed services built using ModP achieve performance comparable to open-source equivalents.
Index Terms
Compositional programming and testing of dynamic distributed systems