Abstract
Testing the components of a distributed system is challenging because it requires considering not just the state of a component, but also the sequence of messages it may receive from the rest of the system or the environment. Such messages may vary in type and content and, more particularly, in the frequency at which they are generated. All of these factors, in the right combination, may lead to faulty behavior. In this paper we present an approach that addresses these challenges by systematically analyzing a component of a distributed system to identify the specific message sequences and frequencies at which a failure can occur. At the core of the analysis are three steps: the generation of a test driver that defines the space of message sequences to be generated, the exploration of that space through dynamic symbolic execution, and the timing and analysis of the generated tests to identify problematic frequencies. We implemented our approach in the context of the popular Robot Operating System (ROS) and investigated its application to three systems of increasing complexity.
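The following is a minimal sketch of the general idea, not the implementation described in the paper. It makes several assumptions: `ToyComponent`, its `MOVE_DURATION` constant, and every function name are hypothetical; a bounded exhaustive enumeration stands in for the dynamic symbolic execution used by the approach; and message timing is simulated rather than measured on a running system. Each candidate message sequence is replayed at several rates, and the sketch records which rates make the component fail.

```python
from itertools import product


class ToyComponent:
    """Hypothetical component: fails if 'stop' arrives while 'move' is still running."""

    MOVE_DURATION = 0.05  # assumed time (seconds) the component stays busy after 'move'

    def __init__(self):
        self.busy_until = 0.0
        self.failed = False

    def on_message(self, msg, now):
        if msg == "move":
            self.busy_until = now + self.MOVE_DURATION
        elif msg == "stop" and now < self.busy_until:
            self.failed = True  # 'stop' interrupted an unfinished 'move'


def message_sequences(alphabet, max_len):
    """Stand-in for the test driver: enumerate all sequences up to max_len.

    The paper explores this space with dynamic symbolic execution;
    plain enumeration is used here only to keep the sketch self-contained.
    """
    for length in range(1, max_len + 1):
        yield from product(alphabet, repeat=length)


def fails_at(sequence, hz):
    """Replay one sequence at a fixed rate, using simulated timestamps."""
    component = ToyComponent()
    period = 1.0 / hz
    for i, msg in enumerate(sequence):
        component.on_message(msg, now=i * period)
    return component.failed


def problematic(alphabet, max_len, rates_hz):
    """Return (sequence, failing_rates) pairs for sequences that fail at some rate."""
    findings = []
    for seq in message_sequences(alphabet, max_len):
        bad_rates = [hz for hz in rates_hz if fails_at(seq, hz)]
        if bad_rates:
            findings.append((seq, bad_rates))
    return findings


findings = problematic(["move", "stop"], max_len=2, rates_hz=[5, 10, 50, 100])
for seq, rates in findings:
    print(seq, "fails at", rates, "Hz")
```

In this toy setting only the sequence `move` followed by `stop` fails, and only at rates whose message period is shorter than the assumed `MOVE_DURATION`. This illustrates why both the sequence and the frequency have to be explored together.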
Detecting problematic message sequences and frequencies in distributed systems
OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications