Abstract
The Message Passing Interface (MPI) is the standard API for parallelization in high-performance and scientific computing. Communication deadlocks are a frequent problem in MPI programs, and this article addresses the problem of discovering such deadlocks. We begin by showing that if an MPI program is single path, the problem of discovering communication deadlocks is NP-complete. We then present a novel propositional encoding scheme that captures the existence of communication deadlocks. The encoding is based on modeling executions with partial orders and implemented in a tool called MOPPER. The tool executes an MPI program, collects the trace, builds a formula from the trace using the propositional encoding scheme, and checks its satisfiability. Finally, we present experimental results that quantify the benefit of the approach in comparison to other analyzers and demonstrate that it offers a scalable solution for single-path programs.
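To make the idea of predicting a deadlock from a single trace concrete, the following is a minimal sketch and not MOPPER's method. It replaces the propositional encoding and SAT check with brute-force enumeration of send/receive matchings. It assumes a toy model in which every operation is a blocking, zero-buffered (rendezvous) send or receive. The function `find_deadlock`, the trace format, and the three-process example are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple, Union

# Hypothetical trace format: one list of operations per process.
# ("send", dest) sends to process `dest`;
# ("recv", src) receives from process `src`, or from anyone if src == "*".
Op = Tuple[str, Union[int, str]]


def find_deadlock(trace: List[List[Op]]) -> Optional[Tuple[int, ...]]:
    """Explore every way sends can be matched with receives.

    Returns the per-process program counters of a reachable deadlocked
    state, or None if every matching runs all processes to completion.
    Assumes blocking rendezvous semantics (an illustrative simplification).
    """
    n = len(trace)
    start = tuple(0 for _ in range(n))
    seen = set()
    stack = [start]
    while stack:
        pcs = stack.pop()
        if pcs in seen:
            continue
        seen.add(pcs)
        successors = []
        for s in range(n):
            if pcs[s] >= len(trace[s]):
                continue  # process s has finished
            kind, dest = trace[s][pcs[s]]
            if kind != "send":
                continue
            r = dest
            if not isinstance(r, int) or r == s or not 0 <= r < n:
                continue  # malformed destination, never matches
            if pcs[r] >= len(trace[r]):
                continue  # receiver has finished
            rkind, src = trace[r][pcs[r]]
            if rkind == "recv" and src in ("*", s):
                nxt = list(pcs)
                nxt[s] += 1
                nxt[r] += 1
                successors.append(tuple(nxt))
        finished = all(pcs[p] == len(trace[p]) for p in range(n))
        if not successors and not finished:
            return pcs  # some process is blocked and nothing can fire
        stack.extend(successors)
    return None


# Process 1 first posts a wildcard receive, then a receive from process 0.
EXAMPLE = [
    [("send", 1)],
    [("recv", "*"), ("recv", 0)],
    [("send", 1)],
]

if __name__ == "__main__":
    print(find_deadlock(EXAMPLE))
```

In this example, a run in which the wildcard receive happens to match process 2 completes normally, while the alternative matching with process 0 leaves process 1 waiting on a sender that has already finished. Explicit enumeration of this kind grows exponentially with the number of wildcard receives, which is the cost a symbolic encoding is meant to avoid.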