Abstract
Traditional debuggers are of limited value for modern scientific codes that manipulate large, complex data structures. This paper introduces a novel debug-time assertion, called a "statistical assertion", that lets a user reason about large data structures; its primitives are parallelised so that evaluation remains efficient. We present the design and implementation of statistical assertions and illustrate the debugging technique with a molecular dynamics simulation. We evaluate the performance of the tool on a 12,000-core Cray XE6.
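As a rough illustration of the idea, the sketch below checks a statistical property (the global mean) of data that is split into partitions, reducing each partition to a small summary and merging the summaries. It is not the paper's tool or its assertion syntax: the names `Partial`, `summarise`, `merge`, `global_stats`, and `assert_mean_within` are hypothetical, and the partitions are plain Python lists standing in for data held by separate processes.

```python
import math
from dataclasses import dataclass


@dataclass
class Partial:
    """Per-partition summary: element count, sum, and sum of squares."""
    n: int
    s: float
    ss: float


def summarise(chunk):
    """Reduce one partition of the data to a small summary (hypothetical helper)."""
    return Partial(len(chunk), sum(chunk), sum(x * x for x in chunk))


def merge(a, b):
    """Combine two summaries. The operation is associative, so summaries
    could be merged in any grouping, for example in a tree reduction."""
    return Partial(a.n + b.n, a.s + b.s, a.ss + b.ss)


def global_stats(chunks):
    """Return (mean, population standard deviation) over all partitions."""
    total = Partial(0, 0.0, 0.0)
    for chunk in chunks:
        total = merge(total, summarise(chunk))
    if total.n == 0:
        raise ValueError("no data to summarise")
    mean = total.s / total.n
    # Clamp at zero to guard against small negative values from rounding.
    var = max(total.ss / total.n - mean * mean, 0.0)
    return mean, math.sqrt(var)


def assert_mean_within(chunks, expected, tol):
    """Fail if the global mean differs from `expected` by more than `tol`."""
    mean, _ = global_stats(chunks)
    if abs(mean - expected) > tol:
        raise AssertionError(f"mean {mean:.6g} not within {tol} of {expected}")
    return mean
```

In a molecular dynamics setting, a check of this kind might be applied to something like one component of the particle velocities; that use is an assumption made here for illustration, not a case taken from the paper.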
Scalable parallel debugging with statistical assertions
PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming