research-article

Dynamic deadlock verification for general barrier synchronisation

Published: 24 January 2015
Abstract

We present Armus, a dynamic verification tool for deadlock detection and avoidance specialised in barrier synchronisation. Barriers coordinate the execution of groups of tasks and serve as a building block of parallel computing. Our tool verifies more barrier synchronisation patterns than the current state of the art. To improve the scalability of verification, we introduce a novel event-based representation of concurrency constraints and a graph-based technique for deadlock analysis. The implementation is distributed and fault-tolerant, and can verify X10 and Java programs. To formalise the notion of barrier deadlock, we introduce a core language expressive enough to represent the three most widespread barrier synchronisation patterns: group, split-phase, and dynamic membership. We propose a graph analysis technique that selects between two alternative graph representations: the Wait-For Graph, which favours programs with more tasks than barriers, and the State Graph, which is optimised for programs with more barriers than tasks. We prove that finding a deadlock in either representation is equivalent, and that the verification algorithm is sound and complete with respect to the notion of deadlock in our core language. Armus is evaluated with three benchmark suites in local and distributed scenarios. The benchmarks show that graph analysis with automatic graph-representation selection can record a 7-fold execution increase versus the traditional fixed graph representation. The performance measurements for distributed deadlock detection across 64 processes show negligible overheads.
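
The idea of checking either graph representation for a cycle can be illustrated with a small sketch. This is not the Armus implementation: the data model (each blocked task mapped to the set of events it waits for and the set of events it impedes, with an event written as a `(barrier, phase)` pair), the edge definitions, and all function names below are simplifying assumptions made for illustration only.

```python
# Hypothetical sketch, NOT the Armus code base. Assumed model: a snapshot maps
# each task to (waits, impedes), two sets of events; an event is (barrier, phase).

def has_cycle(nodes, edges):
    """Depth-first search cycle check on a directed graph."""
    adj = {n: [] for n in nodes}
    for src, dst in edges:
        adj[src].append(dst)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {n: WHITE for n in nodes}

    def visit(n):
        colour[n] = GREY
        for m in adj[n]:
            if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)


def wait_for_graph(state):
    """Nodes are tasks; t1 -> t2 when t1 waits for an event that t2 impedes."""
    nodes = set(state)
    edges = {
        (t1, t2)
        for t1, (waits, _) in state.items()
        for t2, (_, impedes) in state.items()
        if waits & impedes
    }
    return nodes, edges


def state_graph(state):
    """Nodes are events; e1 -> e2 when some task impedes e1 and waits for e2."""
    nodes, edges = set(), set()
    for waits, impedes in state.values():
        nodes |= waits | impedes
        edges |= {(e1, e2) for e1 in impedes for e2 in waits}
    return nodes, edges


def is_deadlocked(state, representation="wfg"):
    """Report a deadlock when the chosen graph contains a cycle."""
    build = wait_for_graph if representation == "wfg" else state_graph
    return has_cycle(*build(state))


# t1 waits at barrier "a" without having arrived at "b"; t2 does the opposite.
DEADLOCKED = {
    "t1": ({("a", 1)}, {("b", 1)}),
    "t2": ({("b", 1)}, {("a", 1)}),
}

# t2 is still running (waits for nothing), so t1 can eventually proceed.
PROGRESSING = {
    "t1": ({("a", 1)}, {("b", 1)}),
    "t2": (set(), {("a", 1)}),
}

if __name__ == "__main__":
    for name, snapshot in [("DEADLOCKED", DEADLOCKED), ("PROGRESSING", PROGRESSING)]:
        print(name, is_deadlocked(snapshot, "wfg"), is_deadlocked(snapshot, "sg"))
```

In this toy model both representations give the same verdict on each snapshot, which mirrors the equivalence stated in the abstract. The number of nodes differs, though: the task-based graph grows with the number of tasks and the event-based graph with the number of barrier events.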

Published in

ACM SIGPLAN Notices, Volume 50, Issue 8 (PPoPP '15)
August 2015
290 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2858788
Editor: Andy Gill

PPoPP 2015: Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
January 2015
290 pages
ISBN: 9781450332057
DOI: 10.1145/2688500

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
