Efficient parallel determinacy race detection for two-dimensional dags

Published: 10 February 2018

Abstract

A program has a determinacy race if logically parallel parts of the program access the same memory location and at least one of the accesses is a write. Such races are generally bugs because they lead to non-deterministic behavior: different schedules of the program can produce different results. Most prior work on detecting these races focuses on a subclass of programs with fork-join parallelism.
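
The definition can be made concrete with a small sketch. This is a brute-force check written only to illustrate the definition, not the detection algorithm the paper proposes. The dag encoding (a dict of successor lists) and the names `reachable` and `find_races` are assumptions made for this example.

```python
from itertools import combinations


def reachable(dag, src, dst):
    """Return True if dst can be reached from src by following dag edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dag.get(node, ()))
    return False


def find_races(dag, accesses):
    """Report pairs of accesses that form a determinacy race.

    accesses is a list of (node, location, kind) with kind "r" or "w".
    Two accesses race if they touch the same location, at least one is a
    write, and neither node is ordered before the other in the dag.
    """
    races = []
    for (u, loc_u, kind_u), (v, loc_v, kind_v) in combinations(accesses, 2):
        same_location = loc_u == loc_v
        has_write = "w" in (kind_u, kind_v)
        parallel = (
            u != v
            and not reachable(dag, u, v)
            and not reachable(dag, v, u)
        )
        if same_location and has_write and parallel:
            races.append((u, v, loc_u))
    return races


# Fork-join diamond (hypothetical): a forks b and c, which join at d.
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
accesses = [("b", "x", "w"), ("c", "x", "r"), ("a", "x", "w"), ("d", "x", "r")]
races = find_races(dag, accesses)
print(races)
```

Only the write in `b` and the read in `c` are reported, because every other pair is ordered by a path in the dag. The pairwise reachability search makes this far too slow for real programs, which is the cost that on-the-fly detectors are designed to avoid.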

This paper presents a race-detection algorithm, 2D-Order, for detecting races in a more general class of programs, namely programs whose dependence structure can be represented as planar dags embedded in 2D grids. Such dependence structures arise from programs that use pipelined parallelism or dynamic programming recurrences. Given a computation with T₁ work and T∞ span, 2D-Order executes it while also detecting races in O(T₁/P + T∞) time on P processors, which is asymptotically optimal.
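
As a rough illustration of the setting, the sketch below models the simplest such structure: a full n × n grid dag, as might arise from a dynamic programming recurrence in which each cell depends on its upper and left neighbours. It is not the 2D-Order algorithm itself. The full-grid assumption, unit-cost nodes, and all function names are assumptions made for this example, and the time bound is evaluated without the constants hidden by the O-notation.

```python
def precedes(u, v):
    """True if node u must run before node v in a full grid dag.

    Nodes are (row, col) pairs. Assumes every node has an edge to its
    right and lower neighbours, so u reaches v exactly when u is weakly
    above and to the left of v.
    """
    return u != v and u[0] <= v[0] and u[1] <= v[1]


def logically_parallel(u, v):
    """True if neither node is ordered before the other."""
    return u != v and not precedes(u, v) and not precedes(v, u)


def grid_work_and_span(n):
    """Work and span of an n x n full grid dag with unit-cost nodes."""
    work = n * n      # T_1: total number of nodes
    span = 2 * n - 1  # T_inf: nodes on the longest corner-to-corner path
    return work, span


def time_bound(work, span, processors):
    """Evaluate T_1 / P + T_inf, ignoring constant factors."""
    return work / processors + span


work, span = grid_work_and_span(1000)
print(logically_parallel((1, 3), (2, 2)))  # cells on the same anti-diagonal
print(precedes((1, 1), (2, 2)))
print(time_bound(work, span, 16))
```

For the 1000 × 1000 grid on 16 processors the work term (62,500) dominates the span term (1,999), which is the regime in which a bound of this form implies near-linear speedup.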

We also implemented PRacer, a race-detection algorithm based on 2D-Order for Cilk-P, which is a language for expressing pipeline parallelism. Empirical results demonstrate that PRacer incurs reasonable overhead and exhibits scalability similar to the baseline (executions without race detection) when running on multiple cores.

Published in

ACM SIGPLAN Notices, Volume 53, Issue 1 (PPoPP '18)
January 2018, 426 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3200691

PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2018, 442 pages
ISBN: 9781450349826
DOI: 10.1145/3178487

Copyright © 2018 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article
