Abstract
Detecting data races in multithreaded programs is a crucial part of debugging such programs, but traditional data race detectors are too slow to use routinely. This paper shows how to speed up race detection by spreading the work across multiple cores. Our strategy relies on uniparallelism, which executes time intervals of a program (called epochs) in parallel to provide scalability, but executes all threads from a single epoch on a single core to eliminate locking overhead. We use several techniques to make parallelization effective: dividing race detection into three phases, predicting a subset of the analysis state, eliminating sequential work via transitive reduction, and reducing the work needed to maintain multiple versions of analysis via factorization. We demonstrate our strategy by parallelizing a happens-before detector and a lockset-based detector. We find that uniparallelism can significantly speed up data race detection. With four times as many cores as the original application, our strategy speeds up the median execution time by 4.4x for a happens-before detector and 3.3x for a lockset race detector. Even on the same number of cores as the conventional detectors, uniparallelism's ability to elide analysis locks reduces the median overhead by 13% for a happens-before detector and 8% for a lockset detector.
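For readers unfamiliar with the kind of analysis being parallelized, the following is a minimal sketch of a conventional, sequential happens-before check based on vector clocks. It is not the detector evaluated in the paper and does not implement uniparallelism, epochs, or any of the four techniques listed above; the class name `HappensBeforeDetector`, its methods, and the event format are all hypothetical names chosen for illustration. It assumes locks are the only synchronization and that events arrive as one totally ordered trace.

```python
from collections import defaultdict


class HappensBeforeDetector:
    """Illustrative vector-clock race detector (hypothetical, simplified)."""

    def __init__(self):
        # clocks[t][u]: latest step of thread u known to precede thread t.
        self.clocks = defaultdict(lambda: defaultdict(int))
        self.lock_clocks = {}                 # lock -> clock at last release
        self.last_write = {}                  # var -> (thread, clock snapshot)
        self.last_reads = defaultdict(dict)   # var -> {thread: clock snapshot}
        self.races = []                       # (var, earlier thread, later thread)

    def _join(self, t, other):
        # Merge another clock into thread t's clock (component-wise max).
        for u, c in other.items():
            if c > self.clocks[t][u]:
                self.clocks[t][u] = c

    def _step(self, t):
        # Advance t's own component and return a snapshot of its clock.
        self.clocks[t][t] += 1
        return dict(self.clocks[t])

    def _check(self, var, t, prev_thread, prev_clock):
        # The earlier access is ordered before t only if t has already
        # observed that step of prev_thread; otherwise the pair races.
        if prev_clock[prev_thread] > self.clocks[t][prev_thread]:
            self.races.append((var, prev_thread, t))

    def acquire(self, t, lock):
        self._join(t, self.lock_clocks.get(lock, {}))

    def release(self, t, lock):
        self.lock_clocks[lock] = self._step(t)

    def read(self, t, var):
        now = self._step(t)
        if var in self.last_write:
            self._check(var, t, *self.last_write[var])
        self.last_reads[var][t] = now

    def write(self, t, var):
        now = self._step(t)
        if var in self.last_write:
            self._check(var, t, *self.last_write[var])
        for reader, clock in self.last_reads[var].items():
            self._check(var, t, reader, clock)
        self.last_write[var] = (t, now)
        self.last_reads[var].clear()


# Two unsynchronized writes to the same variable are reported as a race.
racy = HappensBeforeDetector()
racy.write("T1", "x")
racy.write("T2", "x")
print(racy.races)  # [('x', 'T1', 'T2')]

# The same writes protected by a common lock are ordered, so nothing is reported.
safe = HappensBeforeDetector()
for thread in ("T1", "T2"):
    safe.acquire(thread, "m")
    safe.write(thread, "x")
    safe.release(thread, "m")
print(safe.races)  # []
```

Every access in this sketch reads and updates shared analysis state (`clocks`, `last_write`, `last_reads`). A detector that runs analysis for several application threads concurrently would have to protect that state with locks, which is the overhead the abstract says uniparallelism avoids by running all threads of an epoch on one core.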
Parallelizing data race detection
ASPLOS '13: Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems