Abstract
It is notoriously challenging to develop parallel software systems that are both scalable and correct. Runtime support for parallelism---such as multithreaded record & replay, data race detectors, transactional memory, and enforcement of stronger memory models---helps achieve these goals, but existing commodity solutions slow programs substantially in order to track (i.e., detect or control) an execution's cross-thread dependences accurately. Prior work tracks cross-thread dependences either "pessimistically," slowing every program access, or "optimistically," allowing for lightweight instrumentation of most accesses but dramatically slowing accesses involved in cross-thread dependences.
This paper seeks to hybridize pessimistic and optimistic tracking, which is challenging because there exists a fundamental mismatch between pessimistic and optimistic tracking. We address this challenge based on insights about how dependence tracking and program synchronization interact, and introduce a novel approach called hybrid tracking. Hybrid tracking is suitable for building efficient runtime support, which we demonstrate by building hybrid-tracking-based versions of a dependence recorder and a region serializability enforcer. An adaptive, profile-based policy makes runtime decisions about switching between pessimistic and optimistic tracking. Our evaluation shows that hybrid tracking enables runtime support to overcome the performance limitations of both pessimistic and optimistic tracking alone.
Supplemental Material
Available for Download
- M. Abadi, T. Harris, and M. Mehrara. Transactional Memory with Strong Atomicity Using Off-the-Shelf Memory Protection Hardware. In PPoPP, pages 185--196, 2009. Google Scholar
Digital Library
- S. V. Adve and H.-J. Boehm. Memory Models: A Case for Rethinking Parallel Languages and Hardware. CACM, 53:90--101, 2010. Google Scholar
Digital Library
- S. V. Adve and M. D. Hill. Weak Ordering---A New Definition. In ISCA, pages 2--14, 1990. Google Scholar
Digital Library
- B. Alpern, S. Augart, S. M. Blackburn, M. Butrico, A. Cocchi, P. Cheng, J. Dolby, S. Fink, D. Grove, M. Hind, K. S. McKinley, M. Mergen, J. E. B. Moss, T. Ngo, and V. Sarkar. The Jikes Research Virtual Machine Project: Building an Open-Source Research Community. IBM Systems Journal, 44:399--417, 2005. Google Scholar
Digital Library
- D. F. Bacon, R. Konuru, C. Murthy, and M. Serrano. Thin Locks: Featherweight Synchronization for Java. In PLDI, pages 258--268, 1998. Google Scholar
Digital Library
- S. Biswas, M. Zhang, M. D. Bond, and B. Lucia. Valor: Efficient, Software-only Region Conflict Exceptions. In OOPSLA, pages 241--259, 2015. Google Scholar
Digital Library
- S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. In OOPSLA, pages 169--190, 2006. Google Scholar
Digital Library
- H.-J. Boehm. Position paper: Nondeterminism is Unavoidable, but Data Races are Pure Evil. In RACES, pages 9--14, 2012. Google Scholar
Digital Library
- H.-J. Boehm and S. V. Adve. Foundations of the C++ Concurrency Memory Model. In PLDI, pages 68--78, 2008. Google Scholar
Digital Library
- M. D. Bond, M. Kulkarni, M. Cao, M. Fathi Salmi, and J. Huang. Efficient Deterministic Replay of Multithreaded Executions in a Managed Language Virtual Machine. In PPPJ, pages 90--101, 2015. Google Scholar
Digital Library
- M. D. Bond, M. Kulkarni, M. Cao, M. Zhang, M. Fathi Salmi, S. Biswas, A. Sengupta, and J. Huang. Octet: Capturing and Controlling Cross-Thread Dependences Efficiently. In OOPSLA, pages 693--712, 2013. Google Scholar
Digital Library
- C. Boyapati, R. Lee, and M. Rinard. Ownership Types for Safe Programming: Preventing Data Races and Deadlocks. In OOPSLA, pages 211--230, 2002. Google Scholar
Digital Library
- M. Burrows. How to Implement Unnecessary Mutexes. In Computer Systems Theory, Technology, and Applications, pages 51--57. Springer--Verlag, 2004.Google Scholar
- M. Cao, M. Zhang, and M. D. Bond. Drinking from Both Glasses: Adaptively Combining Pessimistic and Optimistic Synchronization for Efficient Parallel Runtime Support. In WoDet, 2014.Google Scholar
- J.-D. Choi, K. Lee, A. Loginov, R. O'Callahan, V. Sarkar, and M. Sridharan. Efficient and Precise Datarace Detection for Multithreaded Object-Oriented Programs. In PLDI, pages 258--269, 2002. Google Scholar
Digital Library
- D. Dice, A. Kogan, Y. Lev, T. Merrifield, and M. Moir. Adaptive Integration of Hardware and Software Lock Elision Techniques. In SPAA, pages 188--197, 2014. Google Scholar
Digital Library
- T. Elmas, S. Qadeer, and S. Tasiran. Goldilocks: A Race and Transaction-Aware Java Runtime. In PLDI, pages 245--255, 2007. Google Scholar
Digital Library
- C. Flanagan and S. N. Freund. FastTrack: Efficient and Precise Dynamic Race Detection. In PLDI, pages 121--133, 2009. Google Scholar
Digital Library
- C. Flanagan, S. N. Freund, and J. Yi. Velodrome: A Sound and Complete Dynamic Atomicity Checker for Multithreaded Programs. In PLDI, pages 293--303, 2008. Google Scholar
Digital Library
- T. Harris and K. Fraser. Language Support for Lightweight Transactions. In OOPSLA, pages 388--402, 2003. Google Scholar
Digital Library
- T. Kalibera, M. Mole, R. Jones, and J. Vitek. A Black-box Approach to Understanding Concurrency in DaCapo. In OOPSLA, pages 335--354, 2012. Google Scholar
Digital Library
- K. Kawachiya, A. Koseki, and T. Onodera. Lock Reservation: Java Locks Can Mostly Do Without Atomic Operations. In OOPSLA, pages 130--141, 2002. Google Scholar
Digital Library
- L. Lamport. Time, Clocks, and the Ordering of Events in a Distributed System. CACM, 21(7):558--565, 1978. Google Scholar
Digital Library
- T. J. LeBlanc and J. M. Mellor-Crummey. Debugging Parallel Programs with Instant Replay. IEEE TOC, 36:471--482, 1987. Google Scholar
Digital Library
- D. Lee, P. M. Chen, J. Flinn, and S. Narayanasamy. Chimera: Hybrid Program Analysis for Determinism. In PLDI, pages 463--474, 2012. Google Scholar
Digital Library
- D. Lee, B. Wester, K. Veeraraghavan, S. Narayanasamy, P. M. Chen, and J. Flinn. Respec: Efficient Online Multiprocessor Replay via Speculation and External Determinism. In ASPLOS, pages 77--90, 2010. Google Scholar
Digital Library
- J. Manson, W. Pugh, and S. V. Adve. The Java Memory Model. In POPL, pages 378--391, 2005. Google Scholar
Digital Library
- H. S. Matar, I. Kuru, S. Tasiran, and R. Dementiev. Accelerating Precise Race Detection Using Commercially-Available Hardware Transactional Memory Support. In WoDet, 2014.Google Scholar
- J. Ouyang, P. M. Chen, J. Flinn, and S. Narayanasamy. ...and region serializability for all. In HotPar, 2013.Google Scholar
- S. Park, Y. Zhou, W. Xiong, Z. Yin, R. Kaushik, K. H. Lee, and S. Lu. PRES: Probabilistic Replay with Execution Sketching on Multiprocessors. In SOSP, pages 177--192, 2009. Google Scholar
Digital Library
- C. G. Ritson and F. R. Barnes. An Evaluation of Intel's Restricted Transactional Memory for CPAs. In CPA, pages 271--292, 2013.Google Scholar
- M. Ronsse and K. De Bosschere. RecPlay: A Fully Integrated Practical Record/Replay System. TOCS, 17:133--152, 1999. Google Scholar
Digital Library
- K. Russell and D. Detlefs. Eliminating Synchronization-Related Atomic Operations with Biased Locking and Bulk Rebiasing. In OOPSLA, pages 263--272, 2006. Google Scholar
Digital Library
- B. Saha, A.-R. Adl-Tabatabai, R. L. Hudson, C. C. Minh, and B. Hertzberg. McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime. In PPoPP, pages 187--197, 2006. Google Scholar
Digital Library
- D. J. Scales, K. Gharachorloo, and C. A. Thekkath. Shasta: A Low Overhead, Software-Only Approach for Supporting Fine-Grain Shared Memory. In ASPLOS, pages 174--185, 1996. Google Scholar
Digital Library
- A. Sengupta, S. Biswas, M. Zhang, M. D. Bond, and M. Kulkarni. Hybrid Static--Dynamic Analysis for Statically Bounded Region Serializability. In ASPLOS, pages 561--575, 2015. Google Scholar
Digital Library
- T. Usui, R. Behrends, J. Evans, and Y. Smaragdakis. Adaptive Locks: Combining Transactions and Locks for Efficient Concurrency. In PACT, pages 3--14, 2009. Google Scholar
Digital Library
- K. Veeraraghavan, D. Lee, B. Wester, J. Ouyang, P. M. Chen, J. Flinn, and S. Narayanasamy. DoublePlay: Parallelizing Sequential Logging and Replay. In ASPLOS, pages 15--26, 2011. Google Scholar
Digital Library
- C. von Praun and T. R. Gross. Object Race Detection. In OOPSLA, pages 70--82, 2001. Google Scholar
Digital Library
- C. von Praun and T. R. Gross. Static Conflict Analysis for Multi-Threaded Object-Oriented Programs. In PLDI, pages 115--128, 2003. Google Scholar
Digital Library
- D. Weeratunge, X. Zhang, and S. Jagannathan. Analyzing Multicore Dumps to Facilitate Concurrency Bug Reproduction. In ASPLOS, pages 155--166, 2010. Google Scholar
Digital Library
- R. M. Yoo, C. J. Hughes, K. Lai, and R. Rajwar. Performance Evaluation of Intel Transactional Synchronization Extensions for High-Performance Computing. In SC, pages 19:1--19:11, 2013. Google Scholar
Digital Library
- O. Ziv, A. Aiken, G. Golan-Gueta, G. Ramalingam, and M. Sagiv. Composing Concurrency Control. In PLDI, pages 240--249, 2015. Google Scholar
Digital Library
Recommendations
Drinking from both glasses: combining pessimistic and optimistic tracking of cross-thread dependences
PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel ProgrammingIt is notoriously challenging to develop parallel software systems that are both scalable and correct. Runtime support for parallelism---such as multithreaded record & replay, data race detectors, transactional memory, and enforcement of stronger memory ...
From drinking philosophers to asynchronous path-following robots
AbstractIn this paper, we consider the multi-robot path execution problem where a group of robots move on predefined paths from their initial to target positions while avoiding collisions and deadlocks in the face of asynchrony. We first show ...






Comments