ABSTRACT
The most intuitive memory model for shared-memory multithreaded programming is sequential consistency (SC), but it disallows many compiler and hardware optimizations, thereby impacting performance. Data-race-free (DRF) models, such as the proposed C++0x memory model, guarantee SC execution for data-race-free programs. But these models provide no guarantee at all for racy programs, compromising the safety and debuggability of such programs. To address the safety issue, the Java memory model, which is also based on the DRF model, provides a weak semantics for racy executions. However, this semantics is subtle and complex, making it difficult for programmers to reason about their programs and for compiler writers to ensure the correctness of compiler optimizations.
We present the DRFx memory model, which is simple for programmers to understand and use while still supporting many common optimizations. We introduce a memory model (MM) exception which can be signaled to halt execution. If a program executes without throwing this exception, then DRFx guarantees that the execution is SC. If a program throws an MM exception during an execution, then DRFx guarantees that the program has a data race. We observe that SC violations can be detected in hardware through a lightweight form of conflict detection. Furthermore, our model safely allows aggressive compiler and hardware optimizations within compiler-designated program regions. We formalize our memory model, prove several properties about this model, describe a compiler and hardware design suitable for DRFx, and evaluate the performance overhead due to our compiler and hardware requirements.
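The abstract does not spell out how conflict detection works, so the following is only a minimal sketch of the general idea, under stated assumptions: each compiler-designated region is summarized by the set of locations it reads and writes, and two regions from different threads that execute concurrently conflict when one writes a location the other reads or writes. The names `Region`, `conflicts`, `check_concurrent`, and `MMException` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


class MMException(Exception):
    """Hypothetical stand-in for the memory model (MM) exception."""


@dataclass
class Region:
    """Assumed summary of one region: owning thread plus read and write sets."""
    thread: int
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def conflicts(a, b):
    """Return True if regions a and b, from different threads, access a
    common location and at least one of those accesses is a write."""
    if a.thread == b.thread:
        return False
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


def check_concurrent(regions):
    """Raise MMException if any two of the given regions conflict.

    The caller is assumed to pass only regions that overlap in time.
    """
    for i, a in enumerate(regions):
        for b in regions[i + 1:]:
            if conflicts(a, b):
                raise MMException(
                    f"conflicting regions in threads {a.thread} and {b.thread}"
                )
    return True
```

In this sketch, two regions that only read the same location pass the check, while a region that writes `y` alongside a concurrent region in another thread that reads `y` raises `MMException`. This mirrors the guarantee stated above only loosely: an exception is raised only when two threads perform conflicting accesses in concurrently executing regions.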