research-article

Efficient processor support for DRFx, a memory model with exceptions

Published: 05 March 2011

Abstract

A longstanding challenge of shared-memory concurrency is to provide a memory model that allows for efficient implementation while providing strong and simple guarantees to programmers. The C++0x and Java memory models admit a wide variety of compiler and hardware optimizations and provide sequentially consistent (SC) semantics for data-race-free programs. However, they either do not provide any semantics (C++0x) or provide hard-to-understand semantics (Java) for racy programs, compromising the safety and debuggability of such programs.

In earlier work, we proposed the DRFx memory model, which addresses this problem by dynamically detecting potential violations of SC due to the interaction of compiler or hardware optimizations with data races and halting execution upon detection. In this paper, we present a detailed micro-architecture design for supporting the DRFx memory model, formalize the design and prove its correctness, and evaluate the design using a hardware simulator. We describe a set of DRFx-compliant, complexity-effective optimizations that allow us to attain performance close to that of TSO (Total Store Order) and DRF0 while providing strong guarantees for all programs.
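
To make the notion of an SC violation in a racy program concrete, the following is a minimal illustrative sketch; it is not taken from the paper. It enumerates every sequentially consistent interleaving of the classic store-buffering litmus test (two threads, each writing one shared variable and then reading the other) and collects the register results that SC allows. The names `THREADS`, `respects_program_order`, and `sc_outcomes` are hypothetical and chosen only for this example.

```python
from itertools import permutations

# Store-buffering litmus test; x and y are shared and start at 0.
#   Thread 0: x = 1; r0 = y
#   Thread 1: y = 1; r1 = x
# Each step is (operation, shared variable, destination register or None).
THREADS = [
    [("store", "x", None), ("load", "y", "r0")],
    [("store", "y", None), ("load", "x", "r1")],
]


def respects_program_order(order, n_threads):
    """True if each thread's steps appear in their original order."""
    last_seen = [-1] * n_threads
    for thread_id, step_index in order:
        if step_index < last_seen[thread_id]:
            return False
        last_seen[thread_id] = step_index
    return True


def sc_outcomes(threads):
    """Return the set of (r0, r1) results reachable under SC."""
    steps = [(t, i) for t, thread in enumerate(threads)
             for i in range(len(thread))]
    outcomes = set()
    for order in permutations(steps):
        if not respects_program_order(order, len(threads)):
            continue
        memory = {"x": 0, "y": 0}
        registers = {}
        # Execute the steps one at a time against a single shared memory,
        # which is exactly what sequential consistency prescribes.
        for thread_id, step_index in order:
            op, var, reg = threads[thread_id][step_index]
            if op == "store":
                memory[var] = 1
            else:
                registers[reg] = memory[var]
        outcomes.add((registers["r0"], registers["r1"]))
    return outcomes


if __name__ == "__main__":
    print(sorted(sc_outcomes(THREADS)))
```

The result `(0, 0)` is absent from the SC set, yet hardware with store buffers can produce it for this racy program, because each load may complete before the other thread's store becomes visible. Under the guarantee described in the abstract, a DRFx execution of such a program would either behave as one of the SC outcomes or be halted once the potential violation is detected.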

    • Published in

      ACM SIGPLAN Notices, Volume 46, Issue 3
      ASPLOS '11
      March 2011
      407 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/1961296
      • ACM Conferences
        ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
        March 2011
        432 pages
        ISBN:9781450302661
        DOI:10.1145/1950365

      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States
