skip to main content
research-article
Public Access
Distinguished Paper

Hybrid STM/HTM for nested transactions on OpenJDK

Published:19 October 2016Publication History
Skip Abstract Section

Abstract

Transactional memory (TM) has long been advocated as a promising pathway to more automated concurrency control for scaling concurrent programs running on parallel hardware. Software TM (STM) has the benefit of being able to run general transactional programs, but at the significant cost of overheads imposed to log memory accesses, mediate access conflicts, and maintain other transaction metadata. Recently, hardware manufacturers have begun to offer commodity hardware TM (HTM) support in their processors wherein the transaction metadata is maintained "for free" in hardware. However, HTM approaches are only best-effort: they cannot successfully run all transactional programs, whether because of hardware capacity issues (causing large transactions to fail), or compatibility restrictions on the processor instructions permitted within hardware transactions (causing transactions that execute those instructions to fail). In such cases, programs must include failure-handling code to attempt the computation by some other software means, since retrying the transaction would be futile. Thus, a canonical use of HTM is lock elision: replacing lock regions with transactions, retrying some number of times in the case of conflicts, but falling back to locking when HTM fails for other reasons.

Here, we describe how software and hardware schemes can combine seamlessly into a hybrid system in support of transactional programs, allowing use of low-cost HTM when it works, but reverting to STM when it doesn't. We describe heuristics used to make this choice dynamically and automatically, but allowing the transition back to HTM opportunistically. Our implementation is for an extension of Java having syntax for both open and closed nested transactions, and boosting, running on the OpenJDK, with dynamic injection of STM mechanisms (into code variants used under STM) and HTM instructions (into code variants used under HTM). Both schemes are compatible to allow different threads to run concurrently with either mechanism, while preserving transaction safety. Using a standard synthetic benchmark we demonstrate that HTM offers significant acceleration of both closed and open nested transactions, while yielding parallel scaling up to the limits of the hardware, whereupon scaling in software continues but with the penalty to throughput imposed by software mechanisms.

References

  1. K. Chapman, A. L. Hosking, J. E. B. Moss, and T. Richards. Closed and open nested atomic actions for Java: Language design and prototype implementation. In International Conference on Principles and Practice of Programming on the Java Platform: Virtual Machines, Languages, and Tools, pages 169–180, Cracow, Poland, Oct. 2014. doi: 10.1145/2647508. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. 2647525.Google ScholarGoogle Scholar
  3. T. Crain, V. Gramoli, and M. Raynal. A contention-friendly methodology for search structures. Research report, INRIA, Feb. 2012.Google ScholarGoogle Scholar
  4. T. Crain, V. Gramoli, and M. Raynal. A speculation-friendly binary search tree. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 161–170, New Orleans, Louisiana, Feb. 2012. doi: 10.1145/2145816. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. 2145837.Google ScholarGoogle Scholar
  6. L. Dalessandro, F. Carouge, S. White, Y. Lev, M. Moir, M. L. Scott, and M. F. Spear. Hybrid NOrec: A case study in the effectiveness of best effort hardware transactional memory. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 39–52, Newport Beach, California, Mar. 2011. doi: 10.1145/ 1950365.1950373. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. P. Damron, A. Fedorova, Y. Lev, V. Luchangco, M. Moir, and D. Nussbaum. Hybrid transactional memory. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 336–346, San Jose, California, Oct. 2006. doi: 10.1145/1168857.1168900. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. D. Dice, Y. Lev, M. Moir, and D. Nussbaum. Early experience with a commercial hardware transactional memory implementation. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 157–168, Washington, DC, Mar. 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. doi: 10.1145/1508244.1508263.Google ScholarGoogle Scholar
  10. D. Dice, A. Kogan, and Y. Lev. Refined transactional lock elision. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 19:1–19:12, Barcelona, Spain, Mar. 2016. doi: 10.1145/2851141.2851162. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. N. Diegues and P. Romano. Self-tuning Intel transactional synchronization extensions. In USENIX International Conference on Autonomic Computing, pages 209–219, Philadelphia, PA, June 2014.Google ScholarGoogle Scholar
  12. N. Diegues, P. Romano, and L. Rodrigues. Virtues and limitations of commodity hardware transactional memory. In International Conference on Parallel Architectures and Compilation Techniques, pages 3–14, Aug. 2014. doi: 10. 1145/2628071.2628080. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. F. Ellen, P. Fatourou, E. Ruppert, and F. van Breugel. Nonblocking binary search trees. In ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 131– 140, Zürich, Switzerland, July 2010. doi: 10.1145/1835698. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. 1835736.Google ScholarGoogle Scholar
  15. V. Gramoli. More than you ever wanted to know about synchronization: Synchrobench, measuring the impact of the synchronization on concurrent algorithms. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1–10, San Francisco, California, Feb. 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. doi: 10.1145/2688500.2688501.Google ScholarGoogle Scholar
  17. M. Herlihy and E. Koskinen. Transactional boosting: A methodology for highly-concurrent transactional objects. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 207–216, Salt Lake City, Utah, Feb. 2008. doi: 10.1145/1345206.1345237. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. M. P. Herlihy and J. M. Wing. Linearizability: A correctness criterion for concurrent objects. ACM Trans. Prog. Lang. Syst., 12(3):463–492, July 1990. doi: 10.1145/78969.78972. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Intel. Intel Transactional Synchronization Extensions (Intel TSX) Programming Considerations.Google ScholarGoogle Scholar
  20. C. Jacobi, T. Slegel, and D. Greiner. Transactional memory architecture and implementation for IBM system Z. In International Symposium on Microarchitecture, pages 25–36, Dec. 2012. doi: 10.1109/MICRO.2012.12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. Kasko, S. Kobylyanskiy, and A. Mironchenko. OpenJDK Cookbook. Packt Publishing, Jan. 2015. ISBN 1849698406.Google ScholarGoogle Scholar
  22. G. Korland, N. Shavit, and P. Felber. Noninvasive concurrency with Java STM. In Workshop on Programmability Issues for Heterogeneous Multicores, Pisa, Italy, Jan. 2010.Google ScholarGoogle Scholar
  23. S. Kumar, M. Chu, C. J. Hughes, P. Kundu, and A. Nguyen. Hybrid transactional memory. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 209–220, Mar. 2006. doi: 10.1145/1122971.1123003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Y. Lev, M. Moir, and D. Nussbaum. PhTM: Phased transactional memory. In ACM SIGPLAN Workshop on Transactional Computing, Aug. 2007.Google ScholarGoogle Scholar
  25. A. Matveev and N. Shavit. Reduced hardware NORec: A safe and scalable hybrid transactional memory. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 59–71, Istanbul, Turkey, Mar. 2015. doi: 10.1145/2694344.2694393. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. J. E. B. Moss. Nested transactions: an approach to reliable distributed computing. PhD thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, 1981.Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. J. E. B. Moss and A. L. Hosking. Nested transactional memory: model and architecture sketches. Science of Computer Programming, 63:186–201, Dec. 2006. doi: 10.1016/j.scico. 2006.05.010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Y. Ni, V. Menon, A.-R. Adl-Tabatabai, A. L. Hosking, R. L. Hudson, J. E. B. Moss, B. Saha, and T. Shpeisman. Open nesting in software transactional memory. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 68–78, San Jose, California, Mar. 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. doi: 10.1145/1229428.1229442.Google ScholarGoogle Scholar
  30. M. Paleczny, C. Vick, and C. Click. The Java Hotspot server compiler. In USENIX Java Virtual Machine Research and Technology Symposium, Monterey, California, Apr. 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. R. Rajwar and J. R. Goodman. Speculative lock elision: enabling highly concurrent multithreaded execution. In International Symposium on Microarchitecture, pages 294– 305, Austin, Texas, Dec. 2001. doi: 10.1109/MICRO.2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. 991127.Google ScholarGoogle Scholar
  33. T. Riegel, P. Marlier, M. Nowack, P. Felber, and C. Fetzer. Optimizing hybrid transactional memory: The importance of nonspeculative operations. In ACM Symposium on Parallelism in Algorithms and Architectures, pages 53–64, San Jose, California, June 2011. doi: 10.1145/1989493.1989501. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. R. M. Yoo, C. J. Hughes, K. Lai, and R. Rajwar. Performance evaluation of Intel transactional synchronization extensions for high-performance computing. In International Conference on High Performance Computing, Networking, Storage and Analysis, pages 19:1–19:11, Denver, Colorado, Nov. 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. doi: 10.1145/2503210.2503232.Google ScholarGoogle Scholar

Index Terms

  1. Hybrid STM/HTM for nested transactions on OpenJDK

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader
      About Cookies On This Site

      We use cookies to ensure that we give you the best experience on our website.

      Learn more

      Got it!