Abstract
Many scripting languages use a Global Interpreter Lock (GIL) to simplify the internal design of their interpreters, but this lock severely limits multi-threaded performance on multi-core machines. This paper presents our first results on eliminating the GIL in Ruby using Hardware Transactional Memory (HTM) on the IBM zEnterprise EC12 and Intel 4th Generation Core processors. Although prior prototypes have replaced a GIL with HTM, we tested realistic programs: the Ruby NAS Parallel Benchmarks (NPB), the WEBrick HTTP server, and Ruby on Rails. We devised a new technique that dynamically adjusts the transaction lengths on a per-bytecode basis, balancing the likelihood of transaction aborts against the relative overhead of the instructions that begin and end transactions. Our results show that HTM achieved 1.9- to 4.4-fold speedups over the GIL in the NPB programs with 12 threads, and 1.6- and 1.2-fold speedups in WEBrick and Ruby on Rails, respectively. The dynamic transaction-length adjustment chose the best transaction lengths for any number of threads and for applications with sufficiently long running times.
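The trade-off behind per-bytecode length adjustment can be illustrated with a small sketch. The abstract states only that transaction lengths are adjusted dynamically per bytecode; the class names, the initial length, the sampling window, the abort threshold, and the shrink factor below are all hypothetical values chosen for illustration, not the paper's actual parameters or implementation.

```python
from dataclasses import dataclass


@dataclass
class YieldPoint:
    """Hypothetical per-bytecode record for transactions that start here."""
    length: int       # bytecodes to run before ending the transaction
    starts: int = 0   # transactions begun in the current sampling window
    aborts: int = 0   # how many of those aborted


class LengthAdjuster:
    """Shrinks a yield point's transaction length when it aborts too often.

    All defaults are illustrative assumptions, not values from the paper.
    """

    def __init__(self, initial_length=255, window=300,
                 abort_threshold=0.01, shrink=0.75):
        self.initial_length = initial_length
        self.window = window
        self.abort_threshold = abort_threshold
        self.shrink = shrink
        self.points = {}  # bytecode address -> YieldPoint

    def _point(self, pc):
        return self.points.setdefault(pc, YieldPoint(self.initial_length))

    def length_for(self, pc):
        """Current transaction length for the bytecode at address pc."""
        return self._point(pc).length

    def record(self, pc, aborted):
        """Record one transaction outcome and adjust at the end of a window."""
        p = self._point(pc)
        p.starts += 1
        if aborted:
            p.aborts += 1
        if p.starts >= self.window:
            # Long transactions amortize begin/end overhead but abort more
            # often, so shorten only when the observed abort ratio is too high.
            if p.aborts / p.starts > self.abort_threshold and p.length > 1:
                p.length = max(1, int(p.length * self.shrink))
            p.starts = 0
            p.aborts = 0
```

In this sketch an interpreter would call `length_for(pc)` when it begins a transaction at a given bytecode and `record(pc, aborted)` when that transaction commits or aborts, so each bytecode converges on its own length independently of the others.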
Eliminating global interpreter locks in Ruby through hardware transactional memory
PPoPP '14: Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming