Abstract
Just-in-time (JIT) compilation of running programs provides more optimization opportunities than offline compilation. Modern JIT compilers, such as those in virtual machines like Oracle's HotSpot for Java or Google's V8 for JavaScript, rely on dynamic profiling as their key mechanism to guide optimizations. While these JIT compilers offer good average performance, their behavior is a black box and the achieved performance is highly unpredictable.
In this paper, we propose to turn JIT compilation into a precision tool by adding two essential and generic metaprogramming facilities: First, allow programs to invoke JIT compilation explicitly. This enables controlled specialization of arbitrary code at run-time, in the style of partial evaluation. It also enables the JIT compiler to report warnings and errors to the program when it is unable to compile a code path in the demanded way. Second, allow the JIT compiler to call back into the program to perform compile-time computation. This lets the program itself define the translation strategy for certain constructs on the fly and gives rise to a powerful JIT macro facility that enables "smart" libraries to supply domain-specific compiler optimizations or safety checks.
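The first facility, explicit specialization with feedback to the program, can be illustrated with a minimal sketch. It is not Lancet's API: `specialize_power` and `SpecializationError` are hypothetical names, and Python source generation stands in for JVM bytecode compilation. The program asks for code specialized to a value known at run time, and receives an error if the request cannot be honored rather than a silent fallback.

```python
# Hypothetical illustration of explicit run-time specialization.
# None of these names come from Lancet; Python source stands in for bytecode.


class SpecializationError(Exception):
    """Raised when code cannot be specialized in the demanded way."""


def specialize_power(n):
    """Return a function computing x**n with the loop over n fully unrolled.

    The exponent n is the static input (known now); x stays dynamic.
    """
    if not isinstance(n, int) or isinstance(n, bool) or n < 0:
        # Report the failure to the caller instead of silently falling back.
        raise SpecializationError(f"cannot unroll for exponent {n!r}")

    # Build the residual program: "1 * x * ... * x" with n factors of x.
    body = " * ".join(["1"] + ["x"] * n)
    source = f"def power_{n}(x):\n    return {body}\n"

    # Compile the generated source and pull out the resulting function.
    namespace = {}
    exec(compile(source, f"<specialized power_{n}>", "exec"), namespace)
    fn = namespace[f"power_{n}"]
    fn.source = source  # keep the generated code around for inspection
    return fn


cube = specialize_power(3)
```

Here `cube.source` holds the residual code, whose body is the straight-line expression `1 * x * x * x`. The exponent has been consumed at specialization time, which is the partial-evaluation behavior the paragraph above describes.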
We present Lancet, a JIT compiler framework for Java bytecode that enables such a tight, two-way integration with the running program. Lancet itself was derived from a high-level Java bytecode interpreter: staging the interpreter using LMS (Lightweight Modular Staging) produced a simple bytecode compiler. Adding abstract interpretation turned the simple compiler into an optimizing compiler. This fact provides compelling evidence for the scalability of the staged-interpreter approach to compiler construction.
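The step from interpreter to compiler can be sketched on a toy scale. The following is an assumption-laden illustration, not Lancet's implementation: the four-opcode stack bytecode and the names `interpret` and `stage` are invented here, and code fragments are plain strings, whereas LMS distinguishes staged from unstaged values through Scala types. The point is only that the staged version keeps the interpreter's structure while its stack holds code instead of values.

```python
# Toy stack bytecode (hypothetical): ("push", k), ("load", name), ("add",), ("mul",)
PROGRAM = [("load", "x"), ("push", 2), ("mul",), ("load", "y"), ("add",)]


def interpret(program, env):
    """Plain interpreter: the stack holds run-time values."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "load":
            stack.append(env[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop()


def stage(program, params):
    """Staged interpreter: same loop, but the stack holds code fragments.

    Running it once over the bytecode emits a residual Python function,
    so opcode dispatch no longer appears in the generated code.
    """
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(repr(args[0]))
        elif op == "load":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} + {b})")
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} * {b})")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    source = f"def compiled({', '.join(params)}):\n    return {stack.pop()}\n"
    namespace = {}
    exec(source, namespace)
    return namespace["compiled"], source


compiled, source = stage(PROGRAM, ["x", "y"])
```

Both paths agree on results: `interpret(PROGRAM, {"x": 3, "y": 4})` and `compiled(3, 4)` compute the same value, but the second runs the single expression `((x * 2) + y)` with no interpretive loop. An optimizing variant would track more about each stack entry than a code string, for example a known constant, which is roughly the role abstract interpretation plays in the paragraph above.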
In the case of Lancet, JIT macros also provide a natural interface to existing LMS-based toolchains such as the Delite parallelism and DSL framework, which can now serve as accelerator macros for arbitrary JVM bytecode.
Surgical precision JIT compilers
PLDI '14: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation