Abstract
High-level data structures are a cornerstone of modern programming, yet they stand in the way of compiler optimizations. To reason about user- or library-defined data structures, compilers need to be extensible. Common mechanisms for extending compilers fall into two categories. Frontend macros, staging, or partial evaluation systems can be used to programmatically remove abstraction and specialize programs before they enter the compiler. Alternatively, some compilers allow their internal workings to be extended by adding new transformation passes at different points in the compile chain or by adding new intermediate representation (IR) types. None of these mechanisms alone is sufficient to handle the challenges posed by high-level data structures. This paper shows a novel way to combine them to yield benefits that are greater than the sum of the parts.
Instead of using staging merely as a front end, we implement internal compiler passes using staging as well. These internal passes delegate back to program execution to construct the transformed IR. Staging is known to simplify program generation, and in the same way it can simplify program transformation. Defining a transformation as a staged IR interpreter is simpler than implementing a low-level IR-to-IR transformer. With custom IR nodes, many optimizations that are expressed as rewritings from IR nodes to staged program fragments can be combined into a single pass, mitigating phase ordering problems. Speculative rewriting can preserve optimistic assumptions around loops.
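To make the idea of a transformation written as a staged IR interpreter more concrete, here is a minimal sketch. It is not the paper's implementation: the node types `Const`, `Var`, `Add`, `Mul` and the functions `add`, `mul`, `transform` are hypothetical names chosen for illustration. The interpreter walks the IR, but its operations build new IR instead of computing values, so several rewrites (constant folding, identity and zero elimination) take effect in one traversal.

```python
from dataclasses import dataclass

# Hypothetical miniature IR; these names are illustrative assumptions.
@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object

@dataclass(frozen=True)
class Mul:
    lhs: object
    rhs: object

# Staged operations: each one returns IR rather than a value, and applies
# local rewrites before falling back to constructing a plain node.
def add(a, b):
    if isinstance(a, Const) and isinstance(b, Const):
        return Const(a.value + b.value)   # constant folding
    if a == Const(0):
        return b                          # 0 + b  ->  b
    if b == Const(0):
        return a                          # a + 0  ->  a
    return Add(a, b)

def mul(a, b):
    if isinstance(a, Const) and isinstance(b, Const):
        return Const(a.value * b.value)   # constant folding
    if a == Const(0) or b == Const(0):
        return Const(0)                   # integer multiplication by zero
    if a == Const(1):
        return b                          # 1 * b  ->  b
    if b == Const(1):
        return a                          # a * 1  ->  a
    return Mul(a, b)

def transform(node):
    """Interpret the IR using the staged operations, which rebuilds the IR."""
    if isinstance(node, Add):
        return add(transform(node.lhs), transform(node.rhs))
    if isinstance(node, Mul):
        return mul(transform(node.lhs), transform(node.rhs))
    return node  # Const and Var are leaves and stay as they are

expr = Add(Mul(Var("x"), Const(1)), Mul(Const(2), Const(3)))
optimized = transform(expr)
```

Because every rewrite lives inside a staged operation, adding a new optimization in this sketch means adding a case to `add` or `mul`, not writing another traversal.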
We demonstrate several powerful program optimizations using this architecture that are particularly geared towards data structures: a novel loop fusion and deforestation algorithm, array-of-struct to struct-of-array conversion, object flattening, and code generation for heterogeneous parallel devices. We validate our approach using several non-trivial case studies that exhibit order-of-magnitude speedups in experiments.
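As an illustration of what array-of-struct to struct-of-array conversion means at the data level, the sketch below converts a list of records into one list per field. It only shows the layout change: the paper performs the conversion as a compiler transformation, whereas this example does it at run time, and the helper `aos_to_soa` and the sample `points` data are assumptions made up for the example.

```python
def aos_to_soa(records, fields):
    """Turn a list of records into a dict holding one list per field."""
    return {f: [r[f] for r in records] for f in fields}

# Array of structs: each element carries all of its fields.
points = [
    {"x": 1.0, "y": 2.0},
    {"x": 3.0, "y": 4.0},
    {"x": 5.0, "y": 6.0},
]

# Struct of arrays: each field is stored contiguously on its own.
soa = aos_to_soa(points, ("x", "y"))

# A computation that touches only one field reads a single array
# in the struct-of-arrays layout.
sum_x_aos = sum(p["x"] for p in points)
sum_x_soa = sum(soa["x"])
```

Both sums give the same result; the difference is that the second reads only the `x` data, which is the property that makes this layout attractive for vectorized or parallel code.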
Optimizing data structures in high-level programs: new directions for extensible compilers based on staging
POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages