Abstract
Inlining is an optimization that replaces a call to a function with that function's body. This optimization not only reduces the overhead of a function call but can also expose additional optimization opportunities to the compiler, such as removing redundant operations or unused conditional branches. Another optimization, copy propagation, replaces a redundant copy of a still-live variable with the original. Copy propagation can reduce the total number of live variables, which lowers register pressure and memory usage and may eliminate redundant memory-to-memory copies. In practice, both of these optimizations are implemented in nearly every modern compiler.
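As a small illustration of the two first-order optimizations described above, the following Python sketch applies them by hand to a toy function. The function names (`square`, `before`, `after_inlining`, `after_copy_propagation`) are hypothetical and chosen only for this example; a real compiler performs these rewrites on its intermediate representation rather than on source text.

```python
def square(n):
    return n * n


def before(a):
    b = a                    # redundant copy of the still-live variable a
    return square(b) + b     # call to a small, known function


def after_inlining(a):
    b = a
    return (b * b) + b       # the call to square is replaced by its body


def after_copy_propagation(a):
    # Every use of b is replaced by a, so the copy b is no longer needed.
    return (a * a) + a


# All three versions compute the same result.
print(before(3), after_inlining(3), after_copy_propagation(3))
```

Each step preserves the meaning of the program while removing a call or a live variable, which is the effect the abstract attributes to these optimizations.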
These two optimizations are practical to implement and effective in first-order languages, but in languages with lexically scoped first-class functions (i.e., closures), they are not available to code written in a higher-order style. With higher-order functions, the analysis challenge has been that the environment at the call site must be the same as the environment at the closure capture location, up to the free variables of the function, or the meaning of the program may change. Olin Shivers' 1991 dissertation called this family of optimizations Super-β and proposed one analysis technique, called reflow, to support them. Unfortunately, reflow has proven too expensive to implement in practice. Because these higher-order optimizations are not available in functional-language compilers, programmers studiously avoid uses of higher-order values that cannot be optimized (particularly in compiler benchmarks).
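The environment condition above can be made concrete with a small Python sketch. It is only an illustration of why naive inlining of a closure body can change a program's meaning, not the paper's analysis; all names (`make_adder`, `call_site`, `call_site_naively_inlined`, `safe_case`, `safe_case_inlined`) are hypothetical.

```python
def make_adder(x):
    # The returned closure captures this binding of x.
    return lambda y: x + y


def call_site(f, x):
    # Here x is a different binding from the one f captured.
    return f(10)


def call_site_naively_inlined(f, x):
    # Unsafe rewrite: the closure body `x + y` is pasted in with y = 10,
    # so x now refers to the call site's own x, not the captured one.
    return x + 10


def safe_case(x):
    f = lambda y: x + y
    return f(10)             # x here is the same binding the closure captured


def safe_case_inlined(x):
    return x + 10            # inlining is meaning-preserving in this case


add1 = make_adder(1)
original = call_site(add1, 100)               # 1 + 10 = 11
naive = call_site_naively_inlined(add1, 100)  # 100 + 10 = 110
print(original, naive, safe_case(1), safe_case_inlined(1))
```

In the unsafe case the free variable `x` has a different value at the call site than at the capture point, so the rewrite changes the result. In the safe case the two environments agree on `x`, which is the situation an analysis must establish before performing the optimization.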
This paper provides the first practical and effective technique for Super-β (higher-order) inlining and copy propagation, which we call unchanged variable analysis. We show that this technique is practical by implementing it in a real compiler for an ML-family language and showing that the required analyses cost less than 3% of the total compilation time. We show the technique's effectiveness through a set of benchmarks and example programs, in which the analysis exposes additional potential optimization sites.
- Ashley, J. M. and R. K. Dybvig. A practical and flexible flow analysis for higher-order languages. ACM TOPLAS, 20(4), July 1998, pp. 845--868.
- Appel, A. W. and D. B. MacQueen. Standard ML of New Jersey. In PLILP '91, vol. 528 of LNCS. Springer-Verlag, New York, NY, August 1991, pp. 1--26.
- Barber, C. B., D. P. Dobkin, and H. Huhdanpaa. The quickhull algorithm for convex hulls. ACM TOMS, 22(4), 1996, pp. 469--483.
- Bergstrom, L. Arity raising and control-flow analysis in Manticore. Master's dissertation, University of Chicago, November 2009. Available from http://manticore.cs.uchicago.edu.
- Barnes, J. and P. Hut. A hierarchical O(N log N) force calculation algorithm. Nature, 324, December 1986, pp. 446--449.
- Cejtin, H., S. Jagannathan, and S. Weeks. Flow-directed closure conversion for typed languages. In ESOP '00. Springer-Verlag, 2000, pp. 56--71.
- CLBG. The computer language benchmarks game, 2013. Available from http://benchmarksgame.alioth.debian.org/.
- Danvy, O. and A. Filinski. Representing control: A study of the CPS transformation. MSCS, 2(4), 1992, pp. 361--391.
- Fluet, M., N. Ford, M. Rainey, J. Reppy, A. Shaw, and Y. Xiao. Status Report: The Manticore Project. In ML '07. ACM, October 2007, pp. 15--24.
- Fluet, M., M. Rainey, J. Reppy, and A. Shaw. Implicitly-threaded parallelism in Manticore. JFP, 20(5-6), 2011, pp. 537--576.
- George, L., F. Guillame, and J. Reppy. A portable and optimizing back end for the SML/NJ compiler. In CC '94, April 1994, pp. 83--97.
- GHC. Barnes Hut benchmark written in Haskell. Available from http://darcs.haskell.org/packages/ndp/examples/barnesHut/.
- Hinze, R. Constructing red-black trees. In WAAAPL'99: Workshop on Algorithmic Aspects of Advanced Programming Languages, Paris, France, 1999, pp. 89--99.
- Hudak, P. A semantic model of reference counting and its abstraction (detailed summary). In LFP '86, Cambridge, Massachusetts, USA, 1986. ACM, pp. 351--363.
- Midtgaard, J. Control-flow analysis of functional programs. ACM Comp. Surveys, 44(3), June 2012, pp. 10:1--10:33.
- Might, M. Shape analysis in the absence of pointers and structure. In VMCAI '10, Madrid, Spain, 2010. Springer-Verlag, pp. 263--278.
- Might, M. and O. Shivers. Environment analysis via ΔCFA. In POPL '06, Charleston, South Carolina, USA, 2006. ACM, pp. 127--140.
- Nikhil, R. S. ID Language Reference Manual. Laboratory for Computer Science, MIT, Cambridge, MA, July 1991.
- Nielson, F., H. R. Nielson, and C. Hankin. Principles of Program Analysis. Springer-Verlag, New York, NY, 1999.
- Nuutila, E. An efficient transitive closure algorithm for cyclic digraphs. IPL, 52, 1994.
- Peyton Jones, S. and S. Marlow. Secrets of the Glasgow Haskell Compiler inliner. JFP, 12(5), July 2002.
- Reps, T., S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In POPL '95, San Francisco, 1995. ACM.
- Reppy, J. and Y. Xiao. Specialization of CML message-passing primitives. In POPL '07. ACM, January 2007, pp. 315--326.
- Scandal Project. A library of parallel algorithms written in NESL. Available from http://www.cs.cmu.edu/~scandal/nesl/algorithms.html.
- Serrano, M. Control flow analysis: a functional languages compilation paradigm. In SAC '95, Nashville, Tennessee, United States, 1995. ACM, pp. 118--122.
- Shivers, O. Control-flow analysis of higher-order languages. Ph.D. dissertation, School of C.S., CMU, Pittsburgh, PA, May 1991.
- Shivers, O. and M. Might. Continuations and transducer composition. In PLDI '06, Ottawa, Ontario, Canada, 2006. ACM, pp. 295--307.
- Tarjan, R. Depth-first search and linear graph algorithms. SIAM JC, 1(2), 1972, pp. 146--160.
- Warshall, S. A theorem on boolean matrices. JACM, 9(1), January 1962.
- Waddell, O. and R. K. Dybvig. Fast and effective procedure inlining. In SAS '97, LNCS. Springer-Verlag, 1997, pp. 35--52.
- Weeks, S. Whole program compilation in MLton. Invited talk at ML '06 Workshop, September 2006.
Practical and effective higher-order optimizations
ICFP '14: Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming