
A graph-based iterative compiler pass selection and phase ordering approach

Published: 13 June 2016

Abstract

Modern compilers include tens or hundreds of optimization passes, which makes it difficult to find optimization sequences that produce better code than typical compiler options such as -O2 and -O3. The problem involves both the selection of the compiler passes to use and their ordering in the compilation pipeline. The improvement achieved by using a custom phase order for each function can be significant, and can therefore be important for meeting strict requirements such as those of high-performance embedded computing systems. In this paper we present a new, fast iterative approach to the phase selection and ordering problems that yields compiled code with higher performance than that achieved with the standard optimization levels of the LLVM compiler. The performance improvements obtained are comparable to those achieved by other iterative approaches, while requiring considerably less time and fewer resources. Our approach is based on sampling over a graph representing transitions between compiler passes. We performed a number of experiments targeting the LEON3 microarchitecture using the Clang/LLVM 3.7 compiler, considering 140 LLVM passes and a set of 42 representative signal and image processing C functions. An exhaustive cross-validation shows that our new exploration method achieves a geometric mean performance speedup of 1.28x over the best individually selected -OX flag with 100,000 iterations, versus geometric mean speedups of 1.16x to 1.25x obtained with state-of-the-art iterative methods that do not use the graph. Of the exploration methods tested, our new method is the only one that consistently finds compiler sequences that improve performance when limited to 100 or fewer exploration iterations. Specifically, it achieved geometric mean speedups of 1.08x and 1.16x for 10 and 100 iterations, respectively.
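
The abstract does not say how the graph is built or sampled, so the following is only a minimal sketch of the general idea, under stated assumptions. Nodes are compiler passes, weighted edges record how plausible it is for one pass to follow another, and a candidate sequence is drawn by a weighted random walk. The pass names are ordinary LLVM pass flags, but the edge weights, the `START`/`END` markers, the function names and the `toy_cost` function are all hypothetical. In a real setup the cost would come from measuring the compiled function, for example a cycle count on the target.

```python
import math
import random

# Hypothetical markers for where a sequence begins and ends.
START = "<start>"
END = "<end>"

# Hypothetical transition graph: GRAPH[a][b] is the weight for placing
# pass b directly after pass a. The weights are invented for illustration
# and do not come from the paper.
GRAPH = {
    START: {"-mem2reg": 5, "-instcombine": 3, "-licm": 1},
    "-mem2reg": {"-instcombine": 4, "-licm": 2, END: 1},
    "-instcombine": {"-licm": 3, "-gvn": 3, END: 1},
    "-licm": {"-gvn": 2, "-instcombine": 2, END: 1},
    "-gvn": {"-instcombine": 1, END: 2},
}


def sample_sequence(graph, rng, max_len=8):
    """Weighted random walk from START; stops at END or after max_len passes."""
    sequence = []
    node = START
    while len(sequence) < max_len:
        successors = list(graph[node].keys())
        weights = list(graph[node].values())
        node = rng.choices(successors, weights=weights, k=1)[0]
        if node == END:
            break
        sequence.append(node)
    return sequence


def explore(graph, evaluate, iterations, seed=0):
    """Sample `iterations` sequences and keep the one with the lowest cost."""
    rng = random.Random(seed)
    best_seq, best_cost = None, math.inf
    for _ in range(iterations):
        seq = sample_sequence(graph, rng)
        cost = evaluate(seq)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def toy_cost(seq):
    """Stand-in cost (not a real measurement): rewards distinct passes."""
    return 100.0 - 5.0 * len(set(seq))


if __name__ == "__main__":
    seq, cost = explore(GRAPH, toy_cost, iterations=100)
    print("best sequence:", " ".join(seq), "cost:", cost)
    # Per-function speedup = baseline cost / best cost; these baseline
    # numbers are made up purely to show the aggregation step.
    speedups = [100.0 / cost, 90.0 / cost]
    print("geometric mean speedup:", geometric_mean(speedups))
```

The fixed seed makes a run reproducible, and `max_len` bounds the walk so that cycles in the graph cannot produce unbounded sequences. The geometric mean is used for aggregation because the abstract reports its results as geometric mean speedups across functions.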



    • Published in

      ACM SIGPLAN Notices, Volume 51, Issue 5
      LCTES '16
      May 2016
      122 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2980930
      • Editor: Andy Gill
      • ACM Conferences
        LCTES 2016: Proceedings of the 17th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools, and Theory for Embedded Systems
        June 2016
        122 pages
        ISBN: 9781450343169
        DOI: 10.1145/2907950

      Copyright © 2016 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Qualifiers

      • article
