research-article

Polyhedra scanning revisited

Published: 11 June 2012
Abstract

This paper presents a new polyhedra scanning system called CodeGen+ to address the challenge of generating high-performance code for complex iteration spaces resulting from compiler optimization and autotuning systems. The strength of our approach lies in two new algorithms. First, a loop overhead removal algorithm provides precise control of trade-offs between loop overhead and code size based on actual loop nesting depth. Second, an if-statement simplification algorithm further reduces the number of comparisons in the code. These algorithms combined with the expressive power of Presburger arithmetic enable CodeGen+ to support complex optimization strategies expressed in iteration spaces. We compare with the state-of-the-art polyhedra scanning tool CLooG on five loop nest computations, demonstrating that CodeGen+ generates code that is simpler and up to 1.15x faster.
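The trade-off between loop overhead and code size mentioned in the abstract can be illustrated with a small hand-written sketch. This is not CodeGen+'s algorithm or its output. The function names, the statement labels `s1` and `s2`, and the bounds `n`, `lo`, and `hi` are all hypothetical. Both functions visit the same statement instances in the same order: the first keeps the code compact but evaluates a guard on every iteration, while the second removes the guard by splitting the loop, at the cost of more code.

```python
def scan_guarded(n, lo, hi):
    """Compact form: one loop, with an if-guard evaluated on every iteration."""
    trace = []
    for i in range(n):
        trace.append(("s1", i))
        if lo <= i < hi:  # guard checked n times, whether or not it holds
            trace.append(("s2", i))
    return trace


def scan_split(n, lo, hi):
    """Guard-free form: the loop is split at the guard's bounds (more code)."""
    # Clamp the split points into [0, n] so the three loops tile the range.
    lo_c = max(0, min(lo, n))
    hi_c = max(lo_c, min(hi, n))
    trace = []
    for i in range(0, lo_c):       # only s1 is active here
        trace.append(("s1", i))
    for i in range(lo_c, hi_c):    # both statements active, no guard needed
        trace.append(("s1", i))
        trace.append(("s2", i))
    for i in range(hi_c, n):       # only s1 is active again
        trace.append(("s1", i))
    return trace


if __name__ == "__main__":
    # Both scans execute the same statement instances in the same order.
    print(scan_guarded(10, 3, 7) == scan_split(10, 3, 7))
```

In a deeper loop nest the same choice arises at each level, so splitting at every level can multiply code size. The abstract describes controlling this choice according to loop nesting depth.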

Published in

ACM SIGPLAN Notices, Volume 47, Issue 6 (PLDI '12)
June 2012, 534 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2345156

PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2012, 572 pages
ISBN: 9781450312059
DOI: 10.1145/2254064

Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
