Polyhedral AST Generation Is More Than Scanning Polyhedra

Published: 15 July 2015

Abstract

Abstract mathematical representations such as integer polyhedra have been shown to be useful for precisely analyzing computational kernels and for expressing complex loop transformations. Such transformations rely on abstract syntax tree (AST) generators to convert the mathematical representation back to an imperative program. Generic AST generators avoid the need for transformation-specific code generators, which may be very costly or technically difficult to develop as transformations become more complex. Existing AST generators have proven their effectiveness, but they hit limitations in more complex scenarios. Specifically, (1) they do not support, or may fail to generate, control flow for complex transformations using piecewise schedules or mappings involving modulo arithmetic; (2) they offer limited support for specializing the generated code to expose the compact, straight-line, vectorizable kernels with high arithmetic intensity that are necessary to exploit the peak performance of modern hardware; (3) they offer no support for memory layout transformations; and (4) they provide insufficient control over the AST generation strategy, preventing their application to complex domain-specific optimizations.

We present a new AST generation approach that extends classical polyhedral scanning to the full generality of Presburger arithmetic, including existentially quantified variables and piecewise schedules, and introduce new optimizations for the detection of components and shifted strides. Not limiting ourselves to control flow generation, we expose functionality to generate AST expressions from arbitrary piecewise quasi-affine expressions, which enables the use of our AST generator for data-layout transformations. We complement this with support for specialization by polyhedral unrolling, user-directed versioning, and specialization of AST expressions according to the location at which they are generated, and we complete this work with fine-grained user control over the AST generation strategies used. Using this generalized idea of AST generation, we show how to implement complex domain-specific transformations without writing specialized code generators, relying instead on a generic AST generator parametrized to a specific problem domain.
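To make the notion of a shifted stride concrete, the following minimal sketch scans the integer set { [i] : 0 <= i < n and exists e : i = 4e + 1 } in two ways. The set, the function names `scan_naive` and `scan_strided`, and the use of Python are illustrative assumptions and are not taken from the paper; the sketch only shows the kind of loop an AST generator could emit once it detects the stride 4 and the offset 1, not the algorithm that performs the detection.

```python
def scan_naive(n):
    """Visit every i in [0, n) and test the existential constraint
    i = 4e + 1 as a guard inside the loop body."""
    visited = []
    for i in range(n):
        if (i - 1) % 4 == 0:  # holds exactly when some integer e gives i = 4e + 1
            visited.append(i)
    return visited


def scan_strided(n):
    """Visit the same points with the stride (4) and the shift (1)
    folded into the loop header, so no guard is needed."""
    visited = []
    for i in range(1, n, 4):
        visited.append(i)
    return visited


if __name__ == "__main__":
    # Both loops enumerate the same points in the same order.
    print(scan_naive(10))
    print(scan_strided(10))
```

Both functions enumerate the same points in the same order; the strided form simply avoids evaluating a modulo guard on every iteration, which is the sort of compact control flow the abstract refers to.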

