Abstract
Abstract mathematical representations such as integer polyhedra have proven useful for precisely analyzing computational kernels and for expressing complex loop transformations. These transformations rely on abstract syntax tree (AST) generators to convert the mathematical representation back into an imperative program. Generic AST generators remove the need for transformation-specific code generators, which become costly or technically difficult to develop as transformations grow more complex. Existing AST generators have proven effective, but they hit limitations in more complex scenarios. Specifically, (1) they do not support, or may fail to generate, control flow for complex transformations that use piecewise schedules or mappings involving modulo arithmetic; (2) they offer limited support for specializing the generated code to expose the compact, straight-line, vectorizable kernels with high arithmetic intensity that are necessary to exploit the peak performance of modern hardware; (3) they offer no support for memory layout transformations; and (4) they provide insufficient control over the AST generation strategy, preventing their application to complex domain-specific optimizations.
We present a new AST generation approach that extends classical polyhedral scanning to the full generality of Presburger arithmetic, including existentially quantified variables and piecewise schedules, and we introduce new optimizations for the detection of components and shifted strides. Beyond control flow generation, we expose functionality to generate AST expressions from arbitrary piecewise quasi-affine expressions, which makes our AST generator usable for data-layout transformations. We complement this with support for specialization through polyhedral unrolling, user-directed versioning, and specialization of AST expressions according to the location at which they are generated, and we provide fine-grained user control over the AST generation strategies used. Using this generalized notion of AST generation, we show how to implement complex domain-specific transformations without writing specialized code generators, relying instead on a generic AST generator parametrized for a specific problem domain.
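To make the terms "shifted stride" and "polyhedral unrolling" concrete, the following is a minimal sketch and not the algorithm of the article or the API of any library. It assumes a one-dimensional set {i : lo <= i <= hi, i mod stride = residue} with constant bounds, and the function names `first_in_stride` and `emit_strided_loop` are hypothetical. A naive generator would scan every i in [lo, hi] and guard the statement with a modulo test; the sketch instead shifts the lower bound to the first matching point and steps by the stride, or emits straight-line code when unrolling is requested.

```python
def first_in_stride(lo: int, stride: int, residue: int) -> int:
    """Return the smallest i >= lo with i mod stride == residue mod stride.

    Assumes stride > 0, so Python's % yields a value in [0, stride).
    """
    return lo + (residue - lo) % stride


def emit_strided_loop(lo, hi, stride, residue, body, unroll=False):
    """Return C-like source lines scanning {i : lo <= i <= hi, i mod stride = residue}.

    `body` is a template such as "S({i});" in which {i} is replaced by the
    iterator name, or by a constant value when unrolling.
    """
    start = first_in_stride(lo, stride, residue)
    if unroll:
        # Unrolling: bounds are constants, so enumerate the points and
        # emit one specialized statement per point (straight-line code).
        return [body.format(i=i) for i in range(start, hi + 1, stride)]
    # Shifted stride: no modulo guard is needed inside the loop body.
    return [
        f"for (int i = {start}; i <= {hi}; i += {stride})",
        "  " + body.format(i="i"),
    ]


if __name__ == "__main__":
    print("\n".join(emit_strided_loop(3, 20, 4, 1, "S({i});")))
    print("\n".join(emit_strided_loop(3, 14, 4, 1, "S({i});", unroll=True)))
```

In this toy setting the loop form keeps the code compact, while the unrolled form trades code size for statements whose index is a known constant, which is the kind of specialization the abstract associates with straight-line, vectorizable kernels.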
Polyhedral AST Generation Is More Than Scanning Polyhedra