Abstract
Polyhedral compilation has been successful in the design and implementation of complex loop nest optimizers and parallelizing compilers. Its algorithmic complexity and limited scalability remain an important weakness. We address this weakness using sub-polyhedral under-approximations of the systems of constraints resulting from affine scheduling problems. We propose a sub-polyhedral scheduling technique using (Unit-)Two-Variable-Per-Inequality, or (U)TVPI, polyhedra. This technique relies on simple polynomial-time algorithms to under-approximate a general polyhedron into (U)TVPI polyhedra. We modify the state-of-the-art PLuTo compiler to use our scheduling technique, and show that for a majority of the Polybench (2.0) kernels, these under-approximations yield non-empty polyhedra. Solving the under-approximated system leads to asymptotic gains in complexity, and shows practically significant improvements when compared to a traditional LP solver. We also verify that code generated by our sub-polyhedral parallelization prototype matches the performance of PLuTo-optimized code when the under-approximation preserves feasibility.
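To illustrate why restricting constraints to UTVPI form can pay off, the following minimal sketch checks rational feasibility of a UTVPI system with Bellman-Ford negative-cycle detection instead of a general LP solver. It is not the paper's implementation: the function name `utvpi_feasible` and the tuple encoding `(a, i, b, j, c)` are assumptions made for this example, and the reduction is the standard one in which each variable is split into a positive and a negative node. Integer feasibility would need an extra tightening step that is omitted here.

```python
def utvpi_feasible(num_vars, constraints):
    """Check rational feasibility of a UTVPI system (illustrative sketch).

    Each constraint is a tuple (a, i, b, j, c) meaning a*x_i + b*x_j <= c,
    with a in {-1, +1} and b in {-1, 0, +1}; b == 0 gives the unary
    constraint a*x_i <= c (j is then ignored).
    """
    def node(sign, var):
        # Node 2*var stands for +x_var, node 2*var + 1 stands for -x_var.
        return 2 * var if sign > 0 else 2 * var + 1

    # An edge (src, dst, w) encodes the difference constraint dst - src <= w.
    edges = []
    for a, i, b, j, c in constraints:
        if b == 0:
            # a*x_i <= c  is  (a*x_i) - (-a*x_i) <= 2c
            edges.append((node(-a, i), node(a, i), 2 * c))
        else:
            # a*x_i + b*x_j <= c  is  (a*x_i) - (-b*x_j) <= c,
            # and symmetrically      (b*x_j) - (-a*x_i) <= c
            edges.append((node(-b, j), node(a, i), c))
            edges.append((node(-a, i), node(b, j), c))

    n = 2 * num_vars
    dist = [0] * n  # equivalent to a virtual source with 0-weight edges
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True   # distances stabilised: no negative cycle
    return False          # still relaxing after n rounds: negative cycle


# x0 + x1 <= 2, -x0 <= 0, -x1 <= 0
print(utvpi_feasible(2, [(1, 0, 1, 1, 2), (-1, 0, 0, 0, 0), (-1, 1, 0, 0, 0)]))
```

The check runs in time proportional to the number of variables times the number of constraints, which is the kind of combinatorial bound that a general LP formulation does not offer.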
Sub-polyhedral scheduling using (unit-)two-variable-per-inequality polyhedra
POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages