research-article

Logical inference techniques for loop parallelization

Published: 11 June 2012

Abstract

This paper presents a fully automatic approach to loop parallelization that integrates static and run-time analysis, and thus overcomes many known difficulties such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop's memory references. To this end it represents array references in the USR (uniform set representation) language and expresses the independence condition as an equation, S = 0, where S is a set expression representing array indexes. Using a language rather than an array-abstraction representation for S yields fewer conservative approximations, but at a potentially high run-time cost. To alleviate this cost we introduce a language translation F from the USR set-expression language to an equally rich language of predicates (F(S) ==> S = 0). Loop parallelization is then validated using a novel logic inference algorithm that factorizes the resulting complex predicates (F(S)) into a sequence of sufficient independence conditions, which are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from the PERFECT-Club and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full-program speedups than the Intel and IBM Fortran compilers.
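The idea of factoring an independence condition into a cascade of sufficient tests, evaluated in increasing order of cost, can be sketched as follows. This is an illustrative sketch only: the function names and the two predicates below are hypothetical stand-ins, not the paper's actual USR-based tests.

```python
# Illustrative sketch (not the paper's implementation): factor the
# independence condition S = 0 into a cascade of sufficient tests,
# ordered by estimated cost. The first test that succeeds proves
# that the loop's read and write index sets are independent.

def intervals_disjoint(reads, writes):
    # Cheap, "static-style" test: only the bounds of the two index
    # sets are compared, so interleaved accesses are not proven here.
    return max(reads) < min(writes) or max(writes) < min(reads)

def sets_disjoint(reads, writes):
    # Expensive, exact fallback standing in for a run-time check:
    # materialize both index sets and intersect them.
    return not (set(reads) & set(writes))

def loop_is_parallel(reads, writes):
    # Evaluate sufficient conditions in increasing order of cost;
    # failure of a cheap test is inconclusive, so fall through.
    for test in (intervals_disjoint, sets_disjoint):
        if test(reads, writes):
            return True
    # No condition proved independence: parallelization not validated.
    return False
```

For example, `loop_is_parallel([1, 3, 5], [2, 4])` succeeds only via the exact fallback, because the index ranges overlap and the cheap bounds test is inconclusive.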



Published in

ACM SIGPLAN Notices, Volume 47, Issue 6 (PLDI '12), June 2012, 534 pages. ISSN: 0362-1340, EISSN: 1558-1160. DOI: 10.1145/2345156

PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2012, 572 pages. ISBN: 9781450312059. DOI: 10.1145/2254064

Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
