Loop-oriented array- and field-sensitive pointer analysis for automatic SIMD vectorization

Published: 13 June 2016

Abstract

Compiler-based auto-vectorization is a promising way to automatically generate code that makes efficient use of SIMD processors in high-performance platforms and embedded systems. The two main auto-vectorization techniques, superword-level parallelism vectorization (SLP) and loop-level vectorization (LLV), require precise dependence analysis on arrays and structs in order to vectorize isomorphic scalar instructions and/or reduce the dynamic dependence checks incurred at runtime. The alias analyses used in modern vectorizing compilers are either intra-procedural (and so do not track inter-procedural data flows) or inter-procedural but based on field-insensitive models, which are too imprecise in handling arrays and structs. This paper proposes an inter-procedural Loop-oriented Pointer Analysis, called LPA, for analyzing arrays and structs to support aggressive SLP and LLV optimizations. Unlike field-insensitive solutions that pre-allocate objects for each memory allocation site, our approach uses a fine-grained memory model to generate location sets based on how structs and arrays are accessed. LPA can precisely analyze arrays and nested aggregate structures to enable SIMD optimizations for large programs. Because location set generation is separated as an independent concern from the rest of the pointer analysis, LPA can easily reuse existing points-to resolution algorithms. We evaluate LPA with SLP and LLV, the two classic vectorization techniques, on a set of 20 SPEC CPU2000/2006 benchmarks. For SLP, LPA enables the vectorization of a total of 133 more basic blocks, with an average of 12.09 per benchmark, resulting in a best speedup of 2.95% for 173.applu. For LLV, LPA reduces the number of static bound checks by a total of 319, with an average of 22.79 per benchmark, resulting in a best speedup of 7.18% for 177.mesa.
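The kind of code the abstract has in mind can be illustrated with a small C sketch. It is not taken from the paper: the `Particle` struct and the functions `scale`, `step`, `update`, and `demo` are hypothetical names chosen for illustration. The sketch shows an LLV candidate loop and an SLP candidate block whose pointer arguments refer to two different fields of the same object, which is the situation where a field-insensitive, or purely intra-procedural, alias analysis would likely have to assume possible overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical aggregate: two float arrays nested in one struct. */
struct Particle {
    float pos[4];
    float vel[4];
};

/* LLV candidate. Seen in isolation, dst and src may overlap, so a
 * vectorizing compiler typically guards the vector loop with a
 * runtime overlap check or leaves the loop scalar. */
void scale(float *dst, const float *src, size_t n, float k) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* SLP candidate: four isomorphic scalar statements on adjacent
 * elements. Packing them into one SIMD operation is safe only if
 * the stores to pos cannot change the values later read from vel. */
void step(float *pos, const float *vel, float dt) {
    pos[0] += vel[0] * dt;
    pos[1] += vel[1] * dt;
    pos[2] += vel[2] * dt;
    pos[3] += vel[3] * dt;
}

/* Caller: both arguments point into the same object but to different
 * fields. A field-insensitive model that treats *p as one abstract
 * object reports a may-alias; a field- and array-sensitive model can
 * distinguish the two ranges. */
void update(struct Particle *p) {
    scale(p->pos, p->vel, 4, 2.0f);
    step(p->pos, p->vel, 0.5f);
}

/* Usage: pos becomes vel * 2.0, then gains vel * 0.5. */
void demo(void) {
    struct Particle p = {{1.0f, 2.0f, 3.0f, 4.0f},
                         {1.0f, 1.0f, 1.0f, 1.0f}};
    update(&p);
    for (int i = 0; i < 4; i++)
        assert(p.pos[i] == 2.5f);
}
```

Whether a particular compiler actually vectorizes these functions depends on its version, flags, and inlining decisions; the sketch only shows where the alias question arises.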

    Published in

      ACM SIGPLAN Notices, Volume 51, Issue 5
      LCTES '16
      May 2016
      122 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2980930
      Editor: Andy Gill

      LCTES 2016: Proceedings of the 17th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools, and Theory for Embedded Systems
      June 2016
      122 pages
      ISBN: 9781450343169
      DOI: 10.1145/2907950

      Copyright © 2016 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States