research-article

EigenCFA: accelerating flow analysis with GPUs

Published: 26 January 2011
Abstract

We describe, implement and benchmark EigenCFA, an algorithm for accelerating higher-order control-flow analysis (specifically, 0CFA) with a GPU. Ultimately, our program transformations, reductions and optimizations achieve a factor of 72 speedup over an optimized CPU implementation.

We began our investigation with the view that GPUs accelerate high-arithmetic, data-parallel computations with a poor tolerance for branching. Taking that perspective to its limit, we reduced Shivers's abstract-interpretive 0CFA to an algorithm synthesized from linear-algebra operations. Central to this reduction were "abstract" Church encodings, and encodings of the syntax tree and abstract domains as vectors and matrices.
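To make the idea of encoding abstract domains as vectors and matrices concrete, here is a minimal sketch. It is not the paper's actual encoding: the store layout, the indices, and the helper names (`one_hot`, `bool_vec_or`, `bool_vec_mat`, `lookup`) are assumptions chosen for illustration. It shows a flow set as a 0/1 vector, a store as a 0/1 matrix, and a variable lookup expressed as a branch-free boolean product.

```python
# Toy illustration only: the indices and the store layout are hypothetical.
# store[v][l] == 1 means "variable v may be bound to lambda term l".

def one_hot(index, size):
    """Return a 0/1 vector of the given size with a single 1 at index."""
    return [1 if i == index else 0 for i in range(size)]

def bool_vec_or(a, b):
    """Elementwise OR: the join (set union) of two flow sets."""
    return [x | y for x, y in zip(a, b)]

def bool_vec_mat(vec, matrix):
    """Boolean vector-matrix product: out[l] = OR_v (vec[v] AND matrix[v][l])."""
    return [int(any(v & row[l] for v, row in zip(vec, matrix)))
            for l in range(len(matrix[0]))]

def lookup(store, var):
    """Fetch the flow set of var with a product instead of a branch or index."""
    return bool_vec_mat(one_hot(var, len(store)), store)

store = [
    [1, 0, 0],  # variable 0 may hold lambda 0
    [0, 1, 1],  # variable 1 may hold lambda 1 or lambda 2
    [0, 0, 0],  # variable 2 holds nothing yet
]

# Model "variable 2 receives whatever variable 1 holds" as a join.
store[2] = bool_vec_or(store[2], lookup(store, 1))
print(store[2])  # [0, 1, 1]
```

Every step here is the same arithmetic regardless of the data, which is the property that suits hardware with a poor tolerance for branching.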

A straightforward (dense-matrix) implementation of EigenCFA performed slower than a fast CPU implementation. Ultimately, sparse-matrix data structures and operations turned out to be the critical accelerants. Because control-flow graphs are sparse in practice (up to 96% empty), our control-flow matrices are also sparse, giving the sparse matrix operations an overwhelming space and speed advantage.
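The advantage of sparse storage can be sketched with a compressed-sparse-row (CSR) layout. CSR is one common sparse format and is used here as an assumption; the abstract does not say which format the authors chose. The 5x5 matrix and the function names below are hypothetical. For a 0/1 matrix, only the column indices of the set cells need to be stored, and the product visits only those cells.

```python
def to_csr(dense):
    """Convert a 0/1 matrix to CSR-style (col_idx, row_ptr); values are implicit 1s."""
    col_idx, row_ptr = [], [0]
    for row in dense:
        col_idx.extend(j for j, x in enumerate(row) if x)
        row_ptr.append(len(col_idx))
    return col_idx, row_ptr

def csr_bool_mat_vec(col_idx, row_ptr, vec):
    """Boolean product over stored entries only: out[i] = OR of vec[j] for row i's columns."""
    return [int(any(vec[j] for j in col_idx[row_ptr[i]:row_ptr[i + 1]]))
            for i in range(len(row_ptr) - 1)]

def dense_bool_mat_vec(dense, vec):
    """Reference dense product that visits every cell."""
    return [int(any(m & v for m, v in zip(row, vec))) for row in dense]

# Hypothetical 5x5 flow matrix with 4 of 25 cells set (84% empty).
dense = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
col_idx, row_ptr = to_csr(dense)
vec = [1, 0, 1, 0, 0]

print(col_idx, row_ptr)                         # [1, 2, 0, 4] [0, 1, 2, 2, 4, 4]
print(csr_bool_mat_vec(col_idx, row_ptr, vec))  # [0, 1, 0, 1, 0]
```

The sparse product does work proportional to the number of set cells (4 here) rather than the full 25, and the gap widens as the matrix grows while staying mostly empty.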

We also achieved speedups by carefully permitting data races. The monotonicity of 0CFA makes it sound to perform analysis operations in parallel, possibly using stale or even partially-updated data.
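The monotonicity argument can be illustrated with a small sequential simulation. This is a sketch under stated assumptions, not the authors' GPU code: it uses a generic inclusion-constraint system (each edge `(src, dst)` means the flow set of `dst` must include that of `src`), hypothetical node and value names, and it models only stale reads, not partially updated words or real threads. Because flow sets only ever grow, updates that read an outdated copy can delay information but never produce a wrong fact, so every schedule ends at the same least fixed point.

```python
def propagate(seeds, edges, stale_reads):
    """Iterate inclusion constraints to a fixed point.

    With stale_reads=True, every update in a sweep reads a frozen copy taken
    at the start of that sweep, mimicking workers that see out-of-date data.
    """
    flow = {node: set(vals) for node, vals in seeds.items()}
    changed = True
    while changed:
        changed = False
        view = {n: set(v) for n, v in flow.items()} if stale_reads else flow
        for src, dst in edges:
            missing = view[src] - flow[dst]
            if missing:
                flow[dst] |= missing  # sets only grow: the update is monotone
                changed = True
    return flow

# Hypothetical constraint system with a cycle between c and d.
seeds = {"a": {"lam1"}, "b": {"lam2"}, "c": set(), "d": set()}
edges = [("a", "c"), ("c", "d"), ("b", "d"), ("d", "c")]

results = [
    propagate(seeds, order, stale)
    for order in (edges, list(reversed(edges)))
    for stale in (False, True)
]
print(all(r == results[0] for r in results))  # True
print(sorted(results[0]["c"]))                # ['lam1', 'lam2']
```

Stale reads cost extra sweeps in this simulation, but the final answer is unchanged, which is why tolerating such races can be a reasonable trade for parallel throughput.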

Published in

  • ACM SIGPLAN Notices, Volume 46, Issue 1 (POPL '11), January 2011, 624 pages. ISSN: 0362-1340. EISSN: 1558-1160. DOI: 10.1145/1925844
  • POPL '11: Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, January 2011, 652 pages. ISBN: 9781450304900. DOI: 10.1145/1926385

Copyright © 2011 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
