skip to main content
research-article

Giga-scale exhaustive points-to analysis for Java in under a minute

Published:23 October 2015Publication History
Skip Abstract Section

Abstract

Computing a precise points-to analysis for very large Java programs remains challenging despite the large body of research on points-to analysis. Any approach must solve an underlying dynamic graph reachability problem, for which the best algorithms have near-cubic worst-case runtime complexity, and, hence, previous work does not scale to programs with millions of lines of code. In this work, we present a novel approach for solving the field-sensitive points-to problem for Java with the means of (1) a transitive-closure data-structure, and (2) a pre-computed set of potentially matching load/store pairs to accelerate the fix-point calculation. Experimentation on Java benchmarks validates the superior performance of our approach over the standard context-free language reachability implementations. Our approach computes a points-to index for the OpenJDK with over 1.5 billion tuples in under a minute.

Skip Supplemental Material Section

Supplemental Material

References

  1. R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient management of transitive relationships in large data and knowledge bases. In Proceedings SIGMOD’89. ACM, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. L. O. Andersen. Program analysis and specialization for the C programming language. PhD thesis, University of Copenhagen, 1994.Google ScholarGoogle Scholar
  3. V. Arlazarov, E. Dinic, M. Kronrod, and I. Faradzev. On economic construction of the transitive closure of a directed graph. Soviet Math. Dokl., 11:1209–1210, 1970.Google ScholarGoogle Scholar
  4. S. M. Blackburn, R. Garner, C. Hoffmann, A. M. Khang, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanovi´c, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The dacapo benchmarks: Java benchmarking development and analysis. In Proceedings OOPSLA’06. ACM, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. E. Bodden, A. Sewe, J. Sinschek, M. Mezini, and H. Oueslati. Taming reflection: Aiding static analysis in the presence of reflection and custom class loaders. In Proceeding ICSE ’11. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. R. Chatterjee, B. G. Ryder, and W. A. Landi. Relevant context inference. In Proceedings POPL’99. ACM, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Chaudhuri. Subcubic algorithms for recursive state machines. In Proceedings POPL’08. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SIAM Journal on Computing, 32(5):1338–1355, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. A. Colantonio and R. Di Pietro. Concise: Compressed ncomposable integer set. Information Processing Letters, 110(16): 644–650, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation, 9 (3):251–280, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. F. L. Gall. Powers of tensors and fast matrix multiplication. arXiv preprint arXiv:1401.7714, 2014.Google ScholarGoogle Scholar
  12. D. Grove and C. Chambers. A framework for call graph construction algorithms. ACM Transactions on Programming Languages and Systems (TOPLAS), 23(6):685–746, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. H. Jagadish. A compression technique to materialize transitive closure. ACM Transactions on Database Systems (TODS), 15 (4):558–598, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R. Jin, Y. Xiang, N. Ruan, and H. Wang. Efficiently answering reachability queries on very large directed graphs. In Proceedings SIGMOD’2008. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a highcompression indexing scheme for reachability query. In Proceedings SIGMOD’2009. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. O. Lhoták. Spark: A flexible points-to analysis framework for java, 2002.Google ScholarGoogle Scholar
  17. O. Lhoták and L. Hendren. Evaluating the benefits of context-sensitive points-to analysis using a bdd-based implementation. ACM Transactions on Software Engineering and Methodology (TOSEM), 18(1):3, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. V. B. Livshits and M. S. Lam. Finding security vulnerabilities in java applications with static analysis. In Proceedings USENIX’05, pages 18–18, Berkeley, CA, USA, 2005. USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. D. Melski and T. Reps. Interconvertibility of a class of set constraints and context-free-language reachability. Theoretical Computer Science, 248(1):29–98, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Milanova, A. Rountev, and B. G. Ryder. Parameterized object sensitivity for points-to and side-effect analyses for java. In ACM SIGSOFT Software Engineering Notes, volume 27, pages 1–11. ACM, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. E. Nuutila. Efficient transitive closure computation in large digraphs. PhD thesis, PhD thesis, Helsinki University of Technology, 1995. Acta Polytechnica Scandinavica, Mathematics and Computing in Engineering Series, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. D. J. Pearce, P. H. J. Kelly, and C. Hankin. Online cycle detection and difference propagation: Applications to pointer analysis. volume 12, pages 311–337. Kluwer, Dec. 2004.. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. T. Reps. Program analysis via graph reachability. Information and Software Technology, 40(11):701–726, 1998.Google ScholarGoogle ScholarCross RefCross Ref
  24. B. G. Ryder. Dimensions of precision in reference analysis of object-oriented programming languages. In Proceedings CC’03. Springer, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. L. Shang, X. Xie, and J. Xue. On-demand dynamic summarybased points-to analysis. In Proceedings CGO’12. ACM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. M. Sharir and A. Pnueli. Two approaches to interprocedural data flow analysis. New York University, 1978.Google ScholarGoogle Scholar
  27. O. Shivers. Control-flow analysis of higher-order languages. PhD thesis, Carnegie Mellon University Pittsburgh, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Y. Smaragdakis and M. Bravenboer. Using datalog for fast and easy program analysis. In O. de Moor, G. Gottlob, T. Furche, and A. Sellers, editors, Datalog Reloaded, volume 6702 of LNCS, pages 245–251. Springer Berlin Heidelberg, 2011. ISBN 978-3-642-24205-2.. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Y. Smaragdakis, M. Bravenboer, and O. Lhoták. Pick your contexts well: understanding object-sensitivity. ACM SIGPLAN Notices, 46(1):17–30, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Y. Smaragdakis, G. Kastrinis, and G. Balatsouras. Introspective analysis: context-sensitivity, across the board. In Proceedings PLDI’2014. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. M. Sridharan and R. Bod´ık. Refinement-based contextsensitive points-to analysis for java. In Proceedings PLDI’06. ACM, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. M. Sridharan and S. J. Fink. The complexity of Andersens analysis in practice. In Static Analysis, pages 205–221. Springer, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. M. Sridharan, D. Gopan, L. Shan, and R. Bod´ık. Demanddriven points-to analysis for java. In Proceedings OOPSLA’05. ACM, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. B. Steensgaard. Points-to analysis in almost linear time. In Proceedings POPL ’96. ACM, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. V. Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13(4):354–356, 1969. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. R. Tarjan. Depth-first search and linear graph algorithms. SIAM journal on computing, 1(2):146–160, 1972.Google ScholarGoogle Scholar
  37. E. Tempero. How fields are used in Java: An empirical study. In Proceedings ASWEC’09. IEEE, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. F. Tip and J. Palsberg. Scalable propagation-based call graph construction algorithms. In Proceedings OOPSLA’00. ACM, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan. Soot - a java bytecode optimization framework. In Proceedings CASCON ’99. IBM Press, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. S. J. van Schaik and O. de Moor. A memory efficient reachability data structure through bit vector compression. In Proceedings SIGMOD’11. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. J. Whaley and M. S. Lam. An efficient inclusion-based pointsto analysis for strictly-typed languages. In Static Analysis, pages 180–195. Springer, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. K. Wu, E. J. Otoo, and A. Shoshani. Optimizing bitmap indices with efficient compression. ACM Transactions on Database Systems (TODS), 31(1):1–38, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. G. Xu and A. Rountev. Merging equivalent contexts for scalable heap-cloning-based context-sensitive points-to analysis. In Proceedings ISSTA ’08. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. G. Xu, A. Rountev, and M. Sridharan. Scaling cflreachability-based points-to analysis using context-sensitive must-not-alias analysis. In Proceedings ECOOP’09. Springer, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. D. Yan, G. Xu, and A. Rountev. Demand-driven contextsensitive alias analysis for java. In Proceedings ISSTA’11. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. M. Yannakakis. Graph-theoretic methods in database theory. In Proceedings PODS’90. ACM, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. H. Yildirim, V. Chaoji, and M. J. Zaki. Grail: Scalable reachability index for large graphs. Proceedings of the VLDB Endowment, 3(1-2):276–284, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Q. Zhang, M. R. Lyu, H. Yuan, and Z. Su. Fast algorithms for dyck-cfl-reachability with applications to alias analysis. In Proceedings PLDI’13. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Giga-scale exhaustive points-to analysis for Java in under a minute

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM SIGPLAN Notices
      ACM SIGPLAN Notices  Volume 50, Issue 10
      OOPSLA '15
      October 2015
      953 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/2858965
      • Editor:
      • Andy Gill
      Issue’s Table of Contents
      • cover image ACM Conferences
        OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
        October 2015
        953 pages
        ISBN:9781450336895
        DOI:10.1145/2814270

      Copyright © 2015 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 23 October 2015

      Check for updates

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader
    About Cookies On This Site

    We use cookies to ensure that we give you the best experience on our website.

    Learn more

    Got it!