Abstract
Pointer information is a prerequisite for most program analyses, and the quality of this information can greatly affect their precision and performance. Inclusion-based (i.e. Andersen-style) pointer analysis is an important point in the space of pointer analyses, offering a potential sweet-spot in the trade-off between precision and performance. However, current techniques for inclusion-based pointer analysis can have difficulties delivering on this potential.
We introduce and evaluate two novel techniques for inclusion-based pointer analysis (one lazy, one eager) that significantly improve upon the current state of the art without impacting precision. These techniques focus on the problem of online cycle detection, a critical optimization for scaling such analyses. Using a suite of six open-source C programs, which range in size from 169K to 2.17M LOC, we compare our techniques against the three best inclusion-based analyses: those described by Heintze and Tardieu [11], by Pearce et al. [21], and by Berndl et al. [4]. The combination of our two techniques results in an algorithm that is on average 3.2x faster than Heintze and Tardieu's algorithm, 6.4x faster than Pearce et al.'s algorithm, and 20.6x faster than Berndl et al.'s algorithm.
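To make the role of cycles concrete, here is a minimal sketch of a textbook worklist solver for inclusion (Andersen-style) constraints. It is not the lazy or eager technique introduced in this paper, and it performs no cycle detection at all; the function name `solve` and the tuple encoding of constraints are assumptions made for illustration. It shows the property that cycle detection exploits: variables on a cycle of copy edges always end up with identical points-to sets, so they can be collapsed into one node without losing precision.

```python
from collections import defaultdict, deque

def solve(addr_of, copies, loads, stores):
    """Hypothetical helper: map each variable to its points-to set.

    addr_of: (p, a) pairs for  p = &a
    copies:  (p, q) pairs for  p = q
    loads:   (p, q) pairs for  p = *q
    stores:  (p, q) pairs for  *p = q
    """
    pts = defaultdict(set)          # variable -> points-to set
    succ = defaultdict(set)         # edge src -> dst means pts[src] is a subset of pts[dst]
    loads_from = defaultdict(list)  # q -> [p] for each p = *q
    stores_to = defaultdict(list)   # p -> [q] for each *p = q

    for p, a in addr_of:
        pts[p].add(a)
    for p, q in copies:
        succ[q].add(p)
    for p, q in loads:
        loads_from[q].append(p)
    for p, q in stores:
        stores_to[p].append(q)

    work = deque(pts.keys())
    while work:
        n = work.popleft()
        # Complex constraints: each pointee of n may introduce new edges.
        for v in list(pts[n]):
            for p in loads_from[n]:      # p = *n  adds edge v -> p
                if p not in succ[v]:
                    succ[v].add(p)
                    work.append(v)
            for q in stores_to[n]:       # *n = q  adds edge q -> v
                if v not in succ[q]:
                    succ[q].add(v)
                    work.append(q)
        # Propagate n's points-to set along its outgoing edges.
        for d in list(succ[n]):
            if not pts[n] <= pts[d]:
                pts[d] |= pts[n]
                work.append(d)
    return dict(pts)

# p = &a; q = &b; s = &x; p = q; q = r; r = p; *s = p; t = *s
result = solve(
    addr_of=[("p", "a"), ("q", "b"), ("s", "x")],
    copies=[("p", "q"), ("q", "r"), ("r", "p")],
    stores=[("s", "p")],
    loads=[("t", "s")],
)
```

In this example `p`, `q`, and `r` form a cycle of copy edges, and the solver repeatedly pushes the same set around that cycle before reaching a fixed point. Detecting the cycle during solving and merging its members removes that redundant propagation.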
We also investigate the use of different data structures to represent points-to sets, examining the impact on both performance and memory consumption. We compare a sparse-bitmap implementation used in the GCC compiler with a BDD-based implementation, and we find that the BDD implementation is on average 2x slower than using sparse bitmaps but uses 5.5x less memory.
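As a rough illustration of the sparse-bitmap side of this comparison, the sketch below stores a points-to set as a mapping from block index to a fixed-width bit pattern, keeping only non-empty blocks. It does not reproduce GCC's implementation; the class name `SparseBitmap`, the 128-bit block width, and the dictionary layout are assumptions chosen for brevity. The `union_with` method reports whether the set changed, which is the signal a worklist solver needs to decide whether to propagate further.

```python
class SparseBitmap:
    """Hypothetical sparse bitmap: only blocks containing a set bit are stored."""

    BLOCK_BITS = 128  # assumed block width, chosen for illustration

    def __init__(self, items=()):
        self.blocks = {}  # block index -> int holding that block's bits
        for i in items:
            self.add(i)

    def add(self, i):
        """Set bit i; return True if it was not already set."""
        idx, bit = divmod(i, self.BLOCK_BITS)
        old = self.blocks.get(idx, 0)
        new = old | (1 << bit)
        self.blocks[idx] = new
        return new != old

    def __contains__(self, i):
        idx, bit = divmod(i, self.BLOCK_BITS)
        return (self.blocks.get(idx, 0) >> bit) & 1 == 1

    def union_with(self, other):
        """In-place union; return True if this set gained any element."""
        changed = False
        for idx, word in other.blocks.items():
            old = self.blocks.get(idx, 0)
            new = old | word
            if new != old:
                self.blocks[idx] = new
                changed = True
        return changed

    def __iter__(self):
        """Yield the set bits in increasing order."""
        for idx in sorted(self.blocks):
            word = self.blocks[idx]
            base = idx * self.BLOCK_BITS
            while word:
                low = word & -word              # isolate the lowest set bit
                yield base + low.bit_length() - 1
                word ^= low                     # clear it and continue

s = SparseBitmap([3, 1_000_000])
t = SparseBitmap([3, 130])
first_union_changed = s.union_with(t)
second_union_changed = s.union_with(t)
```

Here the elements 3, 130, and 1,000,000 occupy three stored blocks rather than a dense array covering a million bits, and the second union reports no change.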
References
- Aesop. The Ant and the Grasshopper, from Aesop's Fables. Greece, 6th century BC.
- Lars Ole Andersen. Program Analysis and Specialization for the C Programming Language. PhD thesis, DIKU, University of Copenhagen, May 1994.
- Dzintars Avots, Michael Dalton, V. Benjamin Livshits, and Monica S. Lam. Improving software security with a C pointer analysis. In 27th International Conference on Software Engineering (ICSE), pages 332–341, 2005.
- Marc Berndl, Ondrej Lhotak, Feng Qian, Laurie Hendren, and Navindra Umanee. Points-to analysis using BDDs. In Programming Language Design and Implementation (PLDI), pages 103–114, 2003.
- Randal E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, C-35(8):677–691, August 1986.
- Jong-Deok Choi, Michael Burke, and Paul Carini. Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects. In Principles of Programming Languages (POPL), pages 232–245, 1993.
- Manuvir Das. Unification-based pointer analysis with directional assignments. In Programming Language Design and Implementation (PLDI), pages 35–46, 2000.
- Maryam Emami, Rakesh Ghiya, and Laurie J. Hendren. Context-sensitive interprocedural points-to analysis in the presence of function pointers. In Programming Language Design and Implementation (PLDI), pages 242–256, 1994.
- Manuel Faehndrich, Jeffrey S. Foster, Zhendong Su, and Alexander Aiken. Partial online cycle elimination in inclusion constraint graphs. In Programming Language Design and Implementation (PLDI), pages 85–96, 1998.
- Samuel Z. Guyer and Calvin Lin. Error checking with client-driven pointer analysis. Science of Computer Programming, 58(1-2):83–114, 2005.
- Nevin Heintze and Olivier Tardieu. Ultra-fast aliasing analysis using CLA: A million lines of C code in a second. In Programming Language Design and Implementation (PLDI), pages 24–34, 2001.
- Michael Hind. Pointer analysis: haven't we solved this problem yet? In Workshop on Program Analysis for Software Tools and Engineering (PASTE), pages 54–61, 2001.
- Michael Hind, Michael Burke, Paul Carini, and Jong-Deok Choi. Interprocedural pointer alias analysis. ACM Transactions on Programming Languages and Systems, 21(4):848–894, 1999.
- William Landi and Barbara G. Ryder. Pointer-induced aliasing: a problem taxonomy. In Symposium on Principles of Programming Languages (POPL), pages 93–103, 1991.
- William Landi and Barbara G. Ryder. A safe approximate algorithm for interprocedural pointer aliasing. In Programming Language Design and Implementation (PLDI), pages 235–248, 1992.
- J. Lind-Nielsen. BuDDy, a binary decision package. http://www.itu.dk/research/buddy/.
- George C. Necula, Scott McPeak, Shree Prakash Rahul, and Westley Weimer. CIL: Intermediate language and tools for analysis and transformation of C programs. In Compiler Construction (CC), pages 213–228, 2002.
- F. Nielson, H. R. Nielson, and C. L. Hankin. Principles of Program Analysis. Springer-Verlag, 1999.
- Esko Nuutila and Eljas Soisalon-Soininen. On finding the strong components in a directed graph. Technical Report TKO-B94, Helsinki University of Technology, Laboratory of Information Processing Science, 1995.
- Erik M. Nystrom, Hong-Seok Kim, and Wen-mei W. Hwu. Bottom-up and top-down context-sensitive summary-based pointer analysis. In International Symposium on Static Analysis, pages 165–180, 2004.
- David Pearce, Paul Kelly, and Chris Hankin. Efficient field-sensitive pointer analysis for C. In ACM Workshop on Program Analysis for Software Tools and Engineering (PASTE), pages 37–42, 2004.
- David J. Pearce, Paul H. J. Kelly, and Chris Hankin. Online cycle detection and difference propagation for pointer analysis. In 3rd International IEEE Workshop on Source Code Analysis and Manipulation (SCAM), pages 3–12, 2003.
- Atanas Rountev and Satish Chandra. Off-line variable substitution for scaling points-to analysis. In Programming Language Design and Implementation (PLDI), pages 47–56, 2000.
- M. Shapiro and S. Horwitz. The effects of the precision of pointer analysis. Lecture Notes in Computer Science, 1302:16–34, 1997.
- Bjarne Steensgaard. Points-to analysis in almost linear time. In Symposium on Principles of Programming Languages (POPL), pages 32–41, 1996.
- Robert Tarjan. Depth-first search and linear graph algorithms. SIAM J. Comput., 1(2):146–160, June 1972.
- Teck Bok Tok, Samuel Z. Guyer, and Calvin Lin. Efficient flow-sensitive interprocedural data-flow analysis in the presence of pointers. In 15th International Conference on Compiler Construction (CC), pages 17–31, 2006.
- John Whaley and Monica S. Lam. Cloning-based context-sensitive pointer alias analysis. In Programming Language Design and Implementation (PLDI), pages 131–144, 2004.
- Robert P. Wilson and Monica S. Lam. Efficient context-sensitive pointer analysis for C programs. In Programming Language Design and Implementation (PLDI), pages 1–12, 1995.
- Jianwen Zhu and Silvian Calman. Symbolic pointer analysis revisited. In Programming Language Design and Implementation (PLDI), pages 145–157, 2004.
The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code
PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation