Abstract
When dealing with millions of lines of code, we still cannot have our cake and eat it too: sparse value-flow analysis is powerful in checking source-sink problems, but existing work cannot escape the “pointer trap”, in which a precise points-to analysis limits scalability and an imprecise one seriously undermines precision. We present Pinpoint, a holistic approach that decomposes the cost of high-precision points-to analysis by precisely discovering local data dependence and delaying the expensive inter-procedural analysis through memorization. Such memorization enables on-demand slicing of only the necessary inter-procedural data dependence and path feasibility queries, which are then solved by a costly SMT solver. Experiments show that Pinpoint can check programs such as MySQL (around 2 million lines of code) within 1.5 hours. The overall false positive rate is also very low (14.3% to 23.6%). Pinpoint has discovered over forty real bugs in mature and extensively checked open source systems. The implementation of Pinpoint and all experimental results are freely available.
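To make the idea of a source-sink check with path feasibility concrete, here is a minimal sketch. It is not Pinpoint's implementation. The graph, the node labels, the guard encoding, and the function names are all hypothetical, and a brute-force enumeration over boolean conditions stands in for the SMT solver mentioned in the abstract.

```python
from itertools import product

# Hypothetical value-flow graph: node -> list of (successor, guard).
# A guard is a set of (condition_name, required_truth_value) literals
# that must hold for the value to flow along that edge.
EDGES = {
    "free(p)": [("q = p", {("c", True)})],
    "q = p": [("use(q)", {("c", False)}), ("log(q)", set())],
}

def paths(graph, source, sink, guard=frozenset(), seen=()):
    """Yield the accumulated guard of every acyclic path from source to sink."""
    if source == sink:
        yield guard
        return
    for succ, edge_guard in graph.get(source, []):
        if succ not in seen:
            yield from paths(graph, succ, sink,
                             guard | frozenset(edge_guard), seen + (source,))

def feasible(guard):
    """Brute-force stand-in for a solver query: can all literals hold at once?"""
    names = sorted({name for name, _ in guard})
    return any(
        all(dict(zip(names, values))[name] == want for name, want in guard)
        for values in product([False, True], repeat=len(names))
    )

def reachable(graph, source, sink):
    """Report a source-sink flow only if some connecting path is feasible."""
    return any(feasible(g) for g in paths(graph, source, sink))
```

In this toy graph the flow from `free(p)` to `use(q)` requires `c` to be both true and false, so `reachable` rejects it, while the flow to `log(q)` only requires `c` to be true and is reported. Filtering infeasible paths this way is what keeps a path-sensitive checker from raising the first flow as a false positive.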
Pinpoint: fast and precise sparse value flow analysis for million lines of code
PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation