Abstract
Building a cost-effective static analyser for real-world programs is still regarded as an art. One key contributor to this grim reputation is the difficulty of balancing the cost and the precision of an analyser. An ideal analyser should adapt to a given analysis task and avoid techniques that improve precision unnecessarily while increasing analysis cost. However, achieving this ideal is highly nontrivial and requires a large amount of engineering effort. In this paper we present a new approach for building an adaptive static analyser. In our approach, the analyser includes a sophisticated parameterised strategy that decides, for each part of a given program, whether or not to apply a precision-improving technique to that part. We present a method for learning a good parameter for such a strategy from an existing codebase via Bayesian optimisation. The learnt strategy is then used for new, unseen programs. Using our approach, we developed partially flow- and context-sensitive variants of a realistic C static analyser. The experimental results demonstrate that using Bayesian optimisation is crucial for learning from an existing codebase. They also show that, among all program queries that require flow- or context-sensitivity, our partially flow- and context-sensitive analysis answers 75% of them, while increasing the analysis cost by only 3.3x relative to the baseline flow- and context-insensitive analysis, rather than by 40x or more as in the fully sensitive version.
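To make the idea of a parameterised strategy concrete, here is a minimal sketch in Python. It rests on several assumptions that the abstract does not state: each program part is described by a small feature vector, the strategy scores parts with a linear function of a parameter w and picks the k highest-scoring parts, and a query counts as proved when every part it depends on was picked. All names (choose_parts, proved_queries, learn_parameter) and the toy data are hypothetical. The search over w uses plain random search purely as a stand-in; it is not Bayesian optimisation, which the abstract reports as crucial.

```python
import random


def score(features, w):
    """Linear score of one program part: dot product of its features and w."""
    return sum(f * x for f, x in zip(features, w))


def choose_parts(parts, w, k):
    """Return the names of the k highest-scoring parts.

    `parts` maps a part name (say, a variable) to its feature vector.
    """
    ranked = sorted(parts, key=lambda name: score(parts[name], w), reverse=True)
    return set(ranked[:k])


def proved_queries(chosen, queries):
    """Toy stand-in for running the analyser (assumption): a query is
    proved if every part it depends on is analysed precisely."""
    return sum(1 for needed in queries if needed <= chosen)


def learn_parameter(codebase, dim, trials=200, seed=0):
    """Search for a w that maximises proved queries over the codebase.

    Random search is only a placeholder here; it is NOT Bayesian
    optimisation, which would pick each next w using a probabilistic
    model of the objective.
    """
    rng = random.Random(seed)
    best_w, best_total = None, -1
    for _ in range(trials):
        w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        total = sum(
            proved_queries(choose_parts(parts, w, k), queries)
            for parts, k, queries in codebase
        )
        if total > best_total:
            best_w, best_total = w, total
    return best_w, best_total


# Hypothetical training codebase: (parts, budget k, queries) per program.
# The two features are made up for illustration.
program_a = ({"i": [1, 0], "buf": [0, 1], "n": [1, 0]}, 2, [{"i", "n"}, {"i"}])
program_b = ({"p": [0, 1], "len": [1, 0]}, 1, [{"len"}])

learnt_w, total = learn_parameter([program_a, program_b], dim=2)

# Reuse the learnt parameter on a new, unseen program.
unseen = {"idx": [1, 0], "tmp": [0, 1]}
print(choose_parts(unseen, learnt_w, 1))
```

The budget k is what keeps the analysis partial: only the selected parts pay for the precise treatment, which is how cost stays close to the insensitive baseline while most of the queries that need precision can still be answered.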
Learning a strategy for adapting a program analysis via Bayesian optimisation
OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications