Learning a strategy for adapting a program analysis via Bayesian optimisation

Published: 23 October 2015

Abstract

Building a cost-effective static analyser for real-world programs is still regarded as an art. One key contributor to this grim reputation is the difficulty of balancing the cost and the precision of an analyser. An ideal analyser should adapt to a given analysis task and avoid techniques that improve precision unnecessarily while increasing analysis cost. However, achieving this ideal is highly nontrivial and requires a large amount of engineering effort. In this paper we present a new approach for building an adaptive static analyser. In our approach, the analyser includes a sophisticated parameterised strategy that decides, for each part of a given program, whether or not to apply a precision-improving technique to that part. We present a method for learning a good parameter for such a strategy from an existing codebase via Bayesian optimisation. The learnt strategy is then used for new, unseen programs. Using our approach, we developed partially flow- and context-sensitive variants of a realistic C static analyser. The experimental results demonstrate that using Bayesian optimisation is crucial for learning from an existing codebase. They also show that, among all program queries that require flow- or context-sensitivity, our partially flow- and context-sensitive analysis answers 75% of them, while increasing the analysis cost by only 3.3x relative to the baseline flow- and context-insensitive analysis, rather than by 40x or more as the fully sensitive version does.
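The idea of a parameterised strategy can be made concrete with a small sketch. The abstract does not say how the strategy is represented, so everything below is an assumption made for illustration: program parts are described by hypothetical numeric feature vectors, the parameter is a weight vector, parts are ranked by a linear score, and the top k parts receive the precise treatment. The names `score`, `choose_parts`, `objective` and `analyse` are invented here and are not taken from the paper or from any analyser.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical representation: a program maps each part (for example a
# variable or a procedure) to a feature vector. Assumed for illustration.
Program = Dict[str, Sequence[float]]


def score(weights: Sequence[float], feats: Sequence[float]) -> float:
    """Linear score of one program part under the parameter `weights`."""
    return sum(w * f for w, f in zip(weights, feats))


def choose_parts(weights: Sequence[float], program: Program, k: int) -> List[str]:
    """Return the k highest-scoring parts; ties are broken by name."""
    ranked = sorted(program, key=lambda p: (-score(weights, program[p]), p))
    return ranked[:k]


def objective(
    weights: Sequence[float],
    codebase: Sequence[Program],
    k: int,
    analyse: Callable[[Program, List[str]], int],
) -> int:
    """Black-box objective for a learner: total number of queries answered
    over the training codebase when the strategy uses `weights`.
    `analyse` stands in for one run of the (expensive) static analyser."""
    return sum(analyse(prog, choose_parts(weights, prog, k)) for prog in codebase)


# Toy usage with made-up features and a stand-in analyser.
toy_program: Program = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
toy_weights = [0.5, -1.0]
selected = choose_parts(toy_weights, toy_program, 2)
total = objective(toy_weights, [toy_program, toy_program], 2,
                  lambda prog, chosen: len(chosen))
```

Under these assumptions, learning amounts to searching for the weight vector that maximises `objective`. Each evaluation runs the analyser over the whole codebase, which is the kind of expensive black-box function that Bayesian optimisation is designed for; the optimiser itself is deliberately left out of this sketch.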

Published in

ACM SIGPLAN Notices, Volume 50, Issue 10
OOPSLA '15
October 2015
953 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2858965
Editor: Andy Gill

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
October 2015
953 pages
ISBN: 9781450336895
DOI: 10.1145/2814270

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
