research-article

Bias-variance tradeoffs in program analysis

Published: 08 January 2014

Abstract

It is often the case that increasing the precision of a program analysis leads to worse results. Our thesis is that this phenomenon reflects fundamental limits on the ability of precise abstract domains to serve as the basis for inferring strong invariants of programs. We show that bias-variance tradeoffs, an idea from learning theory, explain why more precise abstractions do not necessarily lead to better results, and also provide practical techniques for coping with such limitations. Learning theory captures precision using a combinatorial quantity called the VC dimension. We compute the VC dimension of different abstractions and report on its usefulness as a precision metric for program analyses. We evaluate cross validation, a technique for addressing bias-variance tradeoffs, on an industrial-strength program verification tool called YOGI. The tool produced using cross validation has significantly better running time, finds new defects, and has fewer time-outs than the current production version. Finally, we make some recommendations for tackling bias-variance tradeoffs in program analysis.
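As a concrete illustration of the technique the abstract evaluates, the sketch below shows plain k-fold cross validation used to pick one analysis configuration out of several candidates by average held-out score. The `benchmarks`, `candidates`, and `score` inputs are hypothetical stand-ins for illustration; this is not YOGI's actual interface.

```python
# Minimal sketch of k-fold cross validation for selecting an analysis
# configuration. All names here (benchmarks, candidates, score) are
# illustrative assumptions, not part of any real tool's API.

def k_fold_cross_validation(benchmarks, candidates, score, k=5):
    """Return the candidate with the best average held-out score."""
    # Partition the benchmark set into k disjoint folds.
    folds = [benchmarks[i::k] for i in range(k)]
    best, best_score = None, float("-inf")
    for cand in candidates:
        total = 0.0
        for i in range(k):
            held_out = folds[i]
            # In a full tuning loop one would fit/tune on the remaining
            # folds; here each candidate is fixed, so we only score the
            # held-out fold.
            total += sum(score(cand, b) for b in held_out) / max(len(held_out), 1)
        avg = total / k
        if avg > best_score:
            best, best_score = cand, avg
    return best
```

The key design point is that each candidate is judged only on benchmarks outside the fold it is being scored on, which penalizes configurations that do well on a few programs but generalize poorly.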


Supplemental Material

d1_left_t7.mp4


• Published in

ACM SIGPLAN Notices, Volume 49, Issue 1 (POPL '14), January 2014, 661 pages. ISSN: 0362-1340, EISSN: 1558-1160, DOI: 10.1145/2578855.

POPL '14: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2014, 702 pages. ISBN: 9781450325448, DOI: 10.1145/2535838.

Copyright © 2014 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

