skip to main content

A formal foundation for symbolic evaluation with merging

Published:12 January 2022Publication History
Skip Abstract Section

Abstract

Reusable symbolic evaluators are a key building block of solver-aided verification and synthesis tools. A reusable evaluator reduces the semantics of all paths in a program to logical constraints, and a client tool uses these constraints to formulate a satisfiability query that is discharged with SAT or SMT solvers. The correctness of the evaluator is critical to the soundness of the tool and the domain properties it aims to guarantee. Yet so far, the trust in these evaluators has been based on an ad-hoc foundation of testing and manual reasoning.

This paper presents the first formal framework for reasoning about the behavior of reusable symbolic evaluators. We develop a new symbolic semantics for these evaluators that incorporates state merging. Symbolic evaluators use state merging to avoid path explosion and generate compact encodings. To accommodate a wide range of implementations, our semantics is parameterized by a symbolic factory, which abstracts away the details of merging and creation of symbolic values. The semantics targets a rich language that extends Core Scheme with assumptions and assertions, and thus supports branching, loops, and (first-class) procedures. The semantics is designed to support reusability, by guaranteeing two key properties: legality of the generated symbolic states, and the reducibility of symbolic evaluation to concrete evaluation. Legality makes it simpler for client tools to formulate queries, and reducibility enables testing of client tools on concrete inputs. We use the Lean theorem prover to mechanize our symbolic semantics, prove that it is sound and complete with respect to the concrete semantics, and prove that it guarantees legality and reducibility.

To demonstrate the generality of our semantics, we develop Leanette, a reference evaluator written in Lean, and Rosette 4, an optimized evaluator written in Racket. We prove Leanette correct with respect to the semantics, and validate Rosette 4 against Leanette via solver-aided differential testing. To demonstrate the practicality of our approach, we port 16 published verification and synthesis tools from Rosette 3 to Rosette 4. Rosette 3 is an existing reusable evaluator that implements the classic merging semantics, adopted from bounded model checking. Rosette 4 replaces the semantic core of Rosette 3 but keeps its optimized symbolic factory. Our results show that Rosette 4 matches the performance of Rosette 3 across a wide range of benchmarks, while providing a cleaner interface that simplifies the implementation of client tools.

Skip Supplemental Material Section

Supplemental Material

Auxiliary Presentation Video

This is a presentation video of my lightning talk for POPL 2022 on our paper, "A formal foundation for symbolic evaluation with merging."

References

  1. Amazon Web Services. 2018. Quivela. https://github.com/jamesbornholt/quivelaGoogle ScholarGoogle Scholar
  2. Clark W. Barrett, Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanovic, Tim King, Andrew Reynolds, and Cesare Tinelli. 2011. CVC4. In Proceedings of the 23rd International Conference on Computer Aided Verification (CAV), Ganesh Gopalakrishnan and Shaz Qadeer (Eds.) (Lecture Notes in Computer Science, Vol. 6806). Springer, 171–177. https://doi.org/10.1007/978-3-642-22110-1_14 Google ScholarGoogle ScholarCross RefCross Ref
  3. Armin Biere, Alessandro Cimatti, Edmund M. Clarke, and Yunshan Zhu. 1999. Symbolic Model Checking Without BDDs. In Proceedings of the 5th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). Amsterdam, The Netherlands. 193–207.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Jasmin Christian Blanchette, Mathias Fleury, and Christoph Weidenbach. 2017. A Verified SAT Solver Framework with Learn, Forget, Restart, and Incrementality. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI). Melbourne, Australia. 4786–4790. https://doi.org/10.24963/ijcai.2017/667 Google ScholarGoogle ScholarCross RefCross Ref
  5. Rastislav Bodik, Satish Chandra, Joel Galenson, Doug Kimelman, Nicholas Tung, Shaon Barman, and Casey Rodarmor. 2010. Programming with Angelic Nondeterminism. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Madrid, Spain. 339–352. https://doi.org/10.1145/1706299.1706339 Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. James Bornholt, Antoine Kaufmann, Jialin Li, Arvind Krishnamurthy, Emina Torlak, and Xi Wang. 2016. Specifying and checking file system crash-consistency models. In Proceedings of the 21st International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Atlanta, GA, USA. 83–98. https://doi.org/10.1145/2872362.2872406 Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. James Bornholt and Emina Torlak. 2017. Synthesizing Memory Models from Framework Sketches and Litmus Tests. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Barcelona, Spain. 467–481. https://doi.org/10.1145/3140587.3062353 Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. James Bornholt and Emina Torlak. 2018. Finding Code That Explodes Under Symbolic Evaluation. In Proceedings of the 2018 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Boston, MA. Article 149, 26 pages. https://doi.org/10.1145/3276519 Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Alan Borning. 2016. Wallingford: Toward a Constraint Reactive Programming Language. In Proceedings of the Constrained and Reactive Objects Workshop (CROW). Málaga, Spain. 45–49. https://doi.org/10.1145/2892664.2892667 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Eric Butler, Emina Torlak, and Zoran Popović. 2017. Synthesizing Interpretable Strategies for Solving Puzzle Games. In Proceedings of the 12th International Conference on the Foundations of Digital Games (FDG). Hyannis, MA. Article 10, 17 pages. https://doi.org/10.1145/3102071.3102084 Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Kartik Chandra and Rastislav Bodik. 2018. Bonsai: Synthesis-Based Reasoning for Type Systems. Proc. ACM Program. Lang., 2, POPL (2018), Jan., 62:1–62:34. https://doi.org/10.1145/3158150 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Adam Chlipala. 2010. A Verified Compiler for an Impure Functional Language. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Madrid, Spain. 93–106. https://doi.org/10.1145/1706299.1706312 Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Shumo Chu, Chenglong Wang, Konstantin Weitz, and Alvin Cheung. 2017. Cosette: An Automated Prover for SQL. In Proceedings of the 8th Biennial Conference on Innovative Data Systems (CIDR). Chaminade, CA.Google ScholarGoogle Scholar
  14. Edmund Clarke, Daniel Kroening, and Flavio Lerda. 2004. A Tool for Checking ANSI-C Programs. In Proceedings of the 10th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). Barcelona, Spain. 168–176.Google ScholarGoogle ScholarCross RefCross Ref
  15. Edmund Clarke, Daniel Kroening, and Karen Yorav. 2003. Behavioral Consistency of C and Verilog Programs. Carnegie Mellon University. https://doi.org/10.1145/775832.775928 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Lori A. Clarke. 1976. A System to Generate Test Data and Symbolically Execute Programs. IEEE Transactions on Software Engineering, 2, 3 (1976), 215–222. https://doi.org/10.1109/TSE.1976.233817 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Patrick Cousot and Radhia Cousot. 1977. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In Proceedings of the 4th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Los Angeles, CA. 238–252. https://doi.org/10.1145/512950.512973 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Patrick Cousot and Radhia Cousot. 1979. Systematic Design of Program Analysis Frameworks. In Proceedings of the 6th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). San Antonio, TX. 269–282. https://doi.org/10.1145/567752.567778 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Luís Cruz-Filipe, Marijn J. H. Heule, Warren A. Hunt, Matt Kaufmann, and Peter Schneider-Kamp. 2017. Efficient Certified RAT Verification. In Proceedings of the 26th International Conference on Automated Deduction (CADE). Gothenburg, Sweden. 220–236.Google ScholarGoogle ScholarCross RefCross Ref
  20. Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). Budapest, Hungary. 337–340.Google ScholarGoogle ScholarCross RefCross Ref
  21. Leonardo de Moura, Soonho Kong, Jeremy Avigad, Floris van Doorn, and Jakob von Raumer. 2015. The Lean Theorem Prover. In Proceedings of the 25th International Conference on Automated Deduction (CADE). Berlin, Germany. 378–388.Google ScholarGoogle ScholarCross RefCross Ref
  22. Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, Shriram Krishnamurthi, Eli Barzilay, Jay McCarthy, and Sam Tobin-Hochstadt. 2018. A Programmable Programming Language. Commun. ACM, 61, 3 (2018), March, 62–71. https://doi.org/10.1145/3127323 Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Cormac Flanagan, Amr Sabry, Bruce F. Duba, and Matthias Felleisen. 1993. The Essence of Compiling with Continuations. In Proceedings of the 14th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Albuquerque, NM. 237–247. https://doi.org/10.1145/173262.155113 Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Matthew Flatt and PLT. 2010. Reference: Racket. PLT Design Inc..Google ScholarGoogle Scholar
  25. Matt Fleming. 2017. A thorough introduction to eBPF. https://lwn.net/Articles/740157/Google ScholarGoogle Scholar
  26. José Fragoso Santos, Petar Maksimović, Sacha-Élie Ayoun, and Philippa Gardner. 2020. Gillian, Part I: A Multi-Language Platform for Symbolic Execution. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). London, United Kingdom. 927–942. https://doi.org/10.1145/3385412.3386014 Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Jacques-Henri Jourdan. 2016. Verasco: a Formally Verified C Static Analyzer. Universite Paris Diderot-Paris VII. https://hal.archives-ouvertes.fr/tel-01327023Google ScholarGoogle Scholar
  28. Jacques-Henri Jourdan, Vincent Laporte, Sandrine Blazy, Xavier Leroy, and David Pichardie. 2015. A Formally-Verified C Static Analyzer. In Proceedings of the 42nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Mumbai, India. 247–259. https://doi.org/10.1145/2676726.2676966 Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Milod Kazerounian, Niki Vazou, Austin Bourgerie, Jeffrey S. Foster, and Emina Torlak. 2018. Refinement Types for Ruby. In Proceedings of the 19th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI). Los Angeles, CA, USA. 269–290.Google ScholarGoogle ScholarCross RefCross Ref
  30. James C. King. 1976. Symbolic Execution and Program Testing. Commun. ACM, 19, 7 (1976), 385–394. https://doi.org/10.1145/360248.360252 Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Dorel Lucanu, Vlad Rusu, and Andrei Arusoaie. 2017. A generic framework for symbolic execution: A coinductive approach. Journal of Symbolic Computation, 80 (2017), May–June, 125–163. https://doi.org/10.1016/j.jsc.2016.07.012 Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. William M. McKeeman. 1998. Differential Testing for Software. Digital Technical Journal, 10, 1 (1998), 100–107.Google ScholarGoogle Scholar
  33. Luke Nelson, James Bornholt, Ronghui Gu, Andrew Baumann, Emina Torlak, and Xi Wang. 2019. Scaling symbolic evaluation for automated verification of systems code with Serval. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP). Huntsville, Ontario, Canada. 225–242. https://doi.org/10.1145/3341301.3359641 Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Luke Nelson, Jacob Van Geffen, Emina Torlak, and Xi Wang. 2020. Specification and verification in the field: Applying formal methods to BPF just-in-time compilers in the Linux kernel. In Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI). Banff, Alberta, Canada. 41–61.Google ScholarGoogle Scholar
  35. Max S. New, Burke Fetscher, Robert Bruce Findler, and Jay McCarthy. 2017. Fair enumeration combinators. Journal of Functional Programming, 27 (2017), e19. https://doi.org/10.1017/S0956796817000107 Google ScholarGoogle ScholarCross RefCross Ref
  36. Phúc C. Nguyên, Sam Tobin-Hochstadt, and David Van Horn. 2017. Higher order symbolic execution for contract verification and refutation. Journal of Functional Programming, 27 (2017), e3. https://doi.org/10.1017/S0956796816000216 Google ScholarGoogle ScholarCross RefCross Ref
  37. Aina Niemetz, Mathias Preiner, and Armin Biere. 2014. Boolector 2.0. J. Satisf. Boolean Model. Comput., 9, 1 (2014), 53–58. https://doi.org/10.3233/sat190101 Google ScholarGoogle ScholarCross RefCross Ref
  38. Stuart Pernsteiner, Calvin Loncaric, Emina Torlak, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Jonathan Jacky. 2016. Investigating Safety of a Radiotherapy Machine Using System Models with Pluggable Checkers. In Proceedings of the 28th International Conference on Computer Aided Verification (CAV). 2, Toronto, ON, Canada. 23–41.Google ScholarGoogle ScholarCross RefCross Ref
  39. Phitchaya Mangpo Phothilimthana, Aditya Thakur, Rastislav Bodik, and Dinakar Dhurjati. 2016. Scaling Up Superoptimization. In Proceedings of the 21st International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Atlanta, GA, USA. 297–310. https://doi.org/10.1145/2872362.2872387 Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Koushik Sen, George Necula, Liang Gong, and Wontae Choi. 2015. MultiSE: multi-path symbolic execution using value summaries. In Proceedings of the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE). Bergamo, Italy. 842–853. https://doi.org/10.1145/2786805.2786830 Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Vijay Saraswat, and Sanjit Seshia. 2006. Combinatorial Sketching for Finite Programs. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). San Jose, CA, USA. 404–415. https://doi.org/10.1145/1168857.1168907 Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Sol Otis Swords. 2010. A verified framework for symbolic execution in the ACL2 theorem prover. Ph.D. Dissertation. The University of Texas at Austin.Google ScholarGoogle Scholar
  43. Emina Torlak and Rastislav Bodik. 2013. Growing Solver-Aided Languages with Rosette. In Proceedings of the 2013 ACM Symposium on New Ideas in Programming and Reflections on Software (Onward!). Indianapolis, IN, USA. 135–152. https://doi.org/10.1145/2509578.2509586 Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Emina Torlak and Rastislav Bodik. 2014. A Lightweight Symbolic Virtual Machine for Solver-Aided Host Languages. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Edinburgh, United Kingdom. 530–541. https://doi.org/10.1145/2666356.2594340 Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Richard Uhler and Nirav Dave. 2014. Smten with Satisfiability-Based Search. In Proceedings of the 29th ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Portland, OR, USA. 157–176. https://doi.org/10.1145/2714064.2660208 Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Konstantin Weitz, Steven Lyubomirsky, Stefan Heule, Emina Torlak, Michael D. Ernst, and Zachary Tatlock. 2017. SpaceSearch: A Library for Building and Verifying Solver-Aided Tools. In Proceedings of the 22nd ACM SIGPLAN International Conference on Functional Programming (ICFP). Oxford, United Kingdom. Article 25, 28 pages. https://doi.org/10.1145/3110269 Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Konstantin Weitz, Doug Woos, Emina Torlak, Michael D. Ernst, Arvind Krishnamurthy, and Zachary Tatlock. 2016. Scalable Verification of Border Gateway Protocol Configurations with an SMT Solver. In Proceedings of the 31st ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Amsterdam, The Netherlands. 765–780. https://doi.org/10.1145/2983990.2984012 Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Max Willsey, Ashley P. Stephenson, Chris Takahashi, Pranav Vaid, Bichlien H. Nguyen, Michal Piszczek, Christine Betts, Sharon Newman, Sarang Joshi, Karin Strauss, and Luis Ceze. 2019. Puddle: A Dynamic, Error-Correcting, Full-Stack Microfluidics Platform. In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Providence, RI. 183–197. https://doi.org/10.1145/3297858.3304027 Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Dominik Winterer, Chengyu Zhang, and Zhendong Su. 2020. Validating SMT Solvers via Semantic Fusion. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). London, United Kingdom. 718–730. https://doi.org/10.1145/3385412.3385985 Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. A formal foundation for symbolic evaluation with merging

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader
        About Cookies On This Site

        We use cookies to ensure that we give you the best experience on our website.

        Learn more

        Got it!