Abstract
Reusable symbolic evaluators are a key building block of solver-aided verification and synthesis tools. A reusable evaluator reduces the semantics of all paths in a program to logical constraints, and a client tool uses these constraints to formulate a satisfiability query that is discharged with SAT or SMT solvers. The correctness of the evaluator is critical to the soundness of the tool and the domain properties it aims to guarantee. Yet so far, the trust in these evaluators has been based on an ad-hoc foundation of testing and manual reasoning.
This paper presents the first formal framework for reasoning about the behavior of reusable symbolic evaluators. We develop a new symbolic semantics for these evaluators that incorporates state merging. Symbolic evaluators use state merging to avoid path explosion and generate compact encodings. To accommodate a wide range of implementations, our semantics is parameterized by a symbolic factory, which abstracts away the details of merging and creation of symbolic values. The semantics targets a rich language that extends Core Scheme with assumptions and assertions, and thus supports branching, loops, and (first-class) procedures. The semantics is designed to support reusability, by guaranteeing two key properties: legality of the generated symbolic states, and the reducibility of symbolic evaluation to concrete evaluation. Legality makes it simpler for client tools to formulate queries, and reducibility enables testing of client tools on concrete inputs. We use the Lean theorem prover to mechanize our symbolic semantics, prove that it is sound and complete with respect to the concrete semantics, and prove that it guarantees legality and reducibility.
To demonstrate the generality of our semantics, we develop Leanette, a reference evaluator written in Lean, and Rosette 4, an optimized evaluator written in Racket. We prove Leanette correct with respect to the semantics, and validate Rosette 4 against Leanette via solver-aided differential testing. To demonstrate the practicality of our approach, we port 16 published verification and synthesis tools from Rosette 3 to Rosette 4. Rosette 3 is an existing reusable evaluator that implements the classic merging semantics, adopted from bounded model checking. Rosette 4 replaces the semantic core of Rosette 3 but keeps its optimized symbolic factory. Our results show that Rosette 4 matches the performance of Rosette 3 across a wide range of benchmarks, while providing a cleaner interface that simplifies the implementation of client tools.
Supplemental Material
- Amazon Web Services. 2018. Quivela. https://github.com/jamesbornholt/quivelaGoogle Scholar
- Clark W. Barrett, Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanovic, Tim King, Andrew Reynolds, and Cesare Tinelli. 2011. CVC4. In Proceedings of the 23rd International Conference on Computer Aided Verification (CAV), Ganesh Gopalakrishnan and Shaz Qadeer (Eds.) (Lecture Notes in Computer Science, Vol. 6806). Springer, 171–177. https://doi.org/10.1007/978-3-642-22110-1_14 Google Scholar
Cross Ref
- Armin Biere, Alessandro Cimatti, Edmund M. Clarke, and Yunshan Zhu. 1999. Symbolic Model Checking Without BDDs. In Proceedings of the 5th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). Amsterdam, The Netherlands. 193–207.Google Scholar
Digital Library
- Jasmin Christian Blanchette, Mathias Fleury, and Christoph Weidenbach. 2017. A Verified SAT Solver Framework with Learn, Forget, Restart, and Incrementality. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI). Melbourne, Australia. 4786–4790. https://doi.org/10.24963/ijcai.2017/667 Google Scholar
Cross Ref
- Rastislav Bodik, Satish Chandra, Joel Galenson, Doug Kimelman, Nicholas Tung, Shaon Barman, and Casey Rodarmor. 2010. Programming with Angelic Nondeterminism. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Madrid, Spain. 339–352. https://doi.org/10.1145/1706299.1706339 Google Scholar
Digital Library
- James Bornholt, Antoine Kaufmann, Jialin Li, Arvind Krishnamurthy, Emina Torlak, and Xi Wang. 2016. Specifying and checking file system crash-consistency models. In Proceedings of the 21st International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Atlanta, GA, USA. 83–98. https://doi.org/10.1145/2872362.2872406 Google Scholar
Digital Library
- James Bornholt and Emina Torlak. 2017. Synthesizing Memory Models from Framework Sketches and Litmus Tests. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Barcelona, Spain. 467–481. https://doi.org/10.1145/3140587.3062353 Google Scholar
Digital Library
- James Bornholt and Emina Torlak. 2018. Finding Code That Explodes Under Symbolic Evaluation. In Proceedings of the 2018 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Boston, MA. Article 149, 26 pages. https://doi.org/10.1145/3276519 Google Scholar
Digital Library
- Alan Borning. 2016. Wallingford: Toward a Constraint Reactive Programming Language. In Proceedings of the Constrained and Reactive Objects Workshop (CROW). Málaga, Spain. 45–49. https://doi.org/10.1145/2892664.2892667 Google Scholar
Digital Library
- Eric Butler, Emina Torlak, and Zoran Popović. 2017. Synthesizing Interpretable Strategies for Solving Puzzle Games. In Proceedings of the 12th International Conference on the Foundations of Digital Games (FDG). Hyannis, MA. Article 10, 17 pages. https://doi.org/10.1145/3102071.3102084 Google Scholar
Digital Library
- Kartik Chandra and Rastislav Bodik. 2018. Bonsai: Synthesis-Based Reasoning for Type Systems. Proc. ACM Program. Lang., 2, POPL (2018), Jan., 62:1–62:34. https://doi.org/10.1145/3158150 Google Scholar
Digital Library
- Adam Chlipala. 2010. A Verified Compiler for an Impure Functional Language. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Madrid, Spain. 93–106. https://doi.org/10.1145/1706299.1706312 Google Scholar
Digital Library
- Shumo Chu, Chenglong Wang, Konstantin Weitz, and Alvin Cheung. 2017. Cosette: An Automated Prover for SQL. In Proceedings of the 8th Biennial Conference on Innovative Data Systems (CIDR). Chaminade, CA.Google Scholar
- Edmund Clarke, Daniel Kroening, and Flavio Lerda. 2004. A Tool for Checking ANSI-C Programs. In Proceedings of the 10th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). Barcelona, Spain. 168–176.Google Scholar
Cross Ref
- Edmund Clarke, Daniel Kroening, and Karen Yorav. 2003. Behavioral Consistency of C and Verilog Programs. Carnegie Mellon University. https://doi.org/10.1145/775832.775928 Google Scholar
Digital Library
- Lori A. Clarke. 1976. A System to Generate Test Data and Symbolically Execute Programs. IEEE Transactions on Software Engineering, 2, 3 (1976), 215–222. https://doi.org/10.1109/TSE.1976.233817 Google Scholar
Digital Library
- Patrick Cousot and Radhia Cousot. 1977. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In Proceedings of the 4th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Los Angeles, CA. 238–252. https://doi.org/10.1145/512950.512973 Google Scholar
Digital Library
- Patrick Cousot and Radhia Cousot. 1979. Systematic Design of Program Analysis Frameworks. In Proceedings of the 6th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). San Antonio, TX. 269–282. https://doi.org/10.1145/567752.567778 Google Scholar
Digital Library
- Luís Cruz-Filipe, Marijn J. H. Heule, Warren A. Hunt, Matt Kaufmann, and Peter Schneider-Kamp. 2017. Efficient Certified RAT Verification. In Proceedings of the 26th International Conference on Automated Deduction (CADE). Gothenburg, Sweden. 220–236.Google Scholar
Cross Ref
- Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). Budapest, Hungary. 337–340.Google Scholar
Cross Ref
- Leonardo de Moura, Soonho Kong, Jeremy Avigad, Floris van Doorn, and Jakob von Raumer. 2015. The Lean Theorem Prover. In Proceedings of the 25th International Conference on Automated Deduction (CADE). Berlin, Germany. 378–388.Google Scholar
Cross Ref
- Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, Shriram Krishnamurthi, Eli Barzilay, Jay McCarthy, and Sam Tobin-Hochstadt. 2018. A Programmable Programming Language. Commun. ACM, 61, 3 (2018), March, 62–71. https://doi.org/10.1145/3127323 Google Scholar
Digital Library
- Cormac Flanagan, Amr Sabry, Bruce F. Duba, and Matthias Felleisen. 1993. The Essence of Compiling with Continuations. In Proceedings of the 14th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Albuquerque, NM. 237–247. https://doi.org/10.1145/173262.155113 Google Scholar
Digital Library
- Matthew Flatt and PLT. 2010. Reference: Racket. PLT Design Inc..Google Scholar
- Matt Fleming. 2017. A thorough introduction to eBPF. https://lwn.net/Articles/740157/Google Scholar
- José Fragoso Santos, Petar Maksimović, Sacha-Élie Ayoun, and Philippa Gardner. 2020. Gillian, Part I: A Multi-Language Platform for Symbolic Execution. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). London, United Kingdom. 927–942. https://doi.org/10.1145/3385412.3386014 Google Scholar
Digital Library
- Jacques-Henri Jourdan. 2016. Verasco: a Formally Verified C Static Analyzer. Universite Paris Diderot-Paris VII. https://hal.archives-ouvertes.fr/tel-01327023Google Scholar
- Jacques-Henri Jourdan, Vincent Laporte, Sandrine Blazy, Xavier Leroy, and David Pichardie. 2015. A Formally-Verified C Static Analyzer. In Proceedings of the 42nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). Mumbai, India. 247–259. https://doi.org/10.1145/2676726.2676966 Google Scholar
Digital Library
- Milod Kazerounian, Niki Vazou, Austin Bourgerie, Jeffrey S. Foster, and Emina Torlak. 2018. Refinement Types for Ruby. In Proceedings of the 19th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI). Los Angeles, CA, USA. 269–290.Google Scholar
Cross Ref
- James C. King. 1976. Symbolic Execution and Program Testing. Commun. ACM, 19, 7 (1976), 385–394. https://doi.org/10.1145/360248.360252 Google Scholar
Digital Library
- Dorel Lucanu, Vlad Rusu, and Andrei Arusoaie. 2017. A generic framework for symbolic execution: A coinductive approach. Journal of Symbolic Computation, 80 (2017), May–June, 125–163. https://doi.org/10.1016/j.jsc.2016.07.012 Google Scholar
Digital Library
- William M. McKeeman. 1998. Differential Testing for Software. Digital Technical Journal, 10, 1 (1998), 100–107.Google Scholar
- Luke Nelson, James Bornholt, Ronghui Gu, Andrew Baumann, Emina Torlak, and Xi Wang. 2019. Scaling symbolic evaluation for automated verification of systems code with Serval. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP). Huntsville, Ontario, Canada. 225–242. https://doi.org/10.1145/3341301.3359641 Google Scholar
Digital Library
- Luke Nelson, Jacob Van Geffen, Emina Torlak, and Xi Wang. 2020. Specification and verification in the field: Applying formal methods to BPF just-in-time compilers in the Linux kernel. In Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI). Banff, Alberta, Canada. 41–61.Google Scholar
- Max S. New, Burke Fetscher, Robert Bruce Findler, and Jay McCarthy. 2017. Fair enumeration combinators. Journal of Functional Programming, 27 (2017), e19. https://doi.org/10.1017/S0956796817000107 Google Scholar
Cross Ref
- Phúc C. Nguyên, Sam Tobin-Hochstadt, and David Van Horn. 2017. Higher order symbolic execution for contract verification and refutation. Journal of Functional Programming, 27 (2017), e3. https://doi.org/10.1017/S0956796816000216 Google Scholar
Cross Ref
- Aina Niemetz, Mathias Preiner, and Armin Biere. 2014. Boolector 2.0. J. Satisf. Boolean Model. Comput., 9, 1 (2014), 53–58. https://doi.org/10.3233/sat190101 Google Scholar
Cross Ref
- Stuart Pernsteiner, Calvin Loncaric, Emina Torlak, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Jonathan Jacky. 2016. Investigating Safety of a Radiotherapy Machine Using System Models with Pluggable Checkers. In Proceedings of the 28th International Conference on Computer Aided Verification (CAV). 2, Toronto, ON, Canada. 23–41.Google Scholar
Cross Ref
- Phitchaya Mangpo Phothilimthana, Aditya Thakur, Rastislav Bodik, and Dinakar Dhurjati. 2016. Scaling Up Superoptimization. In Proceedings of the 21st International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Atlanta, GA, USA. 297–310. https://doi.org/10.1145/2872362.2872387 Google Scholar
Digital Library
- Koushik Sen, George Necula, Liang Gong, and Wontae Choi. 2015. MultiSE: multi-path symbolic execution using value summaries. In Proceedings of the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE). Bergamo, Italy. 842–853. https://doi.org/10.1145/2786805.2786830 Google Scholar
Digital Library
- Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Vijay Saraswat, and Sanjit Seshia. 2006. Combinatorial Sketching for Finite Programs. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). San Jose, CA, USA. 404–415. https://doi.org/10.1145/1168857.1168907 Google Scholar
Digital Library
- Sol Otis Swords. 2010. A verified framework for symbolic execution in the ACL2 theorem prover. Ph.D. Dissertation. The University of Texas at Austin.Google Scholar
- Emina Torlak and Rastislav Bodik. 2013. Growing Solver-Aided Languages with Rosette. In Proceedings of the 2013 ACM Symposium on New Ideas in Programming and Reflections on Software (Onward!). Indianapolis, IN, USA. 135–152. https://doi.org/10.1145/2509578.2509586 Google Scholar
Digital Library
- Emina Torlak and Rastislav Bodik. 2014. A Lightweight Symbolic Virtual Machine for Solver-Aided Host Languages. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Edinburgh, United Kingdom. 530–541. https://doi.org/10.1145/2666356.2594340 Google Scholar
Digital Library
- Richard Uhler and Nirav Dave. 2014. Smten with Satisfiability-Based Search. In Proceedings of the 29th ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Portland, OR, USA. 157–176. https://doi.org/10.1145/2714064.2660208 Google Scholar
Digital Library
- Konstantin Weitz, Steven Lyubomirsky, Stefan Heule, Emina Torlak, Michael D. Ernst, and Zachary Tatlock. 2017. SpaceSearch: A Library for Building and Verifying Solver-Aided Tools. In Proceedings of the 22nd ACM SIGPLAN International Conference on Functional Programming (ICFP). Oxford, United Kingdom. Article 25, 28 pages. https://doi.org/10.1145/3110269 Google Scholar
Digital Library
- Konstantin Weitz, Doug Woos, Emina Torlak, Michael D. Ernst, Arvind Krishnamurthy, and Zachary Tatlock. 2016. Scalable Verification of Border Gateway Protocol Configurations with an SMT Solver. In Proceedings of the 31st ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Amsterdam, The Netherlands. 765–780. https://doi.org/10.1145/2983990.2984012 Google Scholar
Digital Library
- Max Willsey, Ashley P. Stephenson, Chris Takahashi, Pranav Vaid, Bichlien H. Nguyen, Michal Piszczek, Christine Betts, Sharon Newman, Sarang Joshi, Karin Strauss, and Luis Ceze. 2019. Puddle: A Dynamic, Error-Correcting, Full-Stack Microfluidics Platform. In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Providence, RI. 183–197. https://doi.org/10.1145/3297858.3304027 Google Scholar
Digital Library
- Dominik Winterer, Chengyu Zhang, and Zhendong Su. 2020. Validating SMT Solvers via Semantic Fusion. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). London, United Kingdom. 718–730. https://doi.org/10.1145/3385412.3385985 Google Scholar
Digital Library
Index Terms
A formal foundation for symbolic evaluation with merging
Recommendations
Symbolic Testing and the DISSECT Symbolic Evaluation System
Symbolic testing and a symbolic evaluation system called DISSECT are described. The principle features of DISSECT are outlined. The results of two classes of experiments in the use of symbolic evaluadon are summarized. Several classes of program errors ...
Symbolic evaluation of lisp functions with side effects for verification
IJCAI'81: Proceedings of the 7th international joint conference on Artificial intelligence - Volume 1In this paper we present a symbolic evaluator of LISP functions. It can handle data-altering functions of the RPLACA type, i. e. functions that change one datastructure by replacing parts of it by other structures that will themselves not be changed ...
Fixing Code that Explodes Under Symbolic Evaluation
Verification, Model Checking, and Abstract InterpretationAbstractEffective symbolic evaluation is key to building scalable verification and synthesis tools based on SMT solving. These tools use symbolic evaluators to reduce the semantics of all paths through a finite program to logical constraints, discharged ...






Comments