Abstract
We propose Generative Type-Aware Mutation, an effective approach for testing SMT solvers. The key idea is to realize generation through the mutation of expressions rooted with parametric operators from the SMT-LIB specification. Generative Type-Aware Mutation is a hybrid of mutation-based and grammar-based fuzzing and features an infinite mutation space, overcoming a major limitation of OpFuzz, the state-of-the-art fuzzer for SMT solvers. We have realized Generative Type-Aware Mutation in a practical SMT solver bug-hunting tool, TypeFuzz. During our testing period with TypeFuzz, we reported over 237 bugs in the state-of-the-art SMT solvers Z3 and CVC4. Among these, 189 bugs were confirmed and 176 were fixed. Most notably, we found 18 soundness bugs in CVC4’s default mode alone, 7 of which had been latent for two years. CVC4 has proven to be a very stable SMT solver and has resisted several fuzzing campaigns.
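One possible reading of the key idea can be sketched in Python. This is a minimal illustration under stated assumptions, not TypeFuzz's implementation: the operator table `OPERATORS`, the variable set `VARIABLES`, the tuple encoding of formulas, and every function name below are hypothetical. The sketch picks a subterm of the seed, chooses an operator whose result type matches that subterm, builds a new expression rooted with the operator from existing subterms of fitting types, and substitutes it back. Because each mutant can serve as the next seed, the space of reachable formulas is unbounded.

```python
import random

# Hypothetical operator table: name -> (argument types, result type).
# A real tool would derive such signatures from the SMT-LIB theories.
OPERATORS = {
    "+": (("Int", "Int"), "Int"),
    "*": (("Int", "Int"), "Int"),
    "ite": (("Bool", "Int", "Int"), "Int"),
    "<": (("Int", "Int"), "Bool"),
    "and": (("Bool", "Bool"), "Bool"),
    "not": (("Bool",), "Bool"),
}

# Hypothetical declared variables and their sorts.
VARIABLES = {"x": "Int", "y": "Int", "p": "Bool"}


def type_of(expr):
    """Return the sort of a variable (str) or an application (tuple)."""
    if isinstance(expr, str):
        return VARIABLES[expr]
    return OPERATORS[expr[0]][1]


def subterms(expr, path=()):
    """Yield (path, subterm) pairs; a path is a tuple of child indices."""
    yield path, expr
    if isinstance(expr, tuple):
        for i, arg in enumerate(expr[1:], start=1):
            yield from subterms(arg, path + (i,))


def replace_at(expr, path, new):
    """Return a copy of expr with the subterm at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return expr[:i] + (replace_at(expr[i], path[1:], new),) + expr[i + 1:]


def well_typed(expr):
    """Check that every application matches its operator signature."""
    if isinstance(expr, str):
        return expr in VARIABLES
    arg_types, _ = OPERATORS[expr[0]]
    args = expr[1:]
    return len(args) == len(arg_types) and all(
        well_typed(a) and type_of(a) == t for a, t in zip(args, arg_types)
    )


def mutate(seed, rng):
    """Replace one subterm by a freshly generated, same-typed expression."""
    located = list(subterms(seed))
    path, target = rng.choice(located)
    wanted = type_of(target)
    # Operators whose result type conforms to the replaced subterm.
    op = rng.choice([o for o, (_, res) in OPERATORS.items() if res == wanted])
    # Fill argument slots from the seed's subterms plus the variables,
    # so that every sort always has at least one candidate.
    pool = [term for _, term in located] + list(VARIABLES)
    args = tuple(
        rng.choice([t for t in pool if type_of(t) == sort])
        for sort in OPERATORS[op][0]
    )
    return replace_at(seed, path, (op,) + args)


def to_smtlib(expr):
    """Render the tuple encoding as an SMT-LIB style s-expression."""
    if isinstance(expr, str):
        return expr
    return "(" + " ".join([expr[0]] + [to_smtlib(a) for a in expr[1:]]) + ")"


if __name__ == "__main__":
    rng = random.Random(0)
    formula = ("<", ("+", "x", "y"), "x")
    for _ in range(3):
        formula = mutate(formula, rng)
        print(to_smtlib(formula))
```

In this sketch the mutant always keeps the sort of the seed, since the replacement has the same type as the subterm it displaces. A differential or oracle-based check against the solvers under test would sit on top of this step and is not shown here.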
Index Terms
Generative type-aware mutation for testing SMT solvers