Abstract
Satisfiability modulo theories (SMT) solving has become a critical part of many static analyses, including symbolic execution, refinement type checking, and model checking. We propose Formulog, a domain-specific language that makes it possible to write a range of SMT-based static analyses in a way that is both close to their formal specifications and amenable to high-level optimizations and efficient evaluation.
Formulog extends the logic programming language Datalog with a first-order functional language and mechanisms for representing and reasoning about SMT formulas; a novel type system supports the construction of expressive formulas, while ensuring that neither normal evaluation nor SMT solving goes wrong. Our case studies demonstrate that a range of SMT-based analyses can naturally and concisely be encoded in Formulog, and that, thanks to this encoding, high-level Datalog-style optimizations can be automatically and advantageously applied to these analyses.
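To make the combination of Datalog-style rules and satisfiability checks concrete, here is a minimal Python sketch of the general idea. It is not Formulog syntax and not the authors' implementation. The control-flow graph, the `is_sat` helper, and the `reachable_nodes` function are all hypothetical names introduced for illustration, and the brute-force search over a small integer domain is only a stand-in for a real SMT solver.

```python
# Control-flow edges labelled with a guard over one integer variable x.
# Each guard is a (description, predicate) pair. The graph is illustrative
# and acyclic, so the fixpoint loop below is guaranteed to terminate.
EDGES = [
    ("entry", "a", ("x > 0", lambda x: x > 0)),
    ("entry", "b", ("x <= 0", lambda x: x <= 0)),
    ("a", "bug", ("x < 0", lambda x: x < 0)),
    ("b", "ok", ("x == 0", lambda x: x == 0)),
]


def is_sat(guards, domain=range(-8, 9)):
    """Toy stand-in for an SMT query: brute-force search over a small domain."""
    return any(all(pred(x) for _, pred in guards) for x in domain)


def reachable_nodes(edges, start="entry"):
    """Naive bottom-up evaluation of the single rule

        reach(dst, pc ++ [g]) :- reach(src, pc), edge(src, dst, g),
                                 is_sat(pc ++ [g]).

    A fact is a (node, path_condition) pair; the rule fires only when the
    extended path condition is satisfiable.
    """
    facts = {(start, ())}
    while True:
        derived = {
            (dst, pc + (guard,))
            for (node, pc) in facts
            for (src, dst, guard) in edges
            if src == node and is_sat(pc + (guard,))
        }
        if derived <= facts:  # nothing new: fixpoint reached
            return {node for node, _ in facts}
        facts |= derived


if __name__ == "__main__":
    print(sorted(reachable_nodes(EDGES)))
```

In this sketch the node `bug` is never derived, because its path condition (`x > 0` together with `x < 0`) is unsatisfiable. Writing the analysis as a rule over facts is what leaves room for Datalog-style optimizations of the kind the abstract mentions; the naive loop here applies none of them.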
Index Terms
Formulog: Datalog for SMT-based static analysis