Abstract
This paper addresses the problem of verifying equivalence between a pair of programs that operate over databases with different schemas. This problem is particularly important in the context of web applications, which typically undergo database refactoring for either performance or maintainability reasons. While web applications should have the same externally observable behavior before and after schema migration, there are no existing tools for proving equivalence of such programs. This paper takes a first step towards solving this problem by formalizing the equivalence and refinement checking problems for database-driven applications. We also propose a proof methodology based on the notion of bisimulation invariants over relational algebra with updates and describe a technique for synthesizing such bisimulation invariants. We have implemented the proposed technique in a tool called Mediator, which verifies equivalence between database-driven applications written in our intermediate language, and we evaluate the tool on 21 benchmarks extracted from textbooks and real-world web applications. Our results show that the proposed methodology can successfully verify 20 of these benchmarks.
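To make the idea of a bisimulation invariant concrete, here is a minimal Python sketch. It is not Mediator and does not use the paper's intermediate language. The schemas, the class names `OldApp` and `NewApp`, and the helpers `invariant_holds` and `check` are all hypothetical. The sketch only tests a candidate invariant along one concrete sequence of updates, whereas the approach described in the abstract proves that such an invariant holds for all executions.

```python
# Hypothetical illustration: two versions of a tiny application over different schemas.
# Old schema: one table   Employee(id, name, dept_name)
# New schema: two tables  Employee(id, name, dept_id) and Dept(dept_id, dept_name)


class OldApp:
    def __init__(self):
        self.employee = set()  # rows: (id, name, dept_name)

    def add_employee(self, emp_id, name, dept_name):
        self.employee.add((emp_id, name, dept_name))

    def names_in_dept(self, dept_name):
        return sorted(n for (_, n, d) in self.employee if d == dept_name)


class NewApp:
    def __init__(self):
        self.employee = set()  # rows: (id, name, dept_id)
        self.dept = {}         # Dept table stored as dept_name -> dept_id

    def add_employee(self, emp_id, name, dept_name):
        # Reuse the department id if it exists, otherwise allocate a fresh one.
        dept_id = self.dept.setdefault(dept_name, len(self.dept) + 1)
        self.employee.add((emp_id, name, dept_id))

    def names_in_dept(self, dept_name):
        dept_id = self.dept.get(dept_name)  # None if the department is unknown
        return sorted(n for (_, n, d) in self.employee if d == dept_id)


def invariant_holds(old, new):
    """Candidate invariant: the old Employee table equals the join of the
    new Employee and Dept tables, projected onto (id, name, dept_name)."""
    id_to_name = {dept_id: name for name, dept_id in new.dept.items()}
    joined = {(i, n, id_to_name[d]) for (i, n, d) in new.employee}
    return old.employee == joined


def check(updates, queries):
    """Run the same updates on both versions; after each step, test the
    invariant and compare the externally observable query results."""
    old, new = OldApp(), NewApp()
    if not invariant_holds(old, new):  # initial (empty) databases
        return False
    for args in updates:
        old.add_employee(*args)
        new.add_employee(*args)
        if not invariant_holds(old, new):
            return False
        if any(old.names_in_dept(q) != new.names_in_dept(q) for q in queries):
            return False
    return True
```

In this sketch the invariant relates the two database states, and the query comparison plays the role of externally observable behavior. A passing run of `check` is evidence for one trace only and is not a proof of equivalence.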