research-article
Open Access

Verifying equivalence of database-driven applications

Published: 27 December 2017

Abstract

This paper addresses the problem of verifying equivalence between a pair of programs that operate over databases with different schemas. This problem is particularly important in the context of web applications, which typically undergo database refactoring either for performance or maintainability reasons. While web applications should have the same externally observable behavior before and after schema migration, there are no existing tools for proving equivalence of such programs. This paper takes a first step towards solving this problem by formalizing the equivalence and refinement checking problems for database-driven applications. We also propose a proof methodology based on the notion of bisimulation invariants over relational algebra with updates and describe a technique for synthesizing such bisimulation invariants. We have implemented the proposed technique in a tool called Mediator for verifying equivalence between database-driven applications written in our intermediate language and evaluate our tool on 21 benchmarks extracted from textbooks and real-world web applications. Our results show that the proposed methodology can successfully verify 20 of these benchmarks.
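
To make the setting concrete, the following is a minimal sketch of two versions of a toy application over different schemas, together with a candidate invariant relating their database states. It is not the Mediator tool or the paper's intermediate language. The class names, the transactions (`add_user`, `delete_user`, `get_address`), and the invariant are all hypothetical, and the check is bounded random testing, which can only find counterexamples and never proves equivalence.

```python
import random


class SingleTableApp:
    """Hypothetical 'before' version: one table mapping id -> (name, address)."""

    def __init__(self):
        self.users = {}

    def add_user(self, uid, name, address):
        self.users[uid] = (name, address)

    def delete_user(self, uid):
        self.users.pop(uid, None)

    def get_address(self, uid):
        row = self.users.get(uid)
        return None if row is None else row[1]


class SplitTableApp:
    """Hypothetical 'after' version: the address column lives in its own table."""

    def __init__(self):
        self.names = {}
        self.addresses = {}

    def add_user(self, uid, name, address):
        self.names[uid] = name
        self.addresses[uid] = address

    def delete_user(self, uid):
        self.names.pop(uid, None)
        self.addresses.pop(uid, None)

    def get_address(self, uid):
        return self.addresses.get(uid)


def invariant_holds(old, new):
    """Candidate invariant (an assumption of this sketch): the old table
    equals the join of the two new tables on the id column."""
    joined = {
        uid: (new.names[uid], new.addresses[uid])
        for uid in new.names
        if uid in new.addresses
    }
    return old.users == joined


def check_bounded(trials=200, steps=20, seed=0):
    """Run the same random update sequence on both versions. After every
    update, check the invariant and compare all query results."""
    rng = random.Random(seed)
    for _ in range(trials):
        old, new = SingleTableApp(), SplitTableApp()
        for _ in range(steps):
            uid = rng.randrange(5)
            if rng.random() < 0.6:
                args = (uid, f"name{rng.randrange(3)}", f"addr{rng.randrange(3)}")
                old.add_user(*args)
                new.add_user(*args)
            else:
                old.delete_user(uid)
                new.delete_user(uid)
            if not invariant_holds(old, new):
                return False
            if any(old.get_address(q) != new.get_address(q) for q in range(5)):
                return False
    return True
```

Calling `check_bounded()` exercises both versions on the same inputs. A verifier in the spirit of the abstract would instead aim to establish, for all database states, that such an invariant is preserved by every update and implies equal query results.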

Supplemental Material

verifyingequivalence.webm
