skip to main content

Translating canonical SQL to imperative code in Coq

Published:29 April 2022Publication History
Skip Abstract Section

Abstract

SQL is by far the most widely used and implemented query language. Yet, on some key features, such as correlated queries and NULL value semantics, many implementations diverge or contain bugs. We leverage recent advances in the formalization of SQL and query compilers to develop DBCert, the first mechanically verified compiler from SQL queries written in a canonical form to imperative code. Building DBCert required several new contributions which are described in this paper. First, we specify and mechanize a complete translation from SQL to the Nested Relational Algebra which can be used for query optimization. Second, we define Imp, a small imperative language sufficient to express SQL and which can target several execution languages including JavaScript. Finally, we develop a mechanized translation from the nested relational algebra to Imp, using the nested relational calculus as an intermediate step.

References

  1. Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley.Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. 2022. AlaSQL JavaScript SQL Database Library. http://alasql.org.Google ScholarGoogle Scholar
  3. Joshua S. Auerbach, Martin Hirzel, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2017. Handling Environments in a Nested Relational Algebra with Combinators and an Implementation in a Verified Query Compiler. 1555–1569. isbn:978-1-4503-4197-4 https://doi.org/10.1145/3035918.3035961 Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Joshua S. Auerbach, Martin Hirzel, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2017. Q*cert: A Platform for Implementing and Verifying Query Compilers. 1703–1706. isbn:978-1-4503-4197-4 https://doi.org/10.1145/3035918.3056447 Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Véronique Benzaken and Évelyne Contejean. 2019. A Coq mechanised formal semantics for realistic SQL queries: formally reconciling SQL and bag relational algebra. In Proceedings of the 8th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP 2019, Cascais, Portugal, January 14-15, 2019. 249–261. https://doi.org/10.1145/3293880.3294107 Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Véronique Benzaken, Éveleyne Contejean, Mohammed Houssem Hachmaoui, Chantal Keller, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2022. Translating Canonical SQL to Imperative Code in Coq. https://doi.org/10.5281/zenodo.6366579 Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Véronique Benzaken, Évelyne Contejean, Mohammed Houssem Hachmaoui, Chantal Keller, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2022. Translating Canonical SQL to Imperative Code in Coq – extended. CoRR, abs/2203.08941 (2022), https://doi.org/10.48550/arXiv.2203.08941Google ScholarGoogle Scholar
  8. Dariusz Biernacki, Jean-Louis Colaço, Grégoire Hamon, and Marc Pouzet. 2008. Clock-directed modular code generation for synchronous data-flow languages. In LCTES. ACM, 121–130. https://doi.org/10.1145/1375657.1375674 Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Martin Bodin, Arthur Charguéraud, Daniele Filaretti, Philippa Gardner, Sergio Maffeis, Daiva Naudziuniene, Alan Schmitt, and Gareth Smith. 2014. A trusted mechanised JavaScript specification. In The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’14, San Diego, CA, USA, January 20-21, 2014. 87–100. https://doi.org/10.1145/2535838.2535876 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Bin Cao and Antonio Badia. 2007. SQL query optimization through nested relational algebra. ACM Trans. Database Syst., 32, 3 (2007), 18. https://doi.org/10.1145/1272743.1272748 Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Shumo Chu, Konstantin Weitz, Alvin Cheung, and Dan Suciu. 2017. HoTTSQL: Proving Query Rewrites with Univalent SQL Semantics. In PLDI 2017. ACM, New York, NY, USA. 510–524. https://doi.org/10.1145/3062341.3062348 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Jens Clauß en, Alfons Kemper, Guido Moerkotte, and Klaus Peithner. 1997. Optimizing Queries with Universal Quantification in Object-Oriented and Object-Relational Databases. In Conference on Very Large Data Bases (VLDB). 286–295.Google ScholarGoogle Scholar
  13. Jens Clauß en, Alfons Kemper, Guido Moerkotte, Klaus Peithner, and Michael Steinbrunn. 2000. Optimization and Evaluation of Disjunctive Queries. IEEE Trans. Knowl. Data Eng., 12, 2 (2000), 238–260. https://doi.org/10.1109/69.842265 Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Sophie Cluet and Guido Moerkotte. 1993. Nested Queries in Object Bases. In Database Programming Languages (DBPL-4), Manhattan, New York City, USA, 30 August - 1 September 1993. 226–242.Google ScholarGoogle Scholar
  15. Todd J. Green, Gregory Karvounarakis, and Val Tannen. 2007. Provenance semirings. In Proceedings of the Twenty-Sixth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 11-13, 2007, Beijing, China. 31–40. https://doi.org/10.1145/1265530.1265535 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Akshay Grover, Jay Gholap, Vandana P. Janeja, Yelena Yesha, Raghu Chintalapati, Harsh Marwaha, and Kunal Modi. 2015. SQL-like big data environments: Case study in clinical trial analytics. In 2015 IEEE International Conference on Big Data, Big Data 2015, Santa Clara, CA, USA, October 29 - November 1, 2015. IEEE Computer Society, 2680–2689. isbn:978-1-4799-9926-2 https://doi.org/10.1109/BigData.2015.7364068 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Paolo Guagliardo and Leonid Libkin. 2017. A Formal Semantics of SQL Queries, Its Validation, and Applications. PVLDB, 11, 1 (2017), 27–39. https://doi.org/10.14778/3151113.3151116 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. TU Jin-De. 2010. StreamSQL: A Query Language for Stream Data. Computer Systems & Applications, 26.Google ScholarGoogle Scholar
  19. Jacques-Henri Jourdan, François Pottier, and Xavier Leroy. 2012. Validating LR(1) Parsers. In Programming Languages and Systems - 21st European Symposium on Programming, ESOP 2012, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2012, Tallinn, Estonia, March 24 - April 1, 2012. Proceedings, Helmut Seidl (Ed.) (Lecture Notes in Computer Science, Vol. 7211). Springer, 397–416. isbn:978-3-642-28868-5 https://doi.org/10.1007/978-3-642-28869-2_20 Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Taewhi Lee, Moonyoung Chung, Sung-Soo Kim, Hyewon Song, and Jongho Won. 2016. Partial Materialization for Data Integration in SQL-on-Hadoop Engines. In 6th International Conference on IT Convergence and Security, ICITCS 2016, Prague, Czech Republic, September 26, 2016. IEEE Computer Society, 1–2. isbn:978-1-5090-3765-0 https://doi.org/10.1109/ICITCS.2016.7740361 Google ScholarGoogle ScholarCross RefCross Ref
  21. J. Gregory Malecha, Greg Morrisett, Avraham Shinnar, and Ryan Wisnesky. 2010. Toward a Verified Relational Database Management System. In ACM Int. Conf. POPL. https://doi.org/10.1145/1706299.1706329 Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Guido Moerkotte. 2020. Building Query Compilers. Univ. Mannheim. https://pi3.informatik.uni-mannheim.de/~moer/querycompiler.pdfGoogle ScholarGoogle Scholar
  23. 2017. Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, Semih Salihoglu, Wenchao Zhou, Rada Chirkova, Jun Yang, and Dan Suciu (Eds.). ACM. isbn:978-1-4503-4197-4 http://dl.acm.org/citation.cfm?id=3035918Google ScholarGoogle Scholar
  24. Amir Shaikhha, Yannis Klonatos, Lionel Parreaux, Lewis Brown, Mohammad Dashti, and Christoph Koch. 2016. How to Architect a Query Compiler. In SIGMOD Conference. ACM, 1907–1922. https://doi.org/10.1145/2882903.2915244 Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Avraham Shinnar, Jérôme Siméon, and Martin Hirzel. 2015. A Pattern Calculus for Rule Languages: Expressiveness, Compilation, and Mechanization. In 29th European Conference on Object-Oriented Programming, ECOOP 2015, July 5-10, 2015, Prague, Czech Republic. 542–567. https://doi.org/10.4230/LIPIcs.ECOOP.2015.542 Google ScholarGoogle ScholarCross RefCross Ref
  26. 2021. SQL Alchemy: The Python SQL Toolkit and Object Relational Mapper. https://www.sqlalchemy.org.Google ScholarGoogle Scholar
  27. 2022. SQLite compiled to JavaScript. https://sql.js.org/.Google ScholarGoogle Scholar
  28. Jeffrey D. Ullman. 1982. Principles of Database Systems, 2nd Edition. Computer Science Press. isbn:0-914894-36-6Google ScholarGoogle Scholar
  29. Jan Van den Bussche and Stijn Vansummeren. 2007. Polymorphic type inference for the named nested relational calculus. Transactions on Computational Logic (TOCL), 9, 1 (2007), https://doi.org/10.1145/1297658.1297661 Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Translating canonical SQL to imperative code in Coq

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Article Metrics

            • Downloads (Last 12 months)216
            • Downloads (Last 6 weeks)12

            Other Metrics

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader
          About Cookies On This Site

          We use cookies to ensure that we give you the best experience on our website.

          Learn more

          Got it!