Abstract
SQL is by far the most widely used and implemented query language. Yet, on some key features, such as correlated queries and NULL value semantics, many implementations diverge or contain bugs. We leverage recent advances in the formalization of SQL and query compilers to develop DBCert, the first mechanically verified compiler from SQL queries written in a canonical form to imperative code. Building DBCert required several new contributions which are described in this paper. First, we specify and mechanize a complete translation from SQL to the Nested Relational Algebra which can be used for query optimization. Second, we define Imp, a small imperative language sufficient to express SQL and which can target several execution languages including JavaScript. Finally, we develop a mechanized translation from the nested relational algebra to Imp, using the nested relational calculus as an intermediate step.
- Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley.Google Scholar
Digital Library
- 2022. AlaSQL JavaScript SQL Database Library. http://alasql.org.Google Scholar
- Joshua S. Auerbach, Martin Hirzel, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2017. Handling Environments in a Nested Relational Algebra with Combinators and an Implementation in a Verified Query Compiler. 1555–1569. isbn:978-1-4503-4197-4 https://doi.org/10.1145/3035918.3035961 Google Scholar
Digital Library
- Joshua S. Auerbach, Martin Hirzel, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2017. Q*cert: A Platform for Implementing and Verifying Query Compilers. 1703–1706. isbn:978-1-4503-4197-4 https://doi.org/10.1145/3035918.3056447 Google Scholar
Digital Library
- Véronique Benzaken and Évelyne Contejean. 2019. A Coq mechanised formal semantics for realistic SQL queries: formally reconciling SQL and bag relational algebra. In Proceedings of the 8th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP 2019, Cascais, Portugal, January 14-15, 2019. 249–261. https://doi.org/10.1145/3293880.3294107 Google Scholar
Digital Library
- Véronique Benzaken, Éveleyne Contejean, Mohammed Houssem Hachmaoui, Chantal Keller, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2022. Translating Canonical SQL to Imperative Code in Coq. https://doi.org/10.5281/zenodo.6366579 Google Scholar
Digital Library
- Véronique Benzaken, Évelyne Contejean, Mohammed Houssem Hachmaoui, Chantal Keller, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2022. Translating Canonical SQL to Imperative Code in Coq – extended. CoRR, abs/2203.08941 (2022), https://doi.org/10.48550/arXiv.2203.08941Google Scholar
- Dariusz Biernacki, Jean-Louis Colaço, Grégoire Hamon, and Marc Pouzet. 2008. Clock-directed modular code generation for synchronous data-flow languages. In LCTES. ACM, 121–130. https://doi.org/10.1145/1375657.1375674 Google Scholar
Digital Library
- Martin Bodin, Arthur Charguéraud, Daniele Filaretti, Philippa Gardner, Sergio Maffeis, Daiva Naudziuniene, Alan Schmitt, and Gareth Smith. 2014. A trusted mechanised JavaScript specification. In The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’14, San Diego, CA, USA, January 20-21, 2014. 87–100. https://doi.org/10.1145/2535838.2535876 Google Scholar
Digital Library
- Bin Cao and Antonio Badia. 2007. SQL query optimization through nested relational algebra. ACM Trans. Database Syst., 32, 3 (2007), 18. https://doi.org/10.1145/1272743.1272748 Google Scholar
Digital Library
- Shumo Chu, Konstantin Weitz, Alvin Cheung, and Dan Suciu. 2017. HoTTSQL: Proving Query Rewrites with Univalent SQL Semantics. In PLDI 2017. ACM, New York, NY, USA. 510–524. https://doi.org/10.1145/3062341.3062348 Google Scholar
Digital Library
- Jens Clauß en, Alfons Kemper, Guido Moerkotte, and Klaus Peithner. 1997. Optimizing Queries with Universal Quantification in Object-Oriented and Object-Relational Databases. In Conference on Very Large Data Bases (VLDB). 286–295.Google Scholar
- Jens Clauß en, Alfons Kemper, Guido Moerkotte, Klaus Peithner, and Michael Steinbrunn. 2000. Optimization and Evaluation of Disjunctive Queries. IEEE Trans. Knowl. Data Eng., 12, 2 (2000), 238–260. https://doi.org/10.1109/69.842265 Google Scholar
Digital Library
- Sophie Cluet and Guido Moerkotte. 1993. Nested Queries in Object Bases. In Database Programming Languages (DBPL-4), Manhattan, New York City, USA, 30 August - 1 September 1993. 226–242.Google Scholar
- Todd J. Green, Gregory Karvounarakis, and Val Tannen. 2007. Provenance semirings. In Proceedings of the Twenty-Sixth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 11-13, 2007, Beijing, China. 31–40. https://doi.org/10.1145/1265530.1265535 Google Scholar
Digital Library
- Akshay Grover, Jay Gholap, Vandana P. Janeja, Yelena Yesha, Raghu Chintalapati, Harsh Marwaha, and Kunal Modi. 2015. SQL-like big data environments: Case study in clinical trial analytics. In 2015 IEEE International Conference on Big Data, Big Data 2015, Santa Clara, CA, USA, October 29 - November 1, 2015. IEEE Computer Society, 2680–2689. isbn:978-1-4799-9926-2 https://doi.org/10.1109/BigData.2015.7364068 Google Scholar
Digital Library
- Paolo Guagliardo and Leonid Libkin. 2017. A Formal Semantics of SQL Queries, Its Validation, and Applications. PVLDB, 11, 1 (2017), 27–39. https://doi.org/10.14778/3151113.3151116 Google Scholar
Digital Library
- TU Jin-De. 2010. StreamSQL: A Query Language for Stream Data. Computer Systems & Applications, 26.Google Scholar
- Jacques-Henri Jourdan, François Pottier, and Xavier Leroy. 2012. Validating LR(1) Parsers. In Programming Languages and Systems - 21st European Symposium on Programming, ESOP 2012, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2012, Tallinn, Estonia, March 24 - April 1, 2012. Proceedings, Helmut Seidl (Ed.) (Lecture Notes in Computer Science, Vol. 7211). Springer, 397–416. isbn:978-3-642-28868-5 https://doi.org/10.1007/978-3-642-28869-2_20 Google Scholar
Digital Library
- Taewhi Lee, Moonyoung Chung, Sung-Soo Kim, Hyewon Song, and Jongho Won. 2016. Partial Materialization for Data Integration in SQL-on-Hadoop Engines. In 6th International Conference on IT Convergence and Security, ICITCS 2016, Prague, Czech Republic, September 26, 2016. IEEE Computer Society, 1–2. isbn:978-1-5090-3765-0 https://doi.org/10.1109/ICITCS.2016.7740361 Google Scholar
Cross Ref
- J. Gregory Malecha, Greg Morrisett, Avraham Shinnar, and Ryan Wisnesky. 2010. Toward a Verified Relational Database Management System. In ACM Int. Conf. POPL. https://doi.org/10.1145/1706299.1706329 Google Scholar
Digital Library
- Guido Moerkotte. 2020. Building Query Compilers. Univ. Mannheim. https://pi3.informatik.uni-mannheim.de/~moer/querycompiler.pdfGoogle Scholar
- 2017. Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, Semih Salihoglu, Wenchao Zhou, Rada Chirkova, Jun Yang, and Dan Suciu (Eds.). ACM. isbn:978-1-4503-4197-4 http://dl.acm.org/citation.cfm?id=3035918Google Scholar
- Amir Shaikhha, Yannis Klonatos, Lionel Parreaux, Lewis Brown, Mohammad Dashti, and Christoph Koch. 2016. How to Architect a Query Compiler. In SIGMOD Conference. ACM, 1907–1922. https://doi.org/10.1145/2882903.2915244 Google Scholar
Digital Library
- Avraham Shinnar, Jérôme Siméon, and Martin Hirzel. 2015. A Pattern Calculus for Rule Languages: Expressiveness, Compilation, and Mechanization. In 29th European Conference on Object-Oriented Programming, ECOOP 2015, July 5-10, 2015, Prague, Czech Republic. 542–567. https://doi.org/10.4230/LIPIcs.ECOOP.2015.542 Google Scholar
Cross Ref
- 2021. SQL Alchemy: The Python SQL Toolkit and Object Relational Mapper. https://www.sqlalchemy.org.Google Scholar
- 2022. SQLite compiled to JavaScript. https://sql.js.org/.Google Scholar
- Jeffrey D. Ullman. 1982. Principles of Database Systems, 2nd Edition. Computer Science Press. isbn:0-914894-36-6Google Scholar
- Jan Van den Bussche and Stijn Vansummeren. 2007. Polymorphic type inference for the named nested relational calculus. Transactions on Computational Logic (TOCL), 9, 1 (2007), https://doi.org/10.1145/1297658.1297661 Google Scholar
Digital Library
Index Terms
Translating canonical SQL to imperative code in Coq
Recommendations
A Coq mechanised formal semantics for realistic SQL queries: formally reconciling SQL and bag relational algebra
CPP 2019: Proceedings of the 8th ACM SIGPLAN International Conference on Certified Programs and ProofsIn this article, we provide a Coq mechanised, executable, formal semantics for a realistic fragment of SQL consisting of "select [distinct] from where group by having" queries with null values, functions, aggregates, quantifiers and nested potentially ...
Translating SQL Into Relational Algebra: Optimization, Semantics, and Equivalence of SQL Queries
In this paper, we present a translator from a relevant subset of SQL into relational algebra. The translation is syntax-directed, with translation rules associated with grammar productions; each production corresponds to a particular type of SQL ...
Comparing NoSQL MongoDB to an SQL DB
ACMSE '13: Proceedings of the 51st ACM Southeast ConferenceNoSQL database solutions are becoming more and more prevalent in a world currently dominated by SQL relational databases. NoSQL databases were designed to provide database solutions for large volumes of data that is not structured. However, the ...






Comments