research-article

Functional pearl: a SQL to C compiler in 500 lines of code

Published: 29 August 2015

Abstract

We present the design and implementation of a SQL query processor that outperforms existing database systems and is written in just about 500 lines of Scala code -- a convincing case study that high-level functional programming can handily beat C for systems-level programming where the last drop of performance matters. The key enabler is a shift in perspective towards generative programming. The core of the query engine is an interpreter for relational algebra operations, written in Scala. Using the open-source LMS Framework (Lightweight Modular Staging), we turn this interpreter into a query compiler with very low effort. To do so, we capitalize on an old and widely known result from partial evaluation known as Futamura projections, which state that a program that can specialize an interpreter to any given input program is equivalent to a compiler. In this pearl, we discuss LMS programming patterns such as mixed-stage data structures (e.g. data records with static schema and dynamic field components) and techniques to generate low-level C code, including specialized data structures and data loading primitives.
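The idea of turning an interpreter into a compiler can be illustrated with a minimal sketch. It is not the paper's code and does not use the LMS API: plain string emission stands in for LMS's staged expressions, and every name here (`Operator`, `Scan`, `Filter`, `Project`, `Print`, `Record`, `exec`, `generate`, and the C arrays such as `emp_name` and `emp_len`) is a hypothetical chosen for illustration. The traversal has the shape of an interpreter over relational algebra operators, but each operator emits C text instead of computing a result. `Record` is a mixed-stage structure in the sense of the abstract: its schema is known while generating code, while its fields are only C expressions to be evaluated later.

```scala
// Hypothetical miniature: string-based code generation, NOT the LMS API.
object StagedQuerySketch {
  // A tiny subset of relational algebra.
  sealed trait Operator
  case class Scan(table: String, schema: Vector[String]) extends Operator
  case class Filter(field: String, value: String, parent: Operator) extends Operator
  case class Project(fields: Vector[String], parent: Operator) extends Operator
  case class Print(parent: Operator) extends Operator

  // Mixed-stage record: static schema, dynamic fields (C expressions as text).
  case class Record(fields: Vector[String], schema: Vector[String]) {
    def apply(name: String): String = fields(schema.indexOf(name))
  }

  // Interpreter-shaped traversal. The callback k receives each record and the
  // current indentation, and emits the code that consumes that record.
  def exec(op: Operator, out: StringBuilder, ind: String)(k: (Record, String) => Unit): Unit =
    op match {
      case Scan(table, schema) =>
        // Assumes the C side provides arrays <table>_<field> and <table>_len.
        out.append(ind + "for (int i = 0; i < " + table + "_len; i++) {\n")
        k(Record(schema.map(f => table + "_" + f + "[i]"), schema), ind + "  ")
        out.append(ind + "}\n")
      case Filter(field, value, parent) =>
        exec(parent, out, ind) { (rec, in) =>
          // The field lookup is resolved now; only the comparison is emitted.
          out.append(in + "if (strcmp(" + rec(field) + ", \"" + value + "\") == 0) {\n")
          k(rec, in + "  ")
          out.append(in + "}\n")
        }
      case Project(fields, parent) =>
        // Projection is purely static: it emits no code at all.
        exec(parent, out, ind) { (rec, in) => k(Record(fields.map(rec(_)), fields), in) }
      case Print(parent) =>
        exec(parent, out, ind) { (rec, in) =>
          val fmt = rec.fields.map(_ => "%s").mkString(",")
          out.append(in + "printf(\"" + fmt + "\\n\", " + rec.fields.mkString(", ") + ");\n")
        }
    }

  // Specializes the traversal to one query plan and returns the C fragment.
  def generate(plan: Operator): String = {
    val out = new StringBuilder
    exec(plan, out, "")((_, _) => ())
    out.toString
  }
}
```

Calling `StagedQuerySketch.generate(Print(Project(Vector("name"), Filter("dept", "db", Scan("emp", Vector("name", "dept"))))))` yields a C fragment in which the filter is a plain `if` nested inside the scan loop and the projection has disappeared entirely, since it was resolved against the static schema during generation. In this sketch the separation between stages is only the difference between Scala values and emitted strings; a staging framework would track that distinction in the types.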



Published in

ACM SIGPLAN Notices, Volume 50, Issue 9 (ICFP '15)
September 2015, 436 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2858949
Editor: Andy Gill

ICFP 2015: Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming
August 2015, 436 pages
ISBN: 9781450336697
DOI: 10.1145/2784731

Copyright © 2015 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
