skip to main content
research-article
Open Access

StreamJIT: a commensal compiler for high-performance stream programming

Authors Info & Claims
Published:15 October 2014Publication History
Skip Abstract Section

Abstract

There are many domain libraries, but despite the performance benefits of compilation, domain-specific languages are comparatively rare due to the high cost of implementing an optimizing compiler. We propose commensal compilation, a new strategy for compiling embedded domain-specific languages by reusing the massive investment in modern language virtual machine platforms. Commensal compilers use the host language's front-end, use host platform APIs that enable back-end optimizations by the host platform JIT, and use an autotuner for optimization selection. The cost of implementing a commensal compiler is only the cost of implementing the domain-specific optimizations. We demonstrate the concept by implementing a commensal compiler for the stream programming language StreamJIT atop the Java platform. Our compiler achieves performance 2.8 times better than the StreamIt native code (via GCC) compiler with considerably less implementation effort.

References

  1. Expression Trees. URL http://msdn.microsoft.com/en-us/library/bb397951%28v=vs.110%29.aspx.Google ScholarGoogle Scholar
  2. Emitting Dynamic Methods and Assemblies. URL http://msdn.microsoft.com/en-us/library/8ffc3x75%28v=vs.110%29.aspx.Google ScholarGoogle Scholar
  3. E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide. Society for Industrial and Applied Mathematics, Philadelphia, PA, third edition, 1999. ISBN 0-89871-447-8 (paperback). Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao, A. Edelman, and S. Amarasinghe. PetaBricks: A Language and Compiler for Algorithmic Choice. In Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '09, pages 38--49, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-392-1. doi: 10.1145/1542476.1542481. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. J. Ansel, M. Pacula, Y. L. Wong, C. Chan, M. Olszewski, U.-M. O'Reilly, and S. Amarasinghe. SiblingRivalry: Online Autotuning Through Local Competitions. In Proceedings of the 2012 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, CASES '12, pages 91--100, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1424-4. doi: 10.1145/2380403.2380425. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. Ansel, S. Kamil, K. Veeramachaneni, U.-M. O'Reilly, and S. Amarasinghe. OpenTuner: An Extensible Framework for Program Autotuning. 2013. URL http://hdl.handle.net/1721.1/81958.Google ScholarGoogle Scholar
  7. ASM. ASM bytecode transformation library. http://asm.ow2.org/.Google ScholarGoogle Scholar
  8. J. Auerbach, D. F. Bacon, P. Cheng, and R. Rabbah. Lime: A Java-compatible and Synthesizable Language for Heterogeneous Architectures. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA '10, pages 89--108, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0203-6. doi: 10.1145/1869459.1869469. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Auerbach, D. F. Bacon, I. Burcea, P. Cheng, S. J. Fink, R. Rabbah, and S. Shukla. A Compiler and Runtime for Heterogeneous Computing. In Proceedings of the 49th Annual Design Automation Conference, DAC '12, pages 271--276, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1199-1. doi: 10.1145/2228360.2228411. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. E. Axelsson, K. Claessen, G. Devai, Z. Horvath, K. Keijzer, B. Lyckegrd, A. Persson, M. Sheeran, J. Svenningsson, and A. Vajda. Feldspar: A domain specific language for digital signal processing algorithms. In Formal Methods and Models for Codesign (MEMOCODE), 2010 8th IEEE/ACM International Conference on, pages 169--178, July 2010. doi: 10.1109/MEMCOD.2010.5558637.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Bharadwaj. invokedynamic and Jython. In JVM Language Summit 2011, JVMLS 2011, 2011.Google ScholarGoogle Scholar
  12. K. Brown, A. Sujeeth, H. J. Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun. A Heterogeneous Parallel Framework for Domain-Specific Languages. In Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on, pages 89--100, Oct 2011. doi: 10.1109/PACT.2011.15. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. W. Dietl, S. Dietzel, M. D. Ernst, K. Muşlu, and T. Schiller. Building and using pluggable type-checkers. In ICSE'11, Proceedings of the 33rd International Conference on Software Engineering, pages 681--690, Waikiki, Hawaii, USA, May 25-27, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Frigo and S. Johnson. FFTW: an adaptive software architecture for the FFT. In Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, volume 3, pages 1381--1384 vol.3, May 1998. doi: 10.1109/ICASSP.1998.681704.Google ScholarGoogle Scholar
  15. M. Garcia. Runtime metaprogramming via java.lang.invoke.MethodHandle. May 2012. URL http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/2012Q2/RuntimeMP.pdf.Google ScholarGoogle Scholar
  16. M. Gordon. Compiler Techniques for Scalable Performance of Stream Programs on Multicore Architectures. PhD thesis, Massachusetts Institute of Technology, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. Heidinga. MethodHandle Implementation Tips and Tricks. In JVM Language Summit 2011, JVMLS 2011, 2011.Google ScholarGoogle Scholar
  18. ImageMagick. ImageMagick. URL http://www.imagemagick.org/.Google ScholarGoogle Scholar
  19. M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: Distributed Data-parallel Programs from Sequential Building Blocks. In Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, EuroSys '07, pages 59--72, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-636-3. doi: 10.1145/1272996.1273005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. Karczmarek. Constrained and Phased Scheduling of Synchronous Data Flow Graphs for StreamIt Language. PhD thesis, Massachusetts Institute of Technology, 2002.Google ScholarGoogle Scholar
  21. A. Kulkarni and R. R. Newton. Embrace, Defend, Extend: A Methodology for Embedding Preexisting DSLs. In Proceedings of the 1st Annual Workshop on Functional Programming Concepts in Domain-specific Languages, FPCDSL '13, pages 27--34, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2380-2. doi: 10.1145/2505351.2505355. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. E. Meijer, B. Beckman, and G. Bierman. LINQ: Reconciling Object, Relations and XML in the .NET Framework. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, SIGMOD '06, pages 706--706, New York, NY, USA, 2006. ACM. ISBN 1-59593-434-0. doi: 10.1145/1142473.1142552. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. C. Nutter. Adding invokedynamic Support to JRuby. In JVM Language Summit 2011, JVMLS 2011, 2011.Google ScholarGoogle Scholar
  24. F. Otto, V. Pankratius, and W. Tichy. High-level Multicore Programming with XJava. In Software Engineering - Companion Volume, 2009. ICSE-Companion 2009. 31st International Conference on, pages 319--322, May 2009. doi: 10.1109/ICSE-COMPANION.2009.5071011.Google ScholarGoogle ScholarCross RefCross Ref
  25. F. Otto, V. Pankratius, and W. Tichy. XJava: Exploiting Parallelism with Object-Oriented Stream Programming. In H. Sips, D. Epema, and H.-X. Lin, editors, Euro-Par 2009 Parallel Processing, volume 5704 of Lecture Notes in Computer Science, pages 875--886. Springer Berlin Heidelberg, 2009. ISBN 978-3-642-03868-6. doi: 10.1007/978-3-642-03869-3n_81. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. F. Otto, C. Schaefer, M. Dempe, and W. Tichy. A Language-Based Tuning Mechanism for Task and Pipeline Parallelism. In P. DAmbra, M. Guarracino, and D. Talia, editors, Euro-Par 2010 - Parallel Processing, volume 6272 of Lecture Notes in Computer Science, pages 328--340. Springer Berlin Heidelberg, 2010. ISBN 978-3-642-15290-0. doi: 10.1007/978-3-642-15291-7n_30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. J. Ponge and F. L. Mouël. JooFlux: Hijacking Java 7 InvokeDynamic To Support Live Code Modifications. CoRR, abs/1210.1039, 2012.Google ScholarGoogle Scholar
  28. J. Ragan-Kelley, C. Barnes, A. Adams, S. Paris, F. Durand, and S. Amarasinghe. Halide: A Language and Compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines. In ACM SIGPLAN Conference on Programming Language Design and Implementation, Seattle, WA, June 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. B. R. Rau and C. D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proceedings of the 14th Annual Workshop on Microprogramming, MICRO 14, pages 183--198, Piscataway, NJ, USA, 1981. IEEE Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. T. Rompf and M. Odersky. Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs. In Proceedings of the Ninth International Conference on Generative Programming and Component Engineering, GPCE '10, pages 127--136, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0154-1. doi: 10.1145/1868294.1868314. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. Rose. Bytecodes meet combinators: invokedynamic on the JVM. In Proceedings of the Third Workshop on Virtual Machines and Intermediate Languages, page 2. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. H. Spring, J. Privat, R. Guerraoui, and J. Vitek. Stream-Flex: High-throughput Stream Programming in Java. In Proceedings of the 22Nd Annual ACM SIGPLAN Conference on Object-oriented Programming Systems and Applications, OOPSLA '07, pages 211--228, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-786-5. doi: 10.1145/1297027.1297043. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. A. K. Sujeeth, A. Gibbons, K. J. Brown, H. Lee, T. Rompf, M. Odersky, and K. Olukotun. Forge: Generating a High Performance DSL Implementation from a Declarative Specification. In Proceedings of the 12th International Conference on Generative Programming: Concepts & Experiences, GPCE '13, pages 145--154, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2373-4. doi: 10.1145/2517208.2517220. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. C. Thalinger and J. Rose. Optimizing Invokedynamic. In Proceedings of the 8th International Conference on the Principles and Practice of Programming in Java, PPPJ '10, pages 1--9, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0269-2. doi: 10.1145/1852761.1852763. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. The Java Tutorials. Aggregate Operations. URL http://docs.oracle.com/javase/tutorial/collections/streams/index.html.Google ScholarGoogle Scholar
  36. The Java Tutorials. Type Annotations and Pluggable Type Systems. URL http://docs.oracle.com/javase/tutorial/java/annotations/type_annotations.html.Google ScholarGoogle Scholar
  37. W. Thies. Language and Compiler Support for Stream Programs. PhD thesis, Massachusetts Institute of Technology, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. W. Thies and S. Amarasinghe. An Empirical Characterization of Stream Programs and Its Implications for Language and Compiler Design. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT '10, pages 365--376, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0178-7. doi: 10.1145/1854273.1854319. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. A. Tiwari, C. Chen, J. Chame, M. Hall, and J. Hollingsworth. A Scalable Auto-tuning Framework for Compiler Optimization. In Parallel Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pages 1--12, May 2009. doi: 10.1109/IPDPS.2009.5161054. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. TypeToken. TypeToken. URL https://code.google.com/p/guava-libraries/wiki/ReflectionExplained#TypeToken.Google ScholarGoogle Scholar
  41. Z. Wang and M. F. P. O'boyle. Using Machine Learning to Partition Streaming Programs. ACM Trans. Archit. Code Optim., 10(3):20:1--20:25, Sept. 2008. ISSN 1544-3566. doi: 10.1145/2512436. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. R. C. Whaley and J. J. Dongarra. Automatically Tuned Linear Algebra Software. In Proceedings of the 1998 ACM/IEEE Conference on Supercomputing, Supercomputing '98, pages 1--27, Washington, DC, USA, 1998. IEEE Computer Society. ISBN 0-89791-984-X. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. C. Wimmer and T. Wüurthinger. Truffle: A Self-optimizing Runtime System. In Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity, SPLASH '12, pages 13--14, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1563-0. doi: 10.1145/2384716.2384723. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. T. Würthinger. Extending the Graal Compiler to Optimize Libraries. In Proceedings of the ACM International Conference Companion on Object Oriented Programming Systems Languages and Applications Companion, SPLASH '11, pages 41--42, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0942-4. doi: 10.1145/2048147.2048168. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. J. Xiong, J. Johnson, R. Johnson, and D. Padua. SPL: A Language and Compiler for DSP Algorithms. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation, PLDI '01, pages 298--308, New York, NY, USA, 2001. ACM. ISBN 1-58113-414-2. doi: 10.1145/378795.378860. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Y. Yu, M. Isard, D. Fetterly, M. Budiu, U. Erlingsson, P. K. Gunda, and J. Currey. DryadLINQ: A System for General-purpose Distributed Data-parallel Computing Using a High-level Language. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI'08, pages 1--14, Berkeley, CA, USA, 2008. USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster Computing with Working Sets. In Proceedings of the 2Nd USENIX Conference on Hot Topics in Cloud Computing, HotCloud'10, pages 10--10, Berkeley, CA, USA, 2010. USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. StreamJIT: a commensal compiler for high-performance stream programming

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image ACM SIGPLAN Notices
              ACM SIGPLAN Notices  Volume 49, Issue 10
              OOPSLA '14
              October 2014
              907 pages
              ISSN:0362-1340
              EISSN:1558-1160
              DOI:10.1145/2714064
              • Editor:
              • Andy Gill
              Issue’s Table of Contents
              • cover image ACM Conferences
                OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
                October 2014
                946 pages
                ISBN:9781450325851
                DOI:10.1145/2660193

              Copyright © 2014 Owner/Author

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 15 October 2014

              Check for updates

              Qualifiers

              • research-article

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader
            About Cookies On This Site

            We use cookies to ensure that we give you the best experience on our website.

            Learn more

            Got it!