skip to main content
research-article

LaminarIR: compile-time queues for structured streams

Published:03 June 2015Publication History
Skip Abstract Section

Abstract

Stream programming languages employ FIFO (first-in, first-out) semantics to model data channels between producers and consumers. A FIFO data channel stores tokens in a buffer that is accessed indirectly via read- and write-pointers. This indirect token-access decouples a producer’s write-operations from the read-operations of the consumer, thereby making dataflow implicit. For a compiler, indirect token-access obscures data-dependencies, which renders standard optimizations ineffective and impacts stream program performance negatively. In this paper we propose a transformation for structured stream programming languages such as StreamIt that shifts FIFO buffer management from run-time to compile-time and eliminates splitters and joiners, whose task is to distribute and merge streams. To show the effectiveness of our lowering transformation, we have implemented a StreamIt to C compilation framework. We have developed our own intermediate representation (IR) called LaminarIR, which facilitates the transformation. We report on the enabling effect of the LaminarIR on LLVM’s optimizations, which required the conversion of several standard StreamIt benchmarks from static to randomized input, to prevent computation of partial results at compile-time. We conducted our experimental evaluation on the Intel i7-2600K, AMD Opteron 6378, Intel Xeon Phi 3120A and ARM Cortex-A15 platforms. Our LaminarIR reduces data-communication on average by 35.9% and achieves platform-specific speedups between 3.73x and 4.98x over StreamIt. We reduce memory accesses by more than 60% and achieve energy savings of up to 93.6% on the Intel i7-2600K.

References

  1. LaminarIR website. http://LaminarIR.github.io.Google ScholarGoogle Scholar
  2. D. J. Abadi, Y. Ahmad, M. Balazinska, U. Cetintemel, M. Cherniack, J.-H. Hwang, W. Lindner, A. S. Maskey, A. Rasin, E. Ryvkina, N. Tatbul, Y. Xing, and S. Zdonik. The design of the Borealis stream processing engine. In Second Biennial Conference on Innovative Data Systems Research, CIDR ’05, pages 277–289, Asilomar, CA, 2005.Google ScholarGoogle Scholar
  3. B. Alpern, M. N. Wegman, and F. K. Zadeck. Detecting equality of variables in programs. In Proceedings of the 15th ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, POPL ’88, pages 1–11, New York, NY, USA, 1988. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A. Arasu, S. Babu, and J. Widom. The CQL continuous query language: Semantic foundations and query execution. The VLDB Journal, 15(2):121–142, June 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. J. Auerbach, D. F. Bacon, P. Cheng, and R. Rabbah. Lime: A Java-compatible and synthesizable language for heterogeneous architectures. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA ’10, pages 89–108, New York, NY, USA, 2010. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. S. S. Battacharyya, E. A. Lee, and P. K. Murthy. Software Synthesis from Dataflow Graphs. Kluwer Academic Publishers, Norwell, MA, USA, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. S. Bhattacharyya, J. T. Buck, S. Ha, and E. A. Lee. Generating compact code from dataflow specifications of multirate signal processing algorithms. IEEE Trans. on Circuits and Systems — I: Fundamental Theory and Applications, 42:138–150, March 1995.Google ScholarGoogle ScholarCross RefCross Ref
  8. J. C. Bier, E. E. Goei, W. H. Ho, P. D. Lapsley, M. P. O’Reilly, G. C. Sih, and E. A. Lee. Gabriel: A design environment for DSP. IEEE Micro, 10(5):28–45, Sept. 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Bosboom, S. Rajadurai, W.-F. Wong, and S. Amarasinghe. StreamJIT: A commensal compiler for high-performance stream programming. In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA ’’14, pages 177–195, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. S. Browne, J. Dongarra, N. Garner, G. Ho, and P. Mucci. A portable programming interface for performance evaluation on modern processors. Int. J. High Perform. Comput. Appl., 14(3):189–204, Aug. 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan. Brook for GPUs: Stream computing on graphics hardware. In ACM SIGGRAPH 2004, SIGGRAPH ’04, pages 777– 786, New York, NY, USA, 2004. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. M. K. Chen, X. F. Li, R. Lian, J. H. Lin, L. Liu, T. Liu, and R. Ju. Shangri-La: Achieving high performance from compiled network applications while enabling ease of programming. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’05, pages 224–236, New York, NY, USA, 2005. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. S. M. Farhad, Y. Ko, B. Burgstaller, and B. Scholz. Orchestration by approximation: Mapping stream programs onto multicore architectures. In Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVI, pages 357–368, New York, NY, USA, 2011. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. I. Gordon. Compiler techniques for scalable performance of stream programs on multicore architectures. PhD thesis, Cambridge, MA, USA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. H. Hormati, Y. Choi, M. Kudlur, R. Rabbah, T. Mudge, and S. Mahlke. Flextream: Adaptive compilation of streaming applications for heterogeneous architectures. In Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques, PACT ’09, pages 214–223, Washington, DC, USA, 2009. IEEE Computer Society. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. Kudlur and S. Mahlke. Orchestrating the execution of stream programs on multicore platforms. In Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’08, pages 114–124, New York, NY, USA, 2008. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. E. A. Lee and D. G. Messerschmitt. Static scheduling of synchronous data flow programs for digital signal processing. IEEE Trans. Comput., 36(1):24–35, Jan. 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. W. R. Mark, R. S. Glanville, K. Akeley, and M. J. Kilgard. Cg: A system for programming graphics hardware in a C-like language. In ACM SIGGRAPH 2003, SIGGRAPH ’03, pages 896–907, New York, NY, USA, 2003. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. C. Min and Y. I. Eom. DANBI: Dynamic scheduling of irregular stream programs for many-core systems. In Proceedings of the 22Nd International Conference on Parallel Architectures and Compilation Techniques, PACT ’13, pages 189–200, Piscataway, NJ, USA, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. IEEE Press.Google ScholarGoogle Scholar
  21. J. Sermulins, W. Thies, R. Rabbah, and S. Amarasinghe. Cache aware optimization of stream programs. In Proceedings of the 2005 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, LCTES ’05, pages 115–126, New York, NY, USA, 2005. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. R. Soulé, M. I. Gordon, S. Amarasinghe, R. Grimm, and M. Hirzel. Dynamic expressivity with static optimization for streaming languages. In Proceedings of the 7th ACM International Conference on Distributed Event-based Systems, DEBS ’13, pages 159–170, New York, NY, USA, 2013. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. J. H. Spring, J. Privat, R. Guerraoui, and J. Vitek. StreamFlex: Highthroughput stream programming in Java. pages 211–228, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. W. Thies and S. Amarasinghe. An empirical characterization of stream programs and its implications for language and compiler design. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT ’10, pages 365–376, New York, NY, USA, 2010. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. W. Thies, M. Karczmarek, and S. P. Amarasinghe. StreamIt: A language for streaming applications. In Proceedings of the 11th International Conference on Compiler Construction, CC ’02, pages 179–196, London, UK, 2002. Springer-Verlag. Introduction Motivating Example LaminarIR Local Direct Access Transformation Global Direct Access Transformation Background and Notation Concrete SDF Semantics Auxiliary Semantics Experimental Results Performance Communication Elimination LLVM Optimization Statistics Related Work Conclusion Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. LaminarIR: compile-time queues for structured streams

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader
    About Cookies On This Site

    We use cookies to ensure that we give you the best experience on our website.

    Learn more

    Got it!