Abstract
Stream programming languages employ FIFO (first-in, first-out) semantics to model data channels between producers and consumers. A FIFO data channel stores tokens in a buffer that is accessed indirectly via read and write pointers. This indirect token access decouples a producer's write operations from the consumer's read operations, thereby making dataflow implicit. For a compiler, indirect token access obscures data dependencies, which renders standard optimizations ineffective and degrades stream program performance. In this paper, we propose a transformation for structured stream programming languages such as StreamIt that shifts FIFO buffer management from run-time to compile-time and eliminates splitters and joiners, whose task is to distribute and merge streams. To show the effectiveness of our lowering transformation, we have implemented a StreamIt-to-C compilation framework. We have developed our own intermediate representation (IR), called LaminarIR, which facilitates the transformation. We report on the enabling effect of LaminarIR on LLVM's optimizations, which required the conversion of several standard StreamIt benchmarks from static to randomized input to prevent the computation of partial results at compile-time. We conducted our experimental evaluation on the Intel i7-2600K, AMD Opteron 6378, Intel Xeon Phi 3120A, and ARM Cortex-A15 platforms. LaminarIR reduces data communication by 35.9% on average and achieves platform-specific speedups between 3.73x and 4.98x over StreamIt. We reduce memory accesses by more than 60% and achieve energy savings of up to 93.6% on the Intel i7-2600K.
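The contrast between run-time and compile-time queues can be illustrated with a minimal C sketch. It is not output of the LaminarIR compiler; the `Fifo` type, the two-token producer and consumer, and all function names are hypothetical and chosen only for illustration. The first variant routes tokens through a buffer indexed by read and write pointers. The second shows what the same steady-state iteration might look like once every token has been resolved to a named scalar, so that a compiler sees the data dependencies directly.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_CAP 8

/* Run-time FIFO channel (hypothetical): tokens are reached only
   through the read and write indices, i.e. indirectly. */
typedef struct {
    float  buf[FIFO_CAP];
    size_t rd;  /* read pointer: number of tokens consumed so far  */
    size_t wr;  /* write pointer: number of tokens produced so far */
} Fifo;

static void fifo_push(Fifo *q, float v) {
    assert(q->wr - q->rd < FIFO_CAP);   /* channel must not be full */
    q->buf[q->wr % FIFO_CAP] = v;
    q->wr++;
}

static float fifo_pop(Fifo *q) {
    assert(q->rd < q->wr);              /* channel must not be empty */
    float v = q->buf[q->rd % FIFO_CAP];
    q->rd++;
    return v;
}

/* Hypothetical producer: writes two tokens per firing. */
static void producer_fifo(Fifo *out, float x) {
    fifo_push(out, x);
    fifo_push(out, x + 1.0f);
}

/* Hypothetical consumer: reads two tokens per firing and sums them. */
static float consumer_fifo(Fifo *in) {
    float a = fifo_pop(in);
    float b = fifo_pop(in);
    return a + b;
}

/* One steady-state iteration with the queue managed at run-time. */
float steady_state_fifo(float x) {
    Fifo q = {{0}, 0, 0};
    producer_fifo(&q, x);
    return consumer_fifo(&q);
}

/* The same iteration with the queue resolved at compile-time:
   each token is a named scalar, and no buffer or pointer remains. */
float steady_state_direct(float x) {
    float t0 = x;         /* first token produced  */
    float t1 = x + 1.0f;  /* second token produced */
    return t0 + t1;       /* consumer reads t0 and t1 directly */
}
```

Both functions compute the same value for a given input. In the direct form the tokens are ordinary local variables, which is the kind of code on which standard scalar optimizations can operate without reasoning about buffer indices.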
LaminarIR: compile-time queues for structured streams
PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation