Abstract
Many application areas for embedded systems, such as DSP, media coding, and image processing, are based on stream processing. Stream programs in these areas are often naturally described as graphs, where nodes are computational kernels that send data over the edges. This structure also exposes large amounts of concurrency, because the kernels can execute independently as long as there are data to process on the edges. The explicit data dependencies also help in building efficient sequential implementations of such programs, which makes programs more portable between platforms with various degrees of parallelism.
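As a rough illustration of this graph structure, the following minimal sketch runs a two-kernel pipeline sequentially: a kernel fires whenever every one of its input edges holds a token. The names `run_sequential`, `scale`, and `offset`, and the dictionary layout, are hypothetical choices for this example and are not taken from the framework.

```python
from collections import deque

def run_sequential(kernels, channels):
    """Fire kernels one at a time until no kernel has data on all its inputs.

    Assumes every kernel has at least one input edge; a kernel without
    inputs would always be ready and this loop would never terminate.
    """
    progress = True
    while progress:
        progress = False
        for inputs, outputs, fn in kernels.values():
            if all(channels[c] for c in inputs):
                # Consume one token per input edge, produce one per output edge.
                args = [channels[c].popleft() for c in inputs]
                for c, value in zip(outputs, fn(*args)):
                    channels[c].append(value)
                progress = True

# Edges of the graph, modeled as FIFO queues (hypothetical example data).
channels = {"src": deque([1, 2, 3]), "scaled": deque(), "out": deque()}

# Nodes of the graph: (input edges, output edges, kernel function).
kernels = {
    "scale": (["src"], ["scaled"], lambda x: (x * 2,)),
    "offset": (["scaled"], ["out"], lambda x: (x + 1,)),
}

run_sequential(kernels, channels)
print(list(channels["out"]))
```

Because each kernel only touches its own edges, the same graph could in principle be run with the two kernels on different processors; the sequential loop is just one possible schedule.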
The kernels can be expressed in many different ways; for example, as imperative programs with read and write statements for the communication, or as a set of actions that can be performed together with conditions for when these actions can be executed. Traditionally, there has been a tension between how the kernels are expressed and how efficiently they can be implemented. Very efficient implementation techniques exist for stream programs with restricted expressiveness, such as synchronous dataflow programs.
In this article, we present a framework for building stream program compilers that we call Tÿcho. At the core of this framework is a common kernel representation, based on a machine model for stream program kernels called the actor machine, on which transformations and optimizations are performed. Both imperative and action-based kernels are translated to this common representation, making the same optimizations applicable to different kinds of kernels, and even across source language boundaries. An actor machine is described by the steps of execution that a kernel can take and the conditions for taking them, together with a controller that decides how the conditions are tested and the steps are taken.
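The three ingredients named above (steps, conditions, and a controller) can be sketched as a small interpreter. This is an illustrative sketch only: the instruction names `test`, `exec`, and `wait`, the tuple encoding, and the example kernel that adds pairs of tokens are assumptions made for this example, not the framework's actual representation.

```python
from collections import deque

class ActorMachine:
    """Hypothetical encoding: the controller maps a state to one instruction.

    ("test", condition, state_if_true, state_if_false)
    ("exec", step, next_state)
    ("wait", next_state)
    """

    def __init__(self, conditions, steps, controller, start):
        self.conditions = conditions
        self.steps = steps
        self.controller = controller
        self.state = start

    def run(self):
        """Advance until the controller reaches a wait instruction."""
        fired = []
        while True:
            instr = self.controller[self.state]
            if instr[0] == "test":
                _, cond, if_true, if_false = instr
                self.state = if_true if self.conditions[cond]() else if_false
            elif instr[0] == "exec":
                _, step, nxt = instr
                self.steps[step]()
                fired.append(step)
                self.state = nxt
            else:  # "wait": nothing more to do until new input arrives
                self.state = instr[1]
                return fired

# Example kernel: consume two tokens, emit their sum.
inp, out = deque([1, 2, 3, 4, 5]), deque()
machine = ActorMachine(
    conditions={"two_tokens": lambda: len(inp) >= 2},
    steps={"add": lambda: out.append(inp.popleft() + inp.popleft())},
    controller={
        0: ("test", "two_tokens", 1, 2),
        1: ("exec", "add", 0),
        2: ("wait", 0),
    },
    start=0,
)
fired = machine.run()
print(fired, list(out), list(inp))
```

In this encoding the steps and conditions say what the kernel can do, while the controller alone fixes the order in which conditions are tested, so changing the controller changes the decision process without touching the kernel's behavior.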
We outline how kernels of an imperative process language and an action-based language are decomposed and translated to the common kernel representation, and we describe a simple backend that generates sequential C code from this representation. We present optimization heuristics for the decision process in the controller, which we evaluate using a few dozen kernels of varying complexity from a video decoder. We also present kernel fusion, performed by merging the controllers of actor machines, as a way of scheduling kernels on the same processor, and we compare it to prior art.
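To make the idea of merging controllers concrete, the sketch below fuses two controllers into one whose states are triples (state of A, state of B, whose turn it is). It is not the fusion algorithm of the article: the `test`/`exec`/`wait` tuple encoding, the helper names `fuse` and `run_until_wait`, and the policy of running A until it would wait and then running B are all assumptions chosen to keep the example small.

```python
from collections import deque

def run_until_wait(controller, state, conditions, steps):
    """Interpret a controller until it reaches a wait instruction."""
    fired = []
    while True:
        kind, *args = controller[state]
        if kind == "test":
            cond, if_true, if_false = args
            state = if_true if conditions[cond]() else if_false
        elif kind == "exec":
            step, state = args
            steps[step]()
            fired.append(step)
        else:  # "wait"
            return fired, args[0]

def fuse(ctrl_a, start_a, ctrl_b, start_b):
    """Build one controller that runs A until A would wait, then runs B."""
    def enter(sa, sb, turn):
        # A's wait is not emitted; it simply hands the turn over to B.
        if turn == "A" and ctrl_a[sa][0] == "wait":
            return (ctrl_a[sa][1], sb, "B")
        return (sa, sb, turn)

    start = enter(start_a, start_b, "A")
    fused, work = {}, [start]
    while work:
        state = work.pop()
        if state in fused:
            continue
        sa, sb, turn = state
        kind, *args = ctrl_a[sa] if turn == "A" else ctrl_b[sb]
        if turn == "A":
            nxt = lambda s: enter(s, sb, "A")
        else:
            nxt = lambda s: (sa, s, "B")
        if kind == "test":
            fused[state] = ("test", args[0], nxt(args[1]), nxt(args[2]))
            work.extend(fused[state][-2:])
        elif kind == "exec":
            fused[state] = ("exec", args[0], nxt(args[1]))
            work.append(fused[state][-1])
        else:  # only B's wait reaches here; the fused machine waits too
            fused[state] = ("wait", enter(sa, args[0], "A"))
            work.append(fused[state][-1])
    return fused, start

# Two hypothetical kernels connected by the queue `mid`.
src, mid, out = deque([1, 2, 3]), deque(), deque()
conditions = {"a_ready": lambda: bool(src), "b_ready": lambda: bool(mid)}
steps = {
    "a_double": lambda: mid.append(src.popleft() * 2),
    "b_inc": lambda: out.append(mid.popleft() + 1),
}
ctrl_a = {0: ("test", "a_ready", 1, 2), 1: ("exec", "a_double", 0), 2: ("wait", 0)}
ctrl_b = {0: ("test", "b_ready", 1, 2), 1: ("exec", "b_inc", 0), 2: ("wait", 0)}

fused, start = fuse(ctrl_a, 0, ctrl_b, 0)
fired, state = run_until_wait(fused, start, conditions, steps)
print(fired, list(out))
```

The fused controller is an ordinary controller in the same encoding, so the interleaving of the two kernels is fixed when the controllers are merged rather than decided by a separate scheduler at run time. Other interleaving policies would yield different fused controllers.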
Tÿcho: A Framework for Compiling Stream Programs