Abstract
Dataflow models such as synchronous dataflow (SDF) have been used effectively to program streaming applications while guaranteeing their liveness and boundedness. Yet industrial designers struggle to build the next generation of high-definition video applications with these models. Such applications demand new features, such as parameters that express dynamic input/output rates and topology modifications. Implementing these applications on modern many-core platforms is a major challenge.
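As background for the boundedness guarantee mentioned above, the following is a minimal sketch of the standard SDF consistency check: solving the balance equations q[src] * prod = q[dst] * cons for the smallest positive integer repetition vector q. The function name `repetition_vector`, the edge encoding, and the example graph are illustrative assumptions, not part of the framework described here, and the sketch assumes a connected graph with fixed (non-parametric) rates.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * prod == q[dst] * cons.

    actors: list of actor names (graph assumed connected).
    edges:  list of (src, prod, dst, cons) tuples (hypothetical encoding).
    Returns the smallest positive integer solution as a dict,
    or None if the rates are inconsistent (no bounded schedule exists).
    """
    # Build an undirected view of the graph carrying the firing-ratio q[b] / q[a].
    adj = {a: [] for a in actors}
    for src, prod, dst, cons in edges:
        adj[src].append((dst, Fraction(prod, cons)))
        adj[dst].append((src, Fraction(cons, prod)))

    q = {}
    for start in actors:
        if start in q:
            continue
        q[start] = Fraction(1)
        stack = [start]
        while stack:
            a = stack.pop()
            for b, ratio in adj[a]:
                value = q[a] * ratio
                if b not in q:
                    q[b] = value
                    stack.append(b)
                elif q[b] != value:
                    return None  # two paths disagree: rates are inconsistent

    # Scale the rational solution to the smallest positive integers.
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (f.denominator for f in q.values()), 1)
    ints = {a: int(f * scale) for a, f in q.items()}
    common = reduce(gcd, ints.values())
    return {a: v // common for a, v in ints.items()}


# Example: A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
example = repetition_vector(
    ["A", "B", "C"],
    [("A", 2, "B", 3), ("B", 1, "C", 2)],
)
```

For this example graph the result is `{"A": 3, "B": 2, "C": 1}`: firing each actor that many times returns every buffer to its initial occupancy. A parametric model would carry symbolic rates instead of the fixed integers assumed in this sketch.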
We tackle these problems by proposing a generic and flexible framework to schedule streaming applications designed in a parametric dataflow model of computation. We generate parallel as-soon-as-possible (ASAP) schedules targeting STHORM, the new many-core platform from STMicroelectronics. Furthermore, these schedules can be customized with user-defined ordering and resource constraints.
The parametric dataflow graph is associated with generic or user-defined constraints aimed at optimizing criteria such as timing, buffer sizes, or power consumption. The scheduling algorithm executes with minimal overhead and can be adapted to different scheduling policies simply by adding constraints. The safety of both the dataflow graph and the constraints can be checked statically, and all schedules are guaranteed to be bounded and deadlock-free. We illustrate the scheduling capabilities of our approach using a real-world application: the VC-1 video decoder for high-definition video streaming.
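To make the idea of a constrained ASAP schedule concrete, here is a small illustrative simulation. It is not the scheduling algorithm of the framework: it assumes fixed rates, unit execution time per firing, and at most one firing of each actor per time step. The names `asap_schedule`, `max_parallel` (a stand-in for a resource constraint), and `precedes` (a stand-in for a user ordering constraint) are hypothetical.

```python
def asap_schedule(reps, edges, max_parallel=None, precedes=()):
    """Simulate one iteration of an SDF graph, firing actors as soon as possible.

    reps:         dict actor -> number of firings in one iteration.
    edges:        list of (src, prod, dst, cons, initial_tokens).
    max_parallel: optional cap on firings per time step (resource constraint).
    precedes:     pairs (a, b): the k-th firing of b may only start after
                  the k-th firing of a has finished (ordering constraint).
    Returns a list of steps; each step lists the actors fired in parallel.
    Raises RuntimeError if no actor can fire before the iteration completes.
    """
    tokens = [init for *_, init in edges]
    fired = {a: 0 for a in reps}
    steps = []
    while any(fired[a] < reps[a] for a in reps):
        step = []
        for a in reps:
            if fired[a] >= reps[a]:
                continue  # actor has finished its firings for this iteration
            if max_parallel is not None and len(step) >= max_parallel:
                break  # no processing element left in this time step
            if any(tokens[i] < cons
                   for i, (_, _, dst, cons, _) in enumerate(edges) if dst == a):
                continue  # not enough input tokens yet
            if any(fired[x] <= fired[a] for x, y in precedes if y == a):
                continue  # user ordering constraint not yet satisfied
            step.append(a)
        if not step:
            raise RuntimeError("deadlock: no actor can fire")
        # Tokens are consumed at the start of the step and produced at its end.
        for a in step:
            for i, (_, _, dst, cons, _) in enumerate(edges):
                if dst == a:
                    tokens[i] -= cons
        for a in step:
            for i, (src, prod, _, _, _) in enumerate(edges):
                if src == a:
                    tokens[i] += prod
            fired[a] += 1
        steps.append(step)
    return steps


reps = {"A": 3, "B": 2, "C": 1}
edges = [("A", 2, "B", 3, 0), ("B", 1, "C", 2, 0)]
unconstrained = asap_schedule(reps, edges)
single_core = asap_schedule(reps, edges, max_parallel=1)
```

With these inputs, `unconstrained` is `[["A"], ["A"], ["A", "B"], ["B"], ["C"]]`, while `single_core` serializes the same firings into six steps. Adding a pair to `precedes` delays a firing without changing how many times each actor fires, which mirrors the idea of adapting a schedule by adding constraints; an unsatisfiable constraint shows up in this sketch as a `RuntimeError` at run time rather than through a static check.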
A framework to schedule parametric dataflow applications on many-core platforms
LCTES '14: Proceedings of the 2014 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems