Abstract
Pipelined heterogeneous multiprocessor systems-on-chip (MPSoCs) can provide high throughput for streaming applications. In designing such systems, timing performance and system cost are the primary concerns. By analyzing the runtime behavior of benchmarks on real-world platforms, we find that task execution times are not fixed but vary according to probability distributions. We therefore model the execution times of tasks as random variables. In this paper, we study how to design high-performance, low-cost MPSoC systems that execute a set of such tasks with data dependencies in a pipelined fashion. Our objective is to obtain the optimal functional unit assignment and voltage selection for a pipelined MPSoC system, such that system cost is minimized while the timing constraints are met with a given guaranteed probability. For each required probability, our proposed algorithm efficiently obtains the optimal solution. Experiments show that existing algorithms fail to find feasible solutions in most cases, whereas ours succeeds. Even in the cases where other algorithms do find solutions, ours reduces total cost by up to 30%.
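To make the problem statement concrete, the following is a minimal sketch of cost minimization under a probabilistic timing constraint. It is not the algorithm proposed in the paper: it is a brute-force enumeration over a tiny, made-up instance. The data in `TASKS`, the function names, and the timing model (one task per pipeline stage, independent execution times, and a constraint that every stage finishes within a common deadline) are all assumptions introduced for illustration. Each option can be read as one functional unit type at one voltage level.

```python
from itertools import product

# Hypothetical instance: for each task, every option is a pair
# (cost, {execution_time: probability}) for one FU type / voltage level.
TASKS = [
    [(10, {2: 0.9, 4: 0.1}), (4, {3: 0.5, 6: 0.5})],
    [(8, {1: 0.8, 5: 0.2}), (3, {4: 0.7, 7: 0.3})],
]


def cdf(dist, deadline):
    """Return P(execution time <= deadline) for a discrete distribution."""
    return sum(p for t, p in dist.items() if t <= deadline)


def cheapest_assignment(tasks, deadline, required_prob):
    """Enumerate all assignments and return (cost, choice) for the cheapest
    one whose stages all meet the deadline with probability >= required_prob.
    Returns None when no assignment is feasible.

    Assumes independent execution times, so the joint probability is the
    product of the per-stage probabilities.
    """
    best = None
    for choice in product(*(range(len(options)) for options in tasks)):
        cost = 0
        prob = 1.0
        for task_index, option_index in enumerate(choice):
            option_cost, dist = tasks[task_index][option_index]
            cost += option_cost
            prob *= cdf(dist, deadline)
        if prob >= required_prob and (best is None or cost < best[0]):
            best = (cost, choice)
    return best


if __name__ == "__main__":
    for required in (0.3, 0.65, 0.75, 0.95):
        print(required, cheapest_assignment(TASKS, deadline=4, required_prob=required))
```

In this toy instance, raising the required probability forces more expensive options until no assignment is feasible at all, which is the trade-off the abstract describes. Enumeration is exponential in the number of tasks, so it serves only to illustrate the objective and constraint.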
Optimal functional unit assignment and voltage selection for pipelined MPSoC with guaranteed probability on time performance
LCTES 2017: Proceedings of the 18th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems