Optimal functional unit assignment and voltage selection for pipelined MPSoC with guaranteed probability on time performance

Published: 21 June 2017

Abstract

Pipelined heterogeneous multiprocessor systems-on-chip (MPSoCs) can provide high throughput for streaming applications. In designing such systems, timing performance and system cost are the primary concerns. By analyzing the runtime behavior of benchmarks on real-world platforms, we find that task execution times are not fixed but vary according to probability distributions. We therefore model task execution times as random variables. In this paper, we study how to design high-performance, low-cost MPSoC systems that execute a set of such tasks with data dependencies in a pipelined fashion. Our objective is to obtain the optimal functional unit assignment and voltage selection for pipelined MPSoC systems, such that system cost is minimized while timing constraints are met with a given guaranteed probability. For each required probability, our proposed algorithm efficiently obtains the optimal solution. Experiments show that existing algorithms fail to find feasible solutions in most cases, whereas ours succeeds. Even in cases where other algorithms do find solutions, ours can achieve reductions of up to 30% in total cost.
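
To make the problem statement concrete, the following is a minimal sketch of the optimization goal described above, not the algorithm proposed in the paper. It assumes one task per pipeline stage, independent execution times, and a brute-force search over all assignments. The names `OPTIONS`, `prob_within`, and `cheapest_assignment`, as well as every cost and probability value, are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical options per task: (cost, {execution_time: probability}).
# Each option stands for one functional unit / voltage choice.
# All numbers are invented for illustration only.
OPTIONS = {
    "t1": [(2, {3: 0.7, 6: 0.3}), (5, {2: 0.9, 4: 0.1})],
    "t2": [(1, {4: 0.5, 8: 0.5}), (4, {3: 0.8, 5: 0.2})],
}


def prob_within(dist, deadline):
    """Return P(execution time <= deadline) for a discrete distribution."""
    return sum(p for t, p in dist.items() if t <= deadline)


def cheapest_assignment(options, period, required_prob):
    """Brute-force search choosing one option per task.

    Assumption: one task per pipeline stage and independent execution
    times, so P(every stage finishes within the period) is a product.
    Returns (cost, {task: option_index}, probability) or None.
    """
    tasks = sorted(options)
    best = None
    for choice in product(*(range(len(options[t])) for t in tasks)):
        cost = sum(options[t][i][0] for t, i in zip(tasks, choice))
        prob = 1.0
        for t, i in zip(tasks, choice):
            prob *= prob_within(options[t][i][1], period)
        # Keep the cheapest assignment that meets the required probability.
        if prob >= required_prob and (best is None or cost < best[0]):
            best = (cost, dict(zip(tasks, choice)), prob)
    return best
```

With these made-up numbers, `cheapest_assignment(OPTIONS, 5, 0.6)` selects the cheap option for `t1` and the expensive option for `t2` at a total cost of 6, while raising the required probability to 0.9 forces the expensive option for both tasks at a cost of 9. Exhaustive enumeration grows exponentially with the number of tasks, so it only illustrates the objective and the cost/probability trade-off.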

