Thermal-Aware Scheduling for Integrated CPUs–GPU Platforms

Published: 08 October 2019
Abstract

Modern embedded systems such as cars need high-power integrated CPUs–GPU SoCs for real-time applications such as lane or pedestrian detection. As a result, they face greater thermal problems than before, which may in turn raise failure rates and cooling costs. We demonstrate, via experimentation on a representative CPUs–GPU platform, the importance of accounting for two distinct thermal characteristics (the platform’s temperature imbalance and the different power dissipations of different tasks) in real-time scheduling, so as to avoid bursts of power dissipation while guaranteeing all timing constraints. To achieve this goal, we propose a new Real-Time Thermal-Aware Scheduling (RT-TAS) framework. We first capture the different CPU core temperatures caused by different GPU power dissipations (i.e., CPUs–GPU thermal coupling) with core-specific thermal coupling coefficients. We then develop thermally-balanced task-to-core assignment and CPUs–GPU co-scheduling. The former addresses the platform’s temperature imbalance by efficiently distributing the thermal load across cores while preserving scheduling feasibility. Building on the thermally-balanced task assignment, the latter cooperatively schedules CPU and GPU computations to avoid simultaneous peak power dissipation on both the CPUs and the GPU, thus mitigating excessive temperature rises while meeting task deadlines. We have implemented and evaluated RT-TAS on an automotive embedded platform, where it reduces the maximum temperature by 6–12.2°C over existing approaches without violating any task deadline.
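
To make the idea of thermally-balanced task-to-core assignment more concrete, the sketch below shows one simple greedy heuristic. It is not the RT-TAS algorithm, which the abstract does not specify. It assumes a linear thermal-load proxy per core (GPU power times a core-specific coupling coefficient, plus utilization times power for each assigned task) and a per-core utilization bound of 1.0 as the feasibility check. The names `Task`, `assign_tasks`, `coupling`, and `gpu_power` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    utilization: float  # CPU execution time divided by period
    power: float        # assumed average CPU power while running (W)


def assign_tasks(tasks, coupling, gpu_power, util_bound=1.0):
    """Greedily place each task on the feasible core with the lowest thermal load.

    coupling[i] is an assumed coefficient scaling how much GPU power heats core i.
    Returns (mapping from task name to core index, per-core thermal-load proxy).
    """
    n_cores = len(coupling)
    util = [0.0] * n_cores
    # Start from the GPU-induced load, which differs from core to core.
    load = [c * gpu_power for c in coupling]
    mapping = {}

    # Place the thermally heaviest tasks first so lighter ones can even things out.
    for task in sorted(tasks, key=lambda t: t.utilization * t.power, reverse=True):
        feasible = [i for i in range(n_cores)
                    if util[i] + task.utilization <= util_bound + 1e-9]
        if not feasible:
            raise ValueError(f"no feasible core for task {task.name}")
        core = min(feasible, key=lambda i: load[i])
        util[core] += task.utilization
        load[core] += task.utilization * task.power
        mapping[task.name] = core
    return mapping, load


if __name__ == "__main__":
    demo = [Task("lane", 0.5, 2.0), Task("ped", 0.4, 1.0), Task("log", 0.3, 1.0)]
    print(assign_tasks(demo, coupling=[0.2, 0.1], gpu_power=5.0))
```

In this sketch the core closer to the GPU (larger coupling coefficient) starts with a higher load, so the heaviest task lands on the cooler core. A real implementation would replace the proxy with a measured thermal model and the utilization bound with the schedulability test of the scheduler in use.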

