Abstract
Modern embedded systems execute applications that interact with the operating system and hardware differently depending on the type of workload. These cross-layer interactions result in wide variations of the chip-wide thermal profile. In this article, a reinforcement learning-based runtime manager is proposed that guarantees application-specific performance requirements and controls the POSIX thread allocation and voltage/frequency scaling for energy-efficient thermal management. The manager controls three thermal aspects: peak temperature, average temperature, and thermal cycling. Contrary to existing learning-based runtime approaches that optimize energy and temperature individually, the proposed runtime manager is the first approach to combine the two objectives, simultaneously addressing all three thermal aspects. Determining thread allocation and core frequencies to optimize energy and temperature is, however, an NP-hard problem. If both decisions are learned jointly, the learning table grows exponentially (a significant memory overhead), with a corresponding increase in the exploration time needed to learn the most appropriate thread allocation and core frequency for a particular application workload. To confine the learning space and to minimize the learning cost, the proposed runtime manager is implemented in a two-stage hierarchy: a heuristic-based thread allocation at a longer time interval to improve thermal cycling, followed by a learning-based hardware frequency selection at a much finer interval to improve average temperature, peak temperature, and energy consumption. This enables finer control over temperature in an energy-efficient manner while simultaneously addressing scalability, which is a crucial aspect for multi-/many-core embedded systems. The proposed hierarchical runtime manager is implemented for Linux running on nVidia’s Tegra SoC, featuring four ARM Cortex-A15 cores.
Experiments conducted with a range of embedded and CPU-intensive applications demonstrate that the proposed runtime manager not only reduces energy consumption by an average of 15% relative to Linux but also improves all the thermal aspects: average temperature by 14°C, peak temperature by 16°C, and thermal cycling by 54%.
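
To make the scalability argument concrete, the following is a minimal sketch, not the authors' implementation. It compares the size of a flat learning table that covers thread allocation and per-core frequency jointly with a table that covers frequency selection only, and it shows a standard tabular Q-learning update of the kind such a frequency-selection stage could use. The frequency levels, the number of states, the learning parameters, and all function names are assumptions introduced for illustration; only the core count of four comes from the abstract.

```python
import random

# Hypothetical values for illustration; only NUM_CORES comes from the abstract.
FREQS_MHZ = [600, 1000, 1400, 1800]  # assumed DVFS levels
NUM_STATES = 5                       # assumed number of discretized workload/thermal states
NUM_CORES = 4                        # four ARM Cortex-A15 cores


def joint_table_entries(num_threads):
    """Q-table entries when one action fixes both the thread-to-core map
    and a frequency level for every core."""
    allocations = NUM_CORES ** num_threads
    freq_settings = len(FREQS_MHZ) ** NUM_CORES
    return NUM_STATES * allocations * freq_settings


def frequency_table_entries():
    """Q-table entries when learning only selects one frequency level per state
    and thread allocation is left to a separate heuristic."""
    return NUM_STATES * len(FREQS_MHZ)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice of a frequency index for the given state."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update (alpha and gamma are assumed values)."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


if __name__ == "__main__":
    print(joint_table_entries(8))     # 83886080 entries for 8 threads
    print(frequency_table_entries())  # 20 entries

    q = [[0.0] * len(FREQS_MHZ) for _ in range(NUM_STATES)]
    q_update(q, state=0, action=1, reward=1.0, next_state=2)
    best = choose_action(q, state=0, epsilon=0.0, rng=random.Random(0))
    print(FREQS_MHZ[best])            # 1000
```

Under these assumed sizes, the joint table grows by a factor of four with every additional thread, while the frequency-only table stays fixed. That gap is the motivation the abstract gives for moving thread allocation into a separate heuristic stage.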
Adaptive and Hierarchical Runtime Manager for Energy-Aware Thermal Management of Embedded Systems