research-article

A Closed-Loop Controller to Ensure Performance and Temperature Constraints for Dynamic Applications

Published: 09 October 2019

Abstract

To secure correct system operation, circuit designers have deployed a plethora of Reliability, Availability and Serviceability (RAS) techniques. RAS mechanisms, however, come at the cost of extra clock cycles. In addition, the wide variety of dynamic workloads and input conditions often makes preemptive dependability techniques hard to implement. To this end, we focus on a realistic case study of a closed-loop controller that mitigates performance variation with a reactive response. This concept has been discussed before but was only illustrated on small benchmarks. In particular, the extension of the approach to manage the performance of dynamic workloads on a target platform has not been shown earlier. We compare our scheme against a Linux CPU frequency governor in terms of timing response and energy consumption. Finally, we propose a new flavor of our controller that efficiently manages processor temperature. Again, the concept is illustrated with a realistic case study and compared to a modern temperature manager.
