Bypass aware instruction scheduling for register file power reduction

Published: 14 June 2006

Abstract

Since register files suffer from some of the highest power densities within processors, designers have investigated several architectural strategies for register file power reduction, including "On Demand RF Read" where the register file is read only if the operand value is not available from the bypasses. However, we show in this paper that significant additional reductions in the register file power consumption can be obtained by scheduling instructions so that they transfer the operands via bypasses, rather than reading from the register file. Such instruction scheduling requires the compiler to be cognizant of the bypasses in the processor pipeline. In this paper, we develop several bypass aware instruction scheduling heuristics varying in time complexity, and study their effectiveness on the Intel XScale processor pipeline running MiBench benchmarks. Our experimental results show additional power consumption reductions of up to 26% and on average 12% over and above the register file power reduction achieved through existing techniques.
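To make the idea concrete, here is a minimal sketch of how reordering can turn register file reads into bypass transfers. It is not the paper's heuristic, which the abstract does not spell out. It assumes a hypothetical single-issue in-order pipeline in which a result stays on the bypass network for `window` issue slots after its producer; the names `Instr`, `rf_reads`, and `greedy_bypass_schedule` are illustrative, and the model ignores latencies, stalls, and the actual XScale bypass paths.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    name: str
    dest: Optional[str]      # register written, if any
    srcs: Tuple[str, ...]    # registers read


def rf_reads(schedule, window=1):
    """Count source operands that must come from the register file.

    Assumption: an operand is bypassed only if its producer issued at most
    `window` slots earlier in the same block; everything else is an RF read.
    """
    last_write = {}
    reads = 0
    for slot, ins in enumerate(schedule):
        for reg in ins.srcs:
            produced = last_write.get(reg)
            if produced is None or slot - produced > window:
                reads += 1
        if ins.dest is not None:
            last_write[ins.dest] = slot
    return reads


def depends(later, earlier):
    """True if `later` must stay after `earlier` (RAW, WAR, or WAW)."""
    raw = earlier.dest is not None and earlier.dest in later.srcs
    war = later.dest is not None and later.dest in earlier.srcs
    waw = later.dest is not None and later.dest == earlier.dest
    return raw or war or waw


def greedy_bypass_schedule(block, window=1):
    """Greedy list scheduling: among ready instructions, issue the one with
    the most bypassable operands; ties go to the original program order."""
    n = len(block)
    preds = [{i for i in range(j) if depends(block[j], block[i])}
             for j in range(n)]
    done, order, last_write = set(), [], {}
    while len(order) < n:
        slot = len(order)
        ready = [j for j in range(n) if j not in done and preds[j] <= done]

        def bypass_hits(j):
            return sum(1 for reg in block[j].srcs
                       if reg in last_write
                       and slot - last_write[reg] <= window)

        best = max(ready, key=lambda j: (bypass_hits(j), -j))
        done.add(best)
        order.append(block[best])
        if block[best].dest is not None:
            last_write[block[best].dest] = slot
    return order


# Hypothetical basic block: three producers followed by three consumers.
block = [
    Instr("a", "r1", ("r8", "r9")),
    Instr("b", "r2", ("r10", "r11")),
    Instr("c", "r3", ("r12", "r13")),
    Instr("d", "r4", ("r1", "r8")),
    Instr("e", "r5", ("r2", "r10")),
    Instr("f", "r6", ("r3", "r12")),
]

scheduled = greedy_bypass_schedule(block, window=1)
print([ins.name for ins in scheduled])   # ['a', 'd', 'b', 'e', 'c', 'f']
print(rf_reads(block, window=1))         # 12
print(rf_reads(scheduled, window=1))     # 9
```

In this toy block, placing each consumer directly after its producer lets three operands arrive over the bypass, so the modeled register file reads drop from 12 to 9 without violating any dependence. A real compiler pass would also need to weigh schedule length and the specific bypasses the target pipeline provides.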
