research-article

Energy-aware code motion for GPU shader processors

Published: 24 December 2013

Abstract

Graphics processing units (GPUs) are now being widely adopted in system-on-a-chip designs, and they are often used in embedded systems for manipulating computer graphics or even for general-purpose computation. Energy management is of concern to both hardware and software designers. In this article, we present an energy-aware code-motion framework for a compiler to generate concentrated accesses to input and output (I/O) buffers inside a GPU. Our solution attempts to gather the I/O buffer accesses into clusters, thereby extending the time period during which the I/O buffers are clock or power gated. We performed experiments in which the energy consumption was simulated by incorporating our compiler-analysis and code-motion framework into an in-house compiler tool. The experimental results demonstrated that our mechanisms were effective in reducing the energy consumption of the shader processor by an average of 13.1% and decreasing the energy-delay product by 2.2%.
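The clustering idea can be illustrated with a toy example. The sketch below is not the article's algorithm, which the abstract does not describe in detail. It is a minimal greedy reordering, assuming a straight-line instruction list with explicit dependencies, and every name in it (`Instr`, `cluster_io`, `io_clusters`, `longest_idle`, and the sample program) is hypothetical. It only shows how moving I/O buffer accesses next to each other can lengthen the longest run of instructions during which the buffers sit idle and could, in principle, be clock or power gated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    is_io: bool        # True if the instruction accesses an I/O buffer
    deps: tuple = ()   # names of instructions that must execute first


def cluster_io(instrs):
    """Greedy dependency-respecting reordering (illustrative assumption).

    Among the instructions whose dependencies are satisfied, prefer one of
    the same kind (I/O or non-I/O) as the instruction emitted last, so that
    I/O accesses tend to end up adjacent to each other.
    """
    emitted, order = set(), []
    remaining = list(instrs)
    last_kind = None
    while remaining:
        ready = [i for i in remaining if all(d in emitted for d in i.deps)]
        if not ready:
            raise ValueError("dependency cycle or missing dependency")
        same_kind = [i for i in ready if i.is_io == last_kind]
        pick = (same_kind or ready)[0]
        order.append(pick)
        emitted.add(pick.name)
        remaining.remove(pick)
        last_kind = pick.is_io
    return order


def io_clusters(order):
    """Count maximal runs of consecutive I/O instructions."""
    count, prev_io = 0, False
    for instr in order:
        if instr.is_io and not prev_io:
            count += 1
        prev_io = instr.is_io
    return count


def longest_idle(order):
    """Length of the longest run of instructions that touch no I/O buffer."""
    best = run = 0
    for instr in order:
        run = 0 if instr.is_io else run + 1
        best = max(best, run)
    return best


# Hypothetical shader fragment: two input reads, some arithmetic, one write.
program = [
    Instr("ld_a", True),
    Instr("mul1", False, ("ld_a",)),
    Instr("ld_b", True),
    Instr("mul2", False, ("ld_b",)),
    Instr("add", False, ("mul1", "mul2")),
    Instr("st_out", True, ("add",)),
]

reordered = cluster_io(program)
print([i.name for i in reordered])
print(io_clusters(program), io_clusters(reordered))
print(longest_idle(program), longest_idle(reordered))
```

In this toy program the two reads are hoisted together, so the I/O accesses form two clusters instead of three and the longest idle window grows from two instructions to three. A real framework would also have to weigh gating overheads and any effect on execution time, which this sketch ignores.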

