Abstract
Graphics processing units (GPUs) are now being widely adopted in system-on-a-chip designs, and they are often used in embedded systems for manipulating computer graphics or even for general-purpose computation. Energy management is of concern to both hardware and software designers. In this article, we present an energy-aware code-motion framework for a compiler to generate concentrated accesses to input and output (I/O) buffers inside a GPU. Our solution attempts to gather the I/O buffer accesses into clusters, thereby extending the time period during which the I/O buffers are clock or power gated. We performed experiments in which the energy consumption was simulated by incorporating our compiler-analysis and code-motion framework into an in-house compiler tool. The experimental results demonstrated that our mechanisms were effective in reducing the energy consumption of the shader processor by an average of 13.1% and decreasing the energy-delay product by 2.2%.
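To make the idea of clustering I/O buffer accesses concrete, here is a minimal sketch in Python. It is not the article's framework: the `Instr` record, the three instruction kinds, and the greedy ordering are all hypothetical, and the block is assumed to be straight-line code in which each register is written once, so only read-after-write dependences matter. The sketch hoists input-buffer reads as early as their dependences allow and sinks output-buffer writes as late as possible, then measures the longest run of instructions that touch neither buffer, which stands in for the window in which the buffers could be gated.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    name: str
    kind: str                    # "in" = input-buffer read, "alu", "out" = output-buffer write
    dest: str = ""               # register written ("" if none); assumed written only once
    srcs: Tuple[str, ...] = ()   # registers read


def cluster_io(block: List[Instr]) -> List[Instr]:
    """Greedy reordering: reads first, writes last, honoring RAW dependences."""
    producer = {ins.dest: i for i, ins in enumerate(block) if ins.dest}
    deps = [{producer[s] for s in ins.srcs if s in producer} for ins in block]
    rank = {"in": 0, "alu": 1, "out": 2}
    done, order = set(), []
    while len(order) < len(block):
        # An instruction is ready once every producer of its sources is scheduled.
        ready = [i for i in range(len(block)) if i not in done and deps[i] <= done]
        pick = min(ready, key=lambda i: (rank[block[i].kind], i))
        done.add(pick)
        order.append(block[pick])
    return order


def longest_idle_run(block: List[Instr]) -> int:
    """Longest stretch of consecutive instructions that touch no I/O buffer."""
    best = cur = 0
    for ins in block:
        cur = 0 if ins.kind in ("in", "out") else cur + 1
        best = max(best, cur)
    return best


# Hypothetical shader-like block with interleaved buffer accesses.
block = [
    Instr("ld a", "in", "a"),
    Instr("mul t0", "alu", "t0", ("a", "a")),
    Instr("ld b", "in", "b"),
    Instr("add t1", "alu", "t1", ("t0", "b")),
    Instr("st o0", "out", "", ("t1",)),
    Instr("ld c", "in", "c"),
    Instr("mul t2", "alu", "t2", ("c", "t1")),
    Instr("st o1", "out", "", ("t2",)),
]

clustered = cluster_io(block)
print([ins.name for ins in clustered])
print(longest_idle_run(block), "->", longest_idle_run(clustered))
```

In this toy block the reordering groups the three reads at the top and the two writes at the bottom, so the buffer-free window grows from one instruction to three. A real compiler pass would also have to weigh the longer register live ranges that hoisting reads creates, along with the cost of switching the gating state.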
Index Terms
Energy-aware code motion for GPU shader processors