Exploring and Predicting the Effects of Microarchitectural Parameters and Compiler Optimizations on Performance and Energy

Published: 01 June 2012
Abstract

Embedded processor performance depends on both the underlying architecture and the compiler optimizations applied. However, designing both simultaneously is extremely difficult given the time constraints designers must work under. Current methodology therefore designs the compiler and the architecture in isolation, leading to suboptimal performance of the final product.

This article develops a novel approach to this codesign space problem. For our specific design space, we demonstrate that we can automatically predict the performance that an optimizing compiler would achieve without actually tuning it for any of the microarchitecture configurations considered. Once trained, a single run of the program compiled with the standard optimization setting is enough to make a prediction on the new microarchitecture with just a 3.2% error rate on average. This allows the designer to accurately choose an architectural configuration with knowledge of how an optimizing compiler will perform on it. We use this to find the best optimizing compiler/architectural configuration in our codesign space and demonstrate that it achieves an average 19% performance improvement and energy savings of 16% compared to the baseline, nearly doubling the energy-efficiency measured as the energy-delay-squared product (EDD).
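
To make the energy-efficiency metric concrete, the following minimal Python sketch computes an energy-delay-squared product relative to a baseline normalized to 1.0. The function names `edd` and `relative_edd` are hypothetical, and the sketch assumes that a performance improvement is expressed as a speedup factor (relative delay = 1 / speedup) and an energy saving as a fraction of baseline energy; the abstract does not state these conventions. Because the reported figures are averages across benchmarks, plugging them into this formula is illustrative only and need not reproduce the article's aggregate EDD result.

```python
def edd(energy: float, delay: float) -> float:
    """Energy-delay-squared product: energy * delay^2 (lower is better)."""
    return energy * delay ** 2


def relative_edd(speedup: float, energy_saving: float) -> float:
    """EDD of a configuration relative to a baseline whose energy and delay are 1.0.

    Assumptions (not taken from the article):
      - speedup is baseline_time / new_time, so relative delay is 1 / speedup
      - energy_saving is the fraction of baseline energy that is saved
    """
    delay = 1.0 / speedup
    energy = 1.0 - energy_saving
    return edd(energy, delay)


if __name__ == "__main__":
    # Illustrative inputs only: the averaged headline figures from the abstract.
    ratio = relative_edd(speedup=1.19, energy_saving=0.16)
    print(f"relative EDD: {ratio:.3f}")
    print(f"energy-efficiency gain (1 / relative EDD): {1.0 / ratio:.2f}x")
```

Squaring the delay term weights execution time more heavily than energy, so under this metric a speedup contributes more to the efficiency gain than an energy saving of the same relative size.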

