GPU Performance Estimation using Software Rasterization and Machine Learning

Published: 27 September 2017

Abstract

This paper introduces a predictive modeling framework to estimate the performance of GPUs during pre-silicon design. Early-stage performance prediction is useful when long simulation times impede development by making driver performance validation, API conformance testing, and design space exploration infeasible. Our approach builds a Random Forest regression model to analyze DirectX 3D workload behavior when executed by a software rasterizer, which we have extended with a workload characterizer to collect further performance information via program counters. In addition to regression models, this work produces detailed feature rankings, which can provide valuable architectural insight, and accurate performance estimates for an integrated Intel Skylake-generation GPU. Our models achieve an out-of-sample error rate of 14%, with an average simulation speedup of 327x.
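
The general technique named in the abstract (a Random Forest regressor that yields both a held-out error estimate and a feature ranking) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the feature names, the synthetic data, the tree hyperparameters, and the use of impurity reduction as the importance measure are all assumptions made for illustration, and the abstract does not say which importance measure or error metric the paper uses.

```python
import random

# Hypothetical counter names; the abstract does not list the real features.
FEATURES = ["pixels_shaded", "texture_samples", "draw_calls", "vertex_count"]

def sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, importance, depth=0, max_depth=5, min_leaf=3):
    """Grow one regression tree; every split sees a random half of the features."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return sum(y) / len(y)                    # leaf: mean target value
    n_features = len(X[0])
    candidates = rng.sample(range(n_features), max(1, n_features // 2))
    parent, best = sse(y), None                   # best = (gain, feature, threshold)
    for f in candidates:
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    if best is None:
        return sum(y) / len(y)
    gain, f, t = best
    importance[f] += gain                         # credit the SSE reduction to feature f

    def grow(keep):
        rows = [i for i, row in enumerate(X) if keep(row[f])]
        return build_tree([X[i] for i in rows], [y[i] for i in rows],
                          rng, importance, depth + 1, max_depth, min_leaf)

    return (f, t, grow(lambda v: v <= t), grow(lambda v: v > t))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=25, seed=0):
    """Train trees on bootstrap samples; return them with normalized importances."""
    rng = random.Random(seed)
    importance = [0.0] * len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]   # bootstrap sample
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                rng, importance))
    total = sum(importance) or 1.0
    return trees, [v / total for v in importance]

def predict(trees, row):
    """Forest prediction: the average of the individual tree predictions."""
    return sum(predict_tree(t, row) for t in trees) / len(trees)

def mean_relative_error(trees, X, y):
    """Mean absolute relative error on samples the forest was not trained on."""
    return sum(abs(predict(trees, r) - yi) / yi for r, yi in zip(X, y)) / len(y)

def make_data(n, rng):
    """Synthetic stand-in for (counter values, measured frame time) pairs."""
    X, y = [], []
    for _ in range(n):
        row = [rng.randint(0, 20) for _ in FEATURES]
        # Assumed ground truth: feature 0 dominates, feature 1 matters a little.
        y.append(50.0 + 10.0 * row[0] + 2.0 * row[1] + rng.gauss(0, 1.0))
        X.append(row)
    return X, y

rng = random.Random(42)
X, y = make_data(200, rng)
trees, importance = fit_forest(X[:150], y[:150])          # train on 150 samples
error = mean_relative_error(trees, X[150:], y[150:])      # evaluate on the other 50
ranking = sorted(range(len(FEATURES)), key=lambda f: -importance[f])
print(f"out-of-sample relative error: {error:.1%}")
for f in ranking:
    print(f"{FEATURES[f]:>16}: {importance[f]:.3f}")
```

Holding out samples that the forest never saw is what makes the reported error "out-of-sample"; the ranking simply orders features by how much squared error their splits removed across all trees. In practice a library implementation would replace the hand-written trees.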
