poster

A microbenchmark to study GPU performance models

Published: 10 February 2018

Abstract

Basic microarchitectural features of NVIDIA GPUs have been stable for a decade, and many analytical models have been proposed to predict their performance. We present a way to review, systematize, and evaluate these approaches using a microbenchmark. In this manner, we produce a brief algebraic summary of the key elements of selected performance models, identify patterns in their design, and highlight their previously unknown limitations. We also identify a potentially superior method for estimating performance based on classical work.
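To make the flavor of such "classical" estimates concrete, here is a hypothetical sketch (not taken from the poster): a minimal latency-hiding throughput estimate in the spirit of Little's law from operational analysis, concurrency = latency × throughput. The function name and parameters are illustrative assumptions, not the authors' model.

```python
def estimated_throughput(concurrency, latency, peak_throughput):
    """Estimate sustained throughput (operations per cycle) of a processor
    that keeps `concurrency` operations in flight, each taking `latency`
    cycles, capped by the hardware peak throughput.

    Below the cap, Little's law gives throughput = concurrency / latency.
    """
    return min(peak_throughput, concurrency / latency)

# Example: 64 memory accesses in flight with a 400-cycle latency and a
# peak of 1 access/cycle -> latency-bound at 64/400 = 0.16 accesses/cycle.
print(estimated_throughput(64, 400.0, 1.0))

# With enough concurrency the estimate saturates at the hardware peak.
print(estimated_throughput(1000, 400.0, 1.0))
```

Models in the cited literature refine this basic picture with memory-level and thread-level parallelism, interval analysis, and fine-grained multithreading effects.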

References

  1. Denning, P. J., and Buzen, J. P. 1978. The operational analysis of queueing network models. ACM Computing Surveys 10, 3, 225--261.
  2. Hong, S., and Kim, H. 2009. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In International Symposium on Computer Architecture (ISCA '09), 152--163.
  3. Chen, X. E., and Aamodt, T. M. 2009. A first-order fine-grained multithreaded throughput model. In International Symposium on High Performance Computer Architecture (HPCA '09), 329--340.
  4. Huang, J.-C., Lee, J. H., Kim, H., and Lee, H.-H. S. 2014. GPUMech: GPU performance modeling technique based on interval analysis. In International Symposium on Microarchitecture (MICRO-47), 268--279.
  5. Sim, J., Dasgupta, A., Kim, H., and Vuduc, R. 2012. A performance analysis framework for identifying potential benefits in GPGPU applications. In Symposium on Principles and Practice of Parallel Programming (PPoPP '12), 11--22.
  6. Zhang, Y., and Owens, J. D. 2011. A quantitative performance analysis model for GPU architectures. In International Symposium on High Performance Computer Architecture (HPCA '11), 382--393.
  7. Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D., and Hwu, W. W. 2010. An adaptive performance modeling tool for GPU architectures. In Symposium on Principles and Practice of Parallel Programming (PPoPP '10), 105--114.
  8. NVIDIA. 2017. CUDA C Programming Guide v9.1. November 2017.
  9. Saavedra-Barrera, R., Culler, D., and von Eicken, T. 1990. Analysis of multithreaded architectures for parallel computing. In Symposium on Parallel Algorithms and Architectures (SPAA '90), 169--178.

Published in

  • ACM SIGPLAN Notices, Volume 53, Issue 1 (PPoPP '18), January 2018, 426 pages. ISSN 0362-1340, EISSN 1558-1160, DOI 10.1145/3200691.
  • PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, February 2018, 442 pages. ISBN 9781450349826, DOI 10.1145/3178487.

Copyright © 2018 Owner/Author

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers

  • poster