research-article

Power Limitations and Dark Silicon Challenge the Future of Multicore

Published: 01 August 2012

Abstract

Since 2004, processor designers have increased core counts to exploit Moore’s Law scaling, rather than focusing on single-core performance. The failure of Dennard scaling, to which the shift to multicore parts is partially a response, may soon limit multicore scaling just as single-core scaling has been curtailed. This paper models multicore scaling limits by combining device scaling, single-core scaling, and multicore scaling to measure the speedup potential for a set of parallel workloads for the next five technology generations. For device scaling, we use both the ITRS projections and a set of more conservative device scaling parameters. To model single-core scaling, we combine measurements from over 150 processors to derive Pareto-optimal frontiers for area/performance and power/performance. Finally, to model multicore scaling, we build a detailed performance model of upper-bound performance and lower-bound core power. The multicore designs we study include single-threaded CPU-like and massively threaded GPU-like multicore chip organizations with symmetric, asymmetric, dynamic, and composed topologies. The study shows that regardless of chip organization and topology, multicore scaling is power limited to a degree not widely appreciated by the computing community. Even at 22 nm (just one year from now), 21% of a fixed-size chip must be powered off, and at 8 nm, this number grows to more than 50%. Through 2024, only 7.9× average speedup is possible across commonly used parallel workloads for the topologies we study, leaving a nearly 24-fold gap from a target of doubled performance per generation.
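The closing figures of the abstract can be checked with a little arithmetic. The sketch below takes the five generations and the 7.9× average speedup from the abstract and computes what doubling every generation would have delivered. It assumes the "nearly 24-fold gap" is the difference between that target and the projected speedup; the abstract does not say how the figure was derived, so treat that reading as an assumption. The function and constant names are hypothetical.

```python
# Illustrative check of the abstract's closing numbers.
# Only GENERATIONS and PROJECTED_SPEEDUP come from the abstract;
# the helper names and the "gap as a difference" reading are assumptions.

GENERATIONS = 5          # "the next five technology generations"
PROJECTED_SPEEDUP = 7.9  # average speedup through 2024


def doubling_target(generations: int) -> float:
    """Speedup over baseline if performance doubled every generation."""
    return 2.0 ** generations


def shortfall(target: float, achieved: float) -> float:
    """Absolute shortfall, in multiples of baseline performance."""
    return target - achieved


target = doubling_target(GENERATIONS)
gap = shortfall(target, PROJECTED_SPEEDUP)
ratio = target / PROJECTED_SPEEDUP

print(f"target: {target:.0f}x, gap: {gap:.1f}x, ratio: {ratio:.2f}x")
```

Under this reading, the target is 32× and the shortfall is about 24.1×, which matches "nearly 24-fold". As a ratio, the target is roughly four times the projected speedup.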

Published in

ACM Transactions on Computer Systems, Volume 30, Issue 3
August 2012
97 pages
ISSN: 0734-2071
EISSN: 1557-7333
DOI: 10.1145/2324876

Copyright © 2012 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 1 August 2012
• Accepted: 1 May 2012
• Received: 1 March 2012

Qualifiers

• research-article
• Research
• Refereed
