DOI:10.1145/1454115.1454142
research-article

Multitasking workload scheduling on flexible-core chip multiprocessors

Published: 25 October 2008

ABSTRACT

While technology trends have ushered in the age of chip multiprocessors (CMPs), a fundamental question is what size to make each core. Most current commercial designs are symmetric CMPs (SCMPs), in which all cores on a chip are identical; across designs, the core ranges from a simple RISC processor to a complex out-of-order x86 processor. Some researchers have proposed asymmetric CMPs (ACMPs) consisting of multiple types of cores. Both architectures are fixed, which makes them vulnerable to mismatches between the granularity of the cores and the parallelism in the workload, although this is less of an issue for ACMPs. Such mismatches can cause inefficient execution. To remedy this weakness, recent research has proposed flexible-core CMPs (FCMPs), which can aggregate multiple small processing cores to form larger logical processors. FCMPs introduce a new resource allocation and scheduling problem: the scheduler must determine how many logical processors to configure, how powerful each processor should be, and where and when each task should run. This paper introduces and motivates this problem, describes the challenges associated with it, and evaluates algorithms appropriate for multitasking on FCMPs. We also evaluate static-core CMPs of various configurations and compare them to FCMPs for various multitasking workloads.
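To make the allocation side of this problem concrete, the following is a minimal sketch of one possible heuristic, not the algorithms evaluated in the paper. It assumes each task comes with a hypothetical speedup table (speedup on a logical processor built from 1, 2, 3, ... cores) and hands out cores one at a time to the task with the largest marginal gain. The function name `allocate_cores`, the table format, and the example numbers are all illustrative assumptions; time-sharing (the "when" part of the problem) is not modeled.

```python
from typing import Dict, List


def allocate_cores(speedups: Dict[str, List[float]], total_cores: int) -> Dict[str, int]:
    """Greedily split `total_cores` among tasks (illustrative heuristic only).

    speedups[task][k - 1] is the assumed speedup of `task` when it runs on a
    logical processor composed of k physical cores.
    """
    if len(speedups) > total_cores:
        raise ValueError("more tasks than cores; time-sharing is not modeled here")

    # Start every task on a single-core logical processor.
    alloc = {task: 1 for task in speedups}
    remaining = total_cores - len(alloc)

    while remaining > 0:
        best_task, best_gain = None, 0.0
        for task, curve in speedups.items():
            k = alloc[task]
            if k < len(curve):
                # Marginal benefit of growing this logical processor by one core.
                gain = curve[k] - curve[k - 1]
                if gain > best_gain:
                    best_task, best_gain = task, gain
        if best_task is None:
            break  # no task benefits from another core; leave the rest idle
        alloc[best_task] += 1
        remaining -= 1
    return alloc


# Hypothetical workload: one task scales well with core count, one does not.
example = {
    "high_ilp": [1.0, 1.8, 2.4, 2.8],
    "low_ilp": [1.0, 1.1, 1.15, 1.17],
}
print(allocate_cores(example, 4))
```

With four cores, this sketch gives three to the task that scales well and one to the task that does not. That configuration is not available on a symmetric design with a fixed core size. Greedy marginal-gain allocation maximizes total speedup only when every speedup curve has diminishing returns; a real scheduler would also have to respect whatever composition constraints the hardware imposes.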

    • Published in

      PACT '08: Proceedings of the 17th international conference on Parallel architectures and compilation techniques
      October 2008
      328 pages
      ISBN:9781605582825
      DOI:10.1145/1454115

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 121 of 471 submissions, 26%
