ABSTRACT
Simultaneous Multithreading (SMT) increases processor throughput by allowing the parallel execution of several threads. However, fully sharing processor resources may cause resource monopolization by a single thread or other misallocations, resulting in overall performance degradation. Static resource partitioning techniques have been suggested, but they are less effective than dynamically controlling each thread's resource usage, because program behavior changes during execution.
In this paper, we propose an Adaptive Resource Partitioning Algorithm (ARPA) that dynamically assigns resources to threads according to changes in thread behavior. In each time period, ARPA analyzes the resource usage efficiency of each thread and assigns more resources to the threads that can use them more efficiently. The purpose of ARPA is to improve the efficiency of resource utilization, thereby improving overall instruction throughput. Our simulation results on a set of 42 multiprogramming workloads show that ARPA outperforms the traditional fetch policy ICOUNT by 55.8% in overall instruction throughput and achieves a 33.8% improvement over Static Partitioning. It also outperforms the current best dynamic resource allocation technique, Hill-climbing, by 5.7%. In terms of the fairness accorded to each thread, ARPA attains 43.6%, 18.5% and 9.2% improvements over ICOUNT, Static Partitioning and Hill-climbing, respectively, using a common fairness metric.
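The per-period analysis described above can be illustrated with a minimal sketch. It is not the paper's implementation: the efficiency measure (instructions committed per assigned resource entry), the names `ThreadState` and `repartition`, and the parameters `step` and `min_entries` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThreadState:
    entries: int        # resource entries currently assigned to this thread
    committed: int = 0  # instructions committed in the current time period


def efficiency(t: ThreadState) -> float:
    # Assumed metric: committed instructions per assigned resource entry.
    return t.committed / t.entries if t.entries > 0 else 0.0


def repartition(threads: List[ThreadState], step: int = 4,
                min_entries: int = 8) -> None:
    """At the end of a time period, move up to `step` entries from the
    least efficient thread to the most efficient one (hypothetical policy)."""
    if len(threads) >= 2:
        best = max(threads, key=efficiency)
        worst = min(threads, key=efficiency)
        if efficiency(best) > efficiency(worst):
            # Never shrink a thread below the assumed floor `min_entries`.
            delta = max(0, min(step, worst.entries - min_entries))
            worst.entries -= delta
            best.entries += delta
    # Start the next time period with fresh counters.
    for t in threads:
        t.committed = 0


if __name__ == "__main__":
    threads = [ThreadState(entries=32, committed=64),
               ThreadState(entries=32, committed=16)]
    repartition(threads)
    print([t.entries for t in threads])
```

In this sketch the total number of entries is conserved, so the partition shifts gradually toward the threads that commit more instructions per entry, and the floor keeps a low-efficiency thread from being starved.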
Index Terms
An adaptive resource partitioning algorithm for SMT processors