ABSTRACT
One of the essential features of modern computer systems is context switching, which allows multiple threads of execution to time-share a limited number of processors. While very useful, context switching can introduce significant performance overhead, one of the primary causes being the cache perturbation effect. Between the time a thread is switched out and the time it resumes execution, parts of its working set in the cache may be perturbed by other, interfering threads, leading to additional cache misses, referred to as context switch misses, as the thread recovers from the perturbation.
The goal of this paper is to understand how cache parameters and application behavior influence the number of context switch misses an application suffers. We characterize a previously unreported type of context switch miss that occurs as an artifact of the interaction between the cache replacement policy and an application's temporal reuse behavior. We characterize the behavior of these "reordered misses" across various applications, cache sizes, and amounts of cache perturbation. As a second contribution, we develop an analytical model that reveals the mathematical relationship between cache design parameters, an application's temporal reuse pattern, and the number of context switch misses it suffers. We validate the model against simulation studies and find that it accurately predicts the trends of context switch misses. The mathematical relationship provided by the model lets us derive insights into precisely why some applications are more vulnerable to context switch misses than others. Through a case study, we also find that prefetching tends to aggravate context switch misses.
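To make the perturbation effect concrete, below is a minimal illustrative sketch (not taken from the paper) of a fully associative LRU cache: a thread's reuse trace is replayed with and without an intervening interfering trace, and the difference in miss counts approximates the context switch misses described above. The cache capacity, block addresses, and traces are made-up example values.

```python
# Illustrative sketch (not from the paper): a fully associative LRU cache
# simulation that counts the extra misses a thread incurs after its cached
# blocks are perturbed by an interfering thread during a context switch.
from collections import OrderedDict

def run_trace(cache, capacity, trace):
    """Replay a trace of block addresses through an LRU cache; return the miss count."""
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)          # hit: refresh LRU position
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)     # evict the least recently used block
            cache[block] = True
    return misses

CAPACITY = 8                                  # hypothetical cache size in blocks
thread_a = [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]   # thread A's temporal reuse pattern
thread_b = [100, 101, 102, 103]                    # interfering thread's footprint

# Baseline: thread A runs without interruption.
cache = OrderedDict()
run_trace(cache, CAPACITY, thread_a)          # warm up the cache
baseline = run_trace(cache, CAPACITY, thread_a)    # steady-state misses

# With a context switch: thread B perturbs the cache between A's two runs.
cache = OrderedDict()
run_trace(cache, CAPACITY, thread_a)          # warm up the cache
run_trace(cache, CAPACITY, thread_b)          # perturbation while A is switched out
perturbed = run_trace(cache, CAPACITY, thread_a)   # misses after A resumes

print(f"context switch misses: {perturbed - baseline}")
```

In this toy run the interfering thread displaces only two of thread A's blocks, yet A suffers six misses when it resumes: each replacement miss reshuffles the LRU ordering and evicts further blocks that A is about to reuse. This cascading interaction between the replacement policy and the temporal reuse pattern is the kind of effect the paper's characterization and analytical model aim to capture.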