Abstract
Collaborative caching allows software to use hints to influence cache management in hardware. Previous theories have shown that such hints observe the inclusion property and can obtain optimal caching if the access sequence and the cache size are known ahead of time. In that prior work, however, the interface of a cache hint is limited, for example to a binary choice between LRU and MRU.
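As a rough illustration of the binary LRU/MRU hint mentioned above, the following sketch simulates a fully associative cache in which every access carries one of the two hints. The placement rule is an assumption made for illustration (an LRU-hinted access goes to the top of the stack, an MRU-hinted access goes to the bottom and is therefore next in line for eviction), and the function name `simulate_binary_hints` is hypothetical rather than taken from the paper.

```python
def simulate_binary_hints(trace, cache_size):
    """Count misses for a fully associative cache with per-access hints.

    trace: iterable of (block, hint) pairs, where hint is "LRU" or "MRU".
    Assumed model: index 0 of `stack` is evicted last, the final index first.
    """
    stack = []
    misses = 0
    for block, hint in trace:
        if block in stack:
            stack.remove(block)          # hit: take the block out, re-place it below
        else:
            misses += 1
            if len(stack) == cache_size:
                stack.pop()              # evict the block at the bottom
        if hint == "LRU":
            stack.insert(0, block)       # normal access: keep as long as possible
        else:
            stack.append(block)          # MRU hint: candidate for the next eviction
    return misses


cyclic = "abcabcabc"
all_lru = [(x, "LRU") for x in cyclic]
hinted = [(x, "LRU" if x == "a" else "MRU") for x in cyclic]
print(simulate_binary_hints(all_lru, 2))  # every access misses under plain LRU
print(simulate_binary_hints(hinted, 2))   # block "a" stays cached with hints
```

On this cyclic trace with a two-block cache, plain LRU misses on all nine accesses, while hinting `b` and `c` as MRU keeps `a` resident and reduces the count to seven.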
In this paper, we generalize the hint interface so that a hint is a number encoding a priority. We show the generality in a hierarchical relation in which collaborative caching subsumes non-collaborative caching and, within collaborative caching, the priority hint subsumes the previous binary hint. We show two theoretical results for the general hint. The first is a new cache replacement policy, priority LRU, which permits the complete range of choices between MRU and LRU. We prove a new type of inclusion property, non-uniform inclusion, and give a one-pass algorithm to compute the miss rate for all cache sizes. The second is that the same priority hints can obtain optimal caching for all cache sizes, without having to know the cache size beforehand.
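To make the idea of a one-pass computation over all cache sizes concrete, the sketch below uses the classic stack-distance method for plain LRU, which relies on the ordinary inclusion property. It is not the priority LRU algorithm of this paper; it only shows the kind of computation that the non-uniform inclusion result generalizes. The name `lru_miss_counts` and the list-based stack are assumptions chosen for brevity.

```python
def lru_miss_counts(trace, max_size):
    """Return {c: misses} for plain LRU caches of size c = 1..max_size.

    One pass over the trace records each access's stack distance; an LRU
    cache of size c hits exactly the accesses whose distance is <= c.
    """
    trace = list(trace)
    stack = []                        # index 0 is the most recently used block
    hist = [0] * (max_size + 1)       # hist[d] = number of accesses at distance d
    for block in trace:
        if block in stack:
            d = stack.index(block) + 1    # 1-based stack distance
            stack.remove(block)
            if d <= max_size:
                hist[d] += 1
        stack.insert(0, block)        # accessed block becomes most recent
    misses = {}
    hits = 0
    for c in range(1, max_size + 1):
        hits += hist[c]               # cumulative hits for a cache of size c
        misses[c] = len(trace) - hits
    return misses


print(lru_miss_counts("aabab", 2))
```

Dividing each count by the trace length gives the miss rate per cache size. The linear-time list operations keep the sketch short; a production version would typically use a balanced tree or similar structure to find distances faster.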
A generalized theory of collaborative caching
ISMM '12: Proceedings of the 2012 International Symposium on Memory Management