research-article

On the theory and potential of LRU-MRU collaborative cache management

Published: 04 June 2011

Abstract

The goal of cache management is to maximize data reuse. Collaborative caching provides an interface for software to communicate access information to hardware. In theory, it can obtain optimal cache performance.

In this paper, we study a collaborative caching system that allows a program to choose different caching methods for its data. As an interface, it may be used in arbitrary ways: sometimes optimally, but probably suboptimally most of the time, and sometimes even counterproductively. We develop a theoretical foundation for collaborative caches to show the inclusion principle and the existence of a distance metric we call LRU-MRU stack distance. The new stack distance is important for program analysis and transformation because it lets them target a hierarchical collaborative cache system rather than a single cache configuration. We use 10 benchmark programs to show that optimal caching may reduce the average miss ratio by 24%, and that a simple feedback-driven compilation technique can use collaborative caching to realize 50% of the optimal improvement.
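To make the idea of per-access caching choices concrete, the following is a minimal sketch and not the paper's implementation. It assumes a fully associative cache in which each access carries a one-bit hint: an unhinted access is placed at the most protected end of the stack, as in LRU, while a hinted ("MRU") access is placed at the eviction end, so that block is the next victim. The function name `simulate`, the trace format, and these exact hint semantics are assumptions made for illustration; the abstract does not specify them.

```python
from typing import Iterable, List, Tuple


def simulate(trace: Iterable[Tuple[str, bool]], capacity: int) -> int:
    """Count misses for a trace of (block, mru_hint) accesses.

    Hypothetical model: the stack is ordered from next eviction victim
    (index 0) to most protected (last index).
    """
    stack: List[str] = []
    misses = 0
    for block, mru_hint in trace:
        if block in stack:
            # Hit: take the block out so it can be repositioned below.
            stack.remove(block)
        else:
            misses += 1
            if len(stack) >= capacity:
                # Miss on a full cache: evict the block at the eviction end.
                stack.pop(0)
        if mru_hint:
            # Hinted access: place at the eviction end (assumed semantics).
            stack.insert(0, block)
        else:
            # Unhinted access: ordinary LRU placement at the protected end.
            stack.append(block)
    return misses


if __name__ == "__main__":
    blocks = "abcd" * 3  # cyclic trace over four blocks
    plain = [(b, False) for b in blocks]       # no hints: plain LRU
    hinted = [(b, b in "cd") for b in blocks]  # hint accesses to c and d
    print(simulate(plain, 3), simulate(hinted, 3))
```

On this toy trace with a capacity of three blocks, plain LRU misses on every access, while hinting the accesses to `c` and `d` keeps `a` and `b` resident. Hinting the wrong accesses can just as easily increase misses, which matches the abstract's caution that the interface can be used counterproductively.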



    • Published in

      ACM SIGPLAN Notices, Volume 46, Issue 11
      ISMM '11
      November 2011
      135 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2076022
      • ACM Conferences
        ISMM '11: Proceedings of the international symposium on Memory management
        June 2011
        148 pages
        ISBN: 9781450302630
        DOI: 10.1145/1993478

      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States
