Fast Miss Ratio Curve Modeling for Storage Cache

Published: 12 April 2018

Abstract

The reuse distance (least recently used (LRU) stack distance) is an essential metric for performance prediction and optimization of storage caches. Over the past four decades, the algorithmic efficiency of reuse distance measurement has improved steadily. This progress has accelerated in recent years, in both theory and practical implementation.
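
To make the metric concrete, the following is a minimal Python sketch (not taken from the article) that computes LRU stack distances by direct simulation and derives a miss ratio from them. It is deliberately naive, scanning the stack on every access, which is the kind of cost the faster techniques aim to avoid. The function names `stack_distances` and `miss_ratio` are hypothetical.

```python
from collections import OrderedDict


def stack_distances(trace):
    """Return the LRU stack distance of each access (None for a first access)."""
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for key in trace:
        if key in stack:
            keys = list(stack)
            # Depth of `key` counted from the most recently used end (1-based):
            # the number of distinct items touched since its last access,
            # including `key` itself.
            distances.append(len(keys) - keys.index(key))
            stack.move_to_end(key)
        else:
            distances.append(None)  # cold (compulsory) miss
            stack[key] = True
    return distances


def miss_ratio(distances, cache_size):
    """An access misses in an LRU cache of `cache_size` items if it is cold
    or its stack distance exceeds the cache size."""
    misses = sum(1 for d in distances if d is None or d > cache_size)
    return misses / len(distances)
```

Evaluating `miss_ratio` for each cache size of interest yields one point of the miss ratio curve per call.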

In this article, we present a kinetic model of LRU cache memory based on the average eviction time (AET) of the cached data. The AET model enables fast measurement and the use of low-cost sampling. It can produce the miss ratio curve in linear time with extremely low space costs. On storage trace benchmarks, AET reduces time and space costs compared to prior techniques. Furthermore, AET is a composable model that can characterize shared cache behavior by sampling and modeling individual programs or traces.
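
The abstract does not spell out the model's equations. The sketch below assumes the commonly stated AET formulation: let P(t) be the fraction of accesses whose reuse time (the number of accesses since the previous access to the same item) exceeds t; the average eviction time AET(c) for cache size c is the point at which the accumulated sum of P(t) reaches c, and the predicted miss ratio is P(AET(c)). This version builds the full reuse-time histogram rather than sampling, and the function name `aet_miss_ratio_curve` is hypothetical, so treat it as an illustration rather than the authors' implementation.

```python
def aet_miss_ratio_curve(trace, max_cache_size):
    """Approximate the LRU miss ratio for cache sizes 1..max_cache_size
    from a reuse-time histogram (assumed AET formulation, no sampling)."""
    n = len(trace)
    if n == 0:
        raise ValueError("trace must not be empty")

    # Pass 1: reuse-time histogram. First accesses have no reuse time and
    # are counted as cold misses (treated as an infinite reuse time).
    last_seen = {}
    hist = {}
    cold = 0
    for now, key in enumerate(trace):
        if key in last_seen:
            rt = now - last_seen[key]
            hist[rt] = hist.get(rt, 0) + 1
        else:
            cold += 1
        last_seen[key] = now

    # Pass 2: accumulate P(0) + P(1) + ... and record P(t) each time the
    # running sum reaches the next integer cache size c.
    max_rt = max(hist, default=0)
    mrc = {}
    remaining = n   # number of accesses with reuse time > t (t = 0 initially)
    filled = 0.0    # running sum of P(x) for x in [0, t)
    t = 0
    c = 1
    while c <= max_cache_size and t < max_rt:
        filled += remaining / n        # add P(t)
        t += 1
        remaining -= hist.get(t, 0)    # now counts reuse times > t
        while c <= max_cache_size and filled >= c:
            mrc[c] = remaining / n     # miss ratio = P(AET(c)), AET(c) = t
            c += 1

    # Beyond the largest observed reuse time only cold misses remain.
    while c <= max_cache_size:
        mrc[c] = cold / n
        c += 1
    return mrc
```

For a cyclic trace such as `list("abc") * 100`, the sketch predicts a miss ratio of 1.0 for cache sizes 1 and 2 and only the cold misses from size 3 upward, which matches LRU behavior on that pattern. Both passes are linear in the trace length and the largest reuse time.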

Published in

ACM Transactions on Storage, Volume 14, Issue 2
May 2018
210 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3208078
Editor: Sam H. Noh

Copyright © 2018 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 12 April 2018
• Revised: 1 January 2018
• Accepted: 1 January 2018
• Received: 1 February 2017
