research-article

Penalty- and Locality-aware Memory Allocation in Redis Using Enhanced AET

Published: 28 May 2021

Abstract

Because modern web services must handle large data volumes under low-latency requirements, an in-memory key-value (KV) cache such as Redis or Memcached is often an unavoidable choice. The in-memory cache holds hot data, reduces request latency, and alleviates the load on back-end databases. Inheriting from traditional hardware cache design, many existing KV cache systems still use recency-based cache replacement algorithms, e.g., least recently used (LRU) or its approximations. However, the diversity of miss penalties distinguishes a KV cache from a hardware cache. Failing to account for penalty can substantially compromise space utilization and request service time. KV accesses also exhibit locality, which must be considered together with miss penalty to guide cache management.

In this article, we first discuss how to enhance an existing cache model, the Average Eviction Time (AET) model, so that it can model a KV cache. We then apply the model to Redis and propose pRedis (Penalty- and Locality-aware Memory Allocation in Redis), which quantitatively combines data locality and miss penalty to guide memory allocation and replacement in Redis. We also explore the diurnal behavior of a KV store and exploit long-term reuse: we replace the original passive eviction mechanism with an automatic dump/load mechanism to smooth the transition between access peaks and valleys. Our evaluation shows that pRedis effectively reduces average and tail access latency with minimal time and space overhead. For both real-world and synthetic workloads, our approach reduces latency by 14.0% to 52.3% on average relative to a state-of-the-art penalty-aware cache management scheme, Hyperbolic Caching (HC), and offers more quantitatively predictable performance. Moreover, dynamically switching policies between pRedis and HC yields even lower average latency (by 1.1% to 5.5%).
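For readers unfamiliar with AET, the following is a minimal sketch of the basic, unenhanced idea as it is commonly described: from a histogram of reuse times, let P(t) be the fraction of accesses whose reuse time exceeds t; the average eviction time T for a cache of c objects is the point where the running sum of P(t) reaches c, and the predicted LRU miss ratio is P(T). This is not the enhanced model or the pRedis implementation from the article. It assumes equal-sized objects, exact LRU, and no miss penalty, and the function names `reuse_time_histogram` and `aet_miss_ratio` are hypothetical.

```python
from collections import Counter


def reuse_time_histogram(trace):
    """Count reuse times (in accesses); first-time accesses go under key None."""
    last_seen = {}
    hist = Counter()
    for now, key in enumerate(trace):
        if key in last_seen:
            hist[now - last_seen[key]] += 1
        else:
            hist[None] += 1  # cold access: reuse time treated as infinite
        last_seen[key] = now
    return hist


def aet_miss_ratio(hist, cache_size):
    """Estimate the LRU miss ratio for a cache holding cache_size objects."""
    total = sum(hist.values())
    if total == 0 or cache_size <= 0:
        return 1.0
    max_rt = max((rt for rt in hist if rt is not None), default=0)
    greater = total  # number of accesses with reuse time > t
    filled = 0.0     # running sum of P(0..t)
    for t in range(max_rt + 1):
        greater -= hist.get(t, 0)
        filled += greater / total  # add P(t)
        if filled >= cache_size:
            # Average eviction time is t + 1; the miss ratio is P(t + 1).
            return (greater - hist.get(t + 1, 0)) / total
    # Cache holds the whole working set: only cold misses remain.
    return hist.get(None, 0) / total


# Cyclic trace over three keys: every reuse has reuse time 3.
hist = reuse_time_histogram("abc" * 100)
print(aet_miss_ratio(hist, 2))  # 1.0: LRU with 2 slots always misses
print(aet_miss_ratio(hist, 3))  # 0.01: only the 3 cold misses out of 300
```

The cyclic trace is a deliberately simple case in which the estimate matches exact LRU behavior; on real workloads the model gives an approximation. A penalty-aware variant would additionally need per-key miss cost, which this sketch omits.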



Published in

ACM Transactions on Storage, Volume 17, Issue 2
May 2021
202 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3465461
Editor: Sam H. Noh

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 28 May 2021
• Accepted: 1 January 2021
• Revised: 1 November 2020
• Received: 1 April 2020

Qualifiers

• research-article
• Refereed
