Penalty- and Locality-aware Memory Allocation in Redis Using Enhanced AET

Abstract
Due to the large data volumes and low-latency requirements of modern web services, an in-memory key-value (KV) cache (e.g., Redis or Memcached) often becomes an inevitable choice. The in-memory cache holds hot data, reduces request latency, and alleviates the load on backend databases. Inheriting from traditional hardware cache design, many existing KV cache systems still use recency-based replacement algorithms, e.g., least recently used (LRU) or its approximations. However, the diversity of miss penalties distinguishes a KV cache from a hardware cache: the cost of re-fetching or regenerating a missed object from the backend varies widely across keys. Inadequate consideration of penalty can substantially compromise space utilization and request service time. KV accesses also exhibit locality, which must be coordinated with miss penalty to guide cache management.
In this article, we first discuss how to enhance an existing cache model, the Average Eviction Time (AET) model, so that it can model a KV cache. We then apply the model to Redis and propose pRedis, Penalty- and Locality-aware Memory Allocation in Redis, which synthesizes data locality and miss penalty in a quantitative manner to guide memory allocation and replacement in Redis. We also explore the diurnal behavior of a KV store and exploit long-term reuse, replacing the original passive eviction mechanism with an automatic dump/load mechanism to smooth the transition between access peaks and valleys. Our evaluation shows that pRedis effectively reduces average and tail access latency with minimal time and space overhead. For both real-world and synthetic workloads, our approach delivers an average latency reduction of 14.0%∼52.3% over a state-of-the-art penalty-aware cache management scheme, Hyperbolic Caching (HC), and offers more quantitative predictability of performance. Moreover, dynamically switching policies between pRedis and HC yields a further 1.1%∼5.5% reduction in average latency.
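To make the penalty-aware idea concrete, the sketch below shows a minimal cache whose eviction victim is the entry with the lowest ratio of miss penalty to time since last access, so a rarely used but expensive-to-regenerate key can outlive a recently used cheap one. This is an illustrative toy under assumed names (`PenaltyAwareCache`, a logical clock, a linear victim scan), not the pRedis algorithm or the AET model described in the article.

```python
class PenaltyAwareCache:
    """Toy cache illustrating penalty-weighted eviction.

    Victim choice: minimize miss_penalty / (now - last_access),
    i.e., prefer to evict stale entries that are cheap to re-fetch.
    A linear scan is used for clarity; a real system would keep a
    priority structure or sample candidates, as Redis's LRU does.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}         # key -> value
        self.penalty = {}      # key -> cost of re-fetching on a miss
        self.last_access = {}  # key -> logical timestamp
        self.clock = 0         # logical clock instead of wall time

    def _tick(self):
        self.clock += 1
        return self.clock

    def get(self, key):
        if key in self.data:
            self.last_access[key] = self._tick()
            return self.data[key]
        return None  # miss: caller pays the penalty at the backend

    def put(self, key, value, miss_penalty):
        if key not in self.data and len(self.data) >= self.capacity:
            now = self._tick()
            # Evict the entry with the lowest penalty-per-unit-staleness.
            victim = min(
                self.data,
                key=lambda k: self.penalty[k] / (now - self.last_access[k]),
            )
            for table in (self.data, self.penalty, self.last_access):
                del table[victim]
        self.data[key] = value
        self.penalty[key] = miss_penalty
        self.last_access[key] = self._tick()
```

Note the contrast with pure LRU: in the usage below, key `a` is touched more recently than `b`, yet `a` is evicted because `b` carries a 100x higher miss penalty.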