research-article

Cache What You Need to Cache: Reducing Write Traffic in Cloud Cache via “One-Time-Access-Exclusion” Policy

Published: 16 July 2020

Abstract

SSDs play an important role in caching systems thanks to their high performance-to-cost ratio. Because the cache is typically smaller than the backend storage by an order of magnitude or more, the write density of the SSD cache (defined as writes per unit time and space) is much higher than that of the HDD storage, which poses a serious challenge to the SSD’s lifetime. Meanwhile, under social network workloads, many of the writes to the SSD cache are unnecessary. For example, our study of Tencent’s photo caching shows that about 61% of all photos are accessed only once, yet they are still swapped in and out of the cache. Therefore, if we can proactively predict such photos and prevent them from entering the cache, we can eliminate unnecessary SSD cache writes and improve cache space utilization.
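
The write-density argument can be made concrete with a short worked ratio. The symbols W, T, and S are introduced here for illustration only, and the equal-write-volume case is an assumption rather than a measured result.

```latex
% Write density as defined in the abstract: writes per unit time and space.
% W = number of writes, T = observation time, S = device capacity (symbols assumed).
\rho = \frac{W}{T \cdot S}

% Assumption for illustration: cache and backend absorb the same W over the same T,
% and the cache is ten times smaller, S_{\text{cache}} = S_{\text{backend}} / 10.
\frac{\rho_{\text{cache}}}{\rho_{\text{backend}}}
  = \frac{W / (T \cdot S_{\text{cache}})}{W / (T \cdot S_{\text{backend}})}
  = \frac{S_{\text{backend}}}{S_{\text{cache}}}
  = 10
```

Under that assumption, an order-of-magnitude gap in capacity translates directly into an order-of-magnitude gap in write density.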

To cope with this challenge, we put forward a “one-time-access criteria” that is applied to the cache space and further propose a “one-time-access-exclusion” policy. Based on these two techniques, we design a prediction-based classifier to facilitate the policy. Unlike state-of-the-art history-based predictions, our prediction is not history oriented, which makes good prediction accuracy challenging to achieve. To address this issue, we integrate a decision tree into the classifier, extract social-related information as classification features, and apply cost-sensitive learning to improve classification precision. With these techniques, we attain a prediction accuracy greater than 80%. Experimental results show that the one-time-access-exclusion approach yields strong cache performance in most aspects. Taking LRU as an example, applying our approach improves the hit rate by 4.4%, decreases cache writes by 56.8%, and cuts the average access latency by 5.5%.
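
The exclusion idea can be illustrated with a minimal sketch of an LRU cache guarded by an admission predicate. This is not the authors' implementation: the class `AdmissionLRU`, the helper `replay`, the toy trace, and the name-based `predicted_one_time` predictor (standing in for the decision-tree classifier) are all hypothetical and chosen only to show how refusing predicted one-time objects affects hits and cache writes.

```python
from collections import OrderedDict
from typing import Callable, Hashable, Iterable


class AdmissionLRU:
    """LRU cache that consults an admission predicate before writing on a miss."""

    def __init__(self, capacity: int, admit: Callable[[Hashable], bool]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._admit = admit
        self._items: "OrderedDict[Hashable, bool]" = OrderedDict()
        self.hits = 0
        self.writes = 0  # number of objects written into the cache

    def access(self, key: Hashable) -> bool:
        """Return True on a hit; on a miss, insert only if the predicate admits."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        if self._admit(key):
            if len(self._items) >= self.capacity:
                self._items.popitem(last=False)  # evict least recently used
            self._items[key] = True
            self.writes += 1
        return False


def replay(cache: AdmissionLRU, trace: Iterable[Hashable]) -> AdmissionLRU:
    """Feed every request in the trace to the cache and return the cache."""
    for key in trace:
        cache.access(key)
    return cache


def predicted_one_time(key: Hashable) -> bool:
    """Hypothetical predictor: a perfect oracle keyed on the toy object names."""
    return str(key).startswith("once-")


# Toy trace (assumed): two reused objects interleaved with one-time objects.
TRACE = ["hot-0", "once-0", "hot-1", "once-1", "hot-0",
         "once-2", "hot-1", "once-3", "hot-0", "hot-1"]

plain = replay(AdmissionLRU(2, lambda key: True), TRACE)
filtered = replay(AdmissionLRU(2, lambda key: not predicted_one_time(key)), TRACE)

print("plain LRU:    hits =", plain.hits, "writes =", plain.writes)
print("with filter:  hits =", filtered.hits, "writes =", filtered.writes)
```

On this toy trace the one-time objects keep evicting the reused ones in the plain cache, so the filtered cache both writes less and hits more. A real predictor is imperfect, so the gains depend on its accuracy and on how wrongly excluded objects are handled.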



    • Published in

      ACM Transactions on Storage, Volume 16, Issue 3
      August 2020
      150 pages
      ISSN: 1553-3077
      EISSN: 1553-3093
      DOI: 10.1145/3410885
      • Editor: Sam H. Noh

      Copyright © 2020 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 16 July 2020
      • Online AM: 7 May 2020
      • Accepted: 1 April 2020
      • Revised: 1 February 2020
      • Received: 1 June 2019


      Qualifiers

      • research-article
      • Research
      • Refereed
