Abstract
Successfully integrating cloud storage as a primary storage layer in the I/O stack is highly challenging, essentially because of two inherent issues: the high and variable latency of cloud I/O and the per-I/O pricing model of cloud storage. Caching is a crucial technology for minimizing the latency and monetary cost associated with cloud I/Os, as it directly influences how frequently the client has to communicate with the cloud. Unfortunately, current cloud caching schemes are mostly designed with miss reduction as the sole objective and focus only on improving system performance, ignoring the fact that different cache misses can have completely different effects in terms of latency and monetary cost.
In this article, we present a cost-aware caching scheme, called GDS-LC, which is highly optimized for cloud storage caching. Unlike traditional caching schemes that merely focus on improving cache hit ratios, and unlike the classic cost-aware schemes that can achieve only a single optimization target, GDS-LC offers a comprehensive cache design that considers not only access locality but also object size, associated latency, and price. It aims to enhance the user experience with cloud storage in two respects: access latency and monetary cost. To achieve this, GDS-LC virtually partitions the cache space into two regions: a high-priority latency-aware region and a low-priority price-aware region. Each region is managed by a cost-aware caching scheme that is based on GreedyDual-Size (GDS) and adapted to the cloud storage scenario through clean-dirty differentiation and latency normalization. The GDS-LC framework is highly flexible, and we present a further enhanced algorithm, called GDS-LCF, which incorporates access frequency into caching decisions. We have built a prototype to emulate a typical cloud client cache and evaluate GDS-LC and GDS-LCF with Amazon Simple Storage Service (S3) in three different scenarios: local cloud, Internet cloud, and heterogeneous cloud. Our experimental results show that our caching schemes can effectively achieve both optimization goals: low access latency and low monetary cost. It is our hope that this work can inspire the community to reconsider cache design in the cloud environment, especially for the purpose of integrating cloud storage into the current storage stack as a primary layer.
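For readers unfamiliar with GreedyDual-Size, the following is a minimal Python sketch of the plain GDS replacement policy on which each GDS-LC region is said to be based. It is not the GDS-LC algorithm itself: the two-region partitioning, clean-dirty differentiation, and latency normalization are not detailed in the abstract and are not attempted here. The class name, method names, and the choice of `cost` values are hypothetical, chosen only for illustration.

```python
import heapq
import itertools


class GreedyDualSizeCache:
    """Plain GreedyDual-Size (illustrative sketch, not GDS-LC).

    Each cached object carries a priority H = L + cost / size, where L is a
    running inflation value. The object with the smallest H is evicted, and
    L is then raised to that H, which ages objects that are not re-accessed.
    """

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0              # the "L" value of GDS
        self.entries = {}                 # key -> (version, size, cost)
        self.heap = []                    # (h_value, version, key); may hold stale items
        self._versions = itertools.count()

    def _push(self, key, size, cost):
        # Assign H using the current L and record a fresh version number.
        version = next(self._versions)
        self.entries[key] = (version, size, cost)
        heapq.heappush(self.heap, (self.inflation + cost / size, version, key))

    def _evict_one(self):
        while self.heap:
            h_value, version, key = heapq.heappop(self.heap)
            entry = self.entries.get(key)
            if entry is not None and entry[0] == version:  # skip stale heap items
                del self.entries[key]
                self.used -= entry[1]
                self.inflation = h_value                   # L rises to the evicted H
                return key
        return None

    def access(self, key, size, cost):
        """Return True on a hit; on a miss, admit the object and return False."""
        if key in self.entries:
            _, old_size, old_cost = self.entries[key]
            self._push(key, old_size, old_cost)            # refresh H with the current L
            return True
        if size > self.capacity:
            return False                                   # too large to cache at all
        while self.used + size > self.capacity:
            self._evict_one()
        self.used += size
        self._push(key, size, cost)
        return False
```

In this sketch, `cost` is whatever miss penalty the caller supplies. Passing a measured fetch latency would make the policy latency-aware, and passing a per-request price would make it price-aware. How GDS-LC combines the two across its regions is described in the article body, not here.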
GDS-LC: A Latency- and Cost-Aware Client Caching Scheme for Cloud Storage