Abstract
Cloud storage has gained increasing popularity in the past few years. In cloud storage, data is stored in the service provider’s data centers, and users access it over the network. Under this new storage model, our prior wisdom about conventional storage may be neither valid nor applicable. In this article, we present a comprehensive study to gain insight into the unique characteristics of cloud storage and to optimize user experiences with cloud storage from a client’s perspective. Unlike prior measurement work, which mostly aims to characterize cloud storage providers or specific client applications, we focus on analyzing the effects of various client-side factors on user-experienced performance. Through extensive experiments and quantitative analysis, we have obtained several important findings. For example, we find that (1) a proper combination of parallelism and request size can achieve optimized bandwidths, (2) a client’s capabilities and geographical location play an important role in determining the end-to-end user-perceivable performance, and (3) interference among mixed cloud storage requests may cause performance degradation. Based on our findings, we showcase a sampling- and inference-based method to determine a proper combination of parallelism and request size for different optimization goals. We further present a set of case studies on client-side chunking and parallelization for typical cloud-based applications. Our studies show that specific attention should be paid to fully exploiting both the capabilities of clients and the great potential of cloud storage services.
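As a rough illustration of the ideas named in the abstract (client-side chunking, parallel requests, and sampling a few combinations of request size and parallelism), the following Python sketch may help. It is not the authors' method: the function names, the grid of candidate values, and `fake_send_chunk` (a hypothetical stand-in for a real cloud PUT request) are all assumptions made for illustration, and the inference step of the paper's approach is not modeled here.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def split_into_chunks(data, chunk_size):
    """Split data into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def parallel_transfer(data, chunk_size, parallelism, send_chunk):
    """Send every chunk using `parallelism` worker threads.

    send_chunk is any callable that takes one chunk and returns the
    number of bytes it transferred. Returns the total bytes sent.
    """
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return sum(pool.map(send_chunk, chunks))


def sample_combinations(sample, chunk_sizes, parallelisms, send_chunk):
    """Measure bandwidth (bytes/s) for each (chunk_size, parallelism) pair."""
    results = {}
    for chunk_size, parallelism in product(chunk_sizes, parallelisms):
        start = time.perf_counter()
        sent = parallel_transfer(sample, chunk_size, parallelism, send_chunk)
        # Guard against a zero interval on very fast callables.
        elapsed = max(time.perf_counter() - start, 1e-9)
        results[(chunk_size, parallelism)] = sent / elapsed
    return results


def fake_send_chunk(chunk):
    """Hypothetical cloud PUT: fixed per-request latency plus per-byte cost."""
    time.sleep(0.002 + len(chunk) * 1e-8)
    return len(chunk)


if __name__ == "__main__":
    sample = bytes(256 * 1024)  # small sample payload, assumed size
    results = sample_combinations(
        sample, [16 * 1024, 64 * 1024], [1, 4], fake_send_chunk
    )
    best = max(results, key=results.get)
    print("best (chunk_size, parallelism):", best)
```

In a real client, `fake_send_chunk` would be replaced by an actual upload call, and the measured bandwidths would depend on the client's capabilities and location, as the abstract notes.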
Index Terms
Understanding I/O Performance Behaviors of Cloud Storage from a Client’s Perspective