Abstract
Cloud block storage systems support diverse types of applications in modern cloud services. Characterizing their input/output (I/O) activities is critical for guiding better system designs and optimizations. In this article, we present an in-depth comparative analysis of production cloud block storage workloads, based on block-level I/O traces of billions of I/O requests collected from two production systems, Alibaba Cloud and Tencent Cloud Block Storage. We characterize their load intensities, spatial patterns, and temporal patterns. We also compare the cloud block storage workloads with the well-known public block-level I/O workloads from the enterprise data centers at Microsoft Research Cambridge, and we identify the commonalities and differences among the three sources of traces. In total, we provide 6 findings through the high-level analysis and 16 findings through the detailed analysis of load intensities, spatial patterns, and temporal patterns. We discuss the implications of our findings for load balancing, cache efficiency, and storage cluster management in cloud block storage systems.
An In-depth Comparative Analysis of Cloud Block Storage Workloads: Findings and Implications