Abstract
Storage systems are designed and optimized using wisdom derived from analysis studies of file-system and block-level workloads. However, while SSDs are becoming a dominant building block in many storage systems, their design continues to build on knowledge derived from analysis targeted at hard disk optimization. Though still valuable, this knowledge does not cover important aspects relevant to SSD performance. In a sense, we are “searching under the streetlight,” possibly missing important opportunities for optimizing storage system design.
We present the first I/O workload analysis designed with SSDs in mind. We characterize traces from four repositories and examine their “temperature” ranges, sensitivity to page size, and “logical locality.” We then take the first step towards correlating these characteristics with three standard performance metrics: write amplification, read amplification, and flash read costs. Our results show that SSD-specific characteristics strongly affect performance, often in surprising ways.
SSD-based Workload Characteristics and Their Performance Implications