Abstract
Flash-based solid state drives (SSDs) have gained a central role in the infrastructure of large-scale datacenters, as well as in commodity servers and personal devices. The main limitation of flash media is its inability to support update-in-place: after data has been written to a physical location, that location has to be erased before new data can be written to it. Moreover, SSDs support read and write operations at the granularity of pages, while erasures are performed on entire blocks, which often contain hundreds of pages. When a block is erased, any valid data it stores must first be rewritten to a clean location. Because an SSD wears out as the number of erasures grows, the efficiency of the management algorithm has a significant impact on its endurance. In this paper we first formally define the SSD management problem. We then explore this problem from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm that, given any input, performs a negligible number of rewrites (relative to the input length). We also discuss the hardness of the offline problem. In the online setting, we first consider algorithms that have no prior knowledge about the input. We prove that no deterministic algorithm outperforms the greedy algorithm in this setting, and discuss the possible benefit of randomization. We then augment our model, assuming that each request for a page arrives with a prediction of the next time the page is updated. We design an online algorithm that uses such predictions, and show that its performance improves as the prediction error decreases. We also show that the performance of our algorithm is never worse than that guaranteed by the greedy algorithm, even when the prediction error is large. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit improved performance for a wide range of input traces.
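The page/block constraints described in the abstract can be illustrated with a toy simulator. This is a minimal sketch under stated assumptions, not the paper's formal model or any of its algorithms: the class `GreedySSD`, its method names, and the counters are hypothetical, and "greedy" is assumed here to mean the common policy of erasing the block that currently holds the fewest valid pages. Every update is written out of place, and the valid pages of an erased block are rewritten, which is the cost the abstract refers to.

```python
class GreedySSD:
    """Toy SSD: page-granularity writes, block-granularity erasures (illustrative only)."""

    def __init__(self, num_blocks, pages_per_block):
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        # blocks[b] lists the logical pages written to block b, in slot order;
        # None marks a slot whose data has been invalidated by a later update.
        self.blocks = [[] for _ in range(num_blocks)]
        self.location = {}      # logical page -> (block, slot) of its valid copy
        self.active = 0         # block currently receiving writes
        self.host_writes = 0    # writes requested by the input
        self.rewrites = 0       # valid pages copied during garbage collection
        self.erasures = 0

    def _valid(self, b):
        return sum(p is not None for p in self.blocks[b])

    def _place(self, page):
        blk = self.blocks[self.active]
        self.location[page] = (self.active, len(blk))
        blk.append(page)

    def _open_block(self):
        # Prefer a clean (never written or already erased) block.
        for b in range(self.num_blocks):
            if not self.blocks[b]:
                self.active = b
                return
        # Assumed greedy policy: erase the block with the fewest valid pages.
        victim = min(range(self.num_blocks), key=self._valid)
        survivors = [p for p in self.blocks[victim] if p is not None]
        if len(survivors) == self.pages_per_block:
            raise RuntimeError("no reclaimable space: too many logical pages")
        self.blocks[victim] = []            # erase the whole block
        self.erasures += 1
        self.active = victim
        for p in survivors:                 # valid data must be rewritten
            self._place(p)
            self.rewrites += 1

    def write(self, page):
        # No update-in-place: invalidate the old copy and write elsewhere.
        if page in self.location:
            b, s = self.location[page]
            self.blocks[b][s] = None
        self.host_writes += 1
        if len(self.blocks[self.active]) == self.pages_per_block:
            self._open_block()
        self._place(page)

    def write_amplification(self):
        return (self.host_writes + self.rewrites) / self.host_writes


if __name__ == "__main__":
    ssd = GreedySSD(num_blocks=4, pages_per_block=4)
    for _ in range(3):
        for page in range(12):              # sequential overwrites of 12 logical pages
            ssd.write(page)
    print(ssd.write_amplification())        # prints 1.0: every victim block is fully invalid
```

In this sketch, sequential overwrites invalidate whole blocks before they are erased, so no rewrites occur. A workload that repeatedly updates a few pages spread across different blocks leaves valid data in every candidate victim and forces rewrites.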
Offline and Online Algorithms for SSD Management
SIGMETRICS/PERFORMANCE '22: Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems