Abstract
High-density Solid State Drives, such as QLC drives, offer increased storage capacity but endure an order of magnitude fewer Program and Erase (P/E) cycles, which limits their endurance and hence their usability. We present the design and implementation of non-binary, Voltage-Based Write-Once-Memory (WOM-v) Codes to improve the lifetime of QLC drives. First, we develop a FEMU-based simulator test-bed to evaluate the gains of WOM-v codes on real-world workloads. Second, we propose and implement two optimizations, an efficient garbage collection mechanism and an encoding optimization, to drastically improve WOM-v code endurance without compromising performance. Third, we propose analytical approaches to estimate the endurance gains under WOM-v codes. We analyze the Greedy garbage collection technique with a uniform page access distribution and the Least Recently Written (LRW) garbage collection technique with a skewed page access distribution in the context of WOM-v codes. We find that although both approaches overestimate the number of required erase operations, the model based on Greedy garbage collection with a uniform page access distribution provides tighter bounds. A careful evaluation, including microbenchmarks and trace-driven evaluation, demonstrates that WOM-v codes can reduce erase cycles for QLC drives by 4.4×–11.1× for real-world workloads with minimal performance overheads, resulting in improved QLC SSD lifetime.
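To make the idea of a voltage-based write-once code concrete, the following minimal Python sketch shows how a 16-level QLC cell could absorb several overwrites before an erase if each write only ever raises the voltage level. The mapping used here (a level decodes to `level mod 2**k`) and the helper names `decode`, `next_level`, and `writes_before_erase` are assumptions chosen for illustration; they are not the encoding defined by the authors.

```python
# Illustrative sketch only: the level-to-value mapping (level mod 2**k) is an
# assumption made for this example, not the encoding defined in the paper.

QLC_LEVELS = 16  # a QLC cell distinguishes 16 voltage levels (4 bits)


def decode(level: int, k: int) -> int:
    """Read the k-bit value stored at a voltage level (assumed mapping)."""
    return level % (1 << k)


def next_level(level: int, value: int, k: int):
    """Return the lowest level >= `level` that decodes to `value`.

    Voltage can only increase between erases, so if no such level exists
    below QLC_LEVELS the function returns None (the cell needs an erase).
    """
    step = (value - level) % (1 << k)  # distance to the next matching level
    target = level + step
    return target if target < QLC_LEVELS else None


def writes_before_erase(values, k: int) -> int:
    """Count how many writes from `values` one cell absorbs before an erase."""
    level, count = 0, 0
    for v in values:
        target = next_level(level, v, k)
        if target is None:
            break
        level, count = target, count + 1
    return count
```

Under this assumed mapping, choosing a smaller `k` stores fewer data bits per cell but leaves more unused levels above the current one, so a cell can typically be overwritten more times between erases. That trade of capacity for fewer erase operations is the general intuition behind write-once-memory codes.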
Improving the Endurance of Next Generation SSD’s using WOM-v Codes