research-article

Management of Next-Generation NAND Flash to Achieve Enterprise-Level Endurance and Latency Targets

Published: 04 December 2018

Abstract

Despite its widespread use in consumer devices and enterprise storage systems, NAND flash faces a growing number of challenges. While technology advances have helped to increase storage density and reduce costs, they have also led to reduced endurance and larger block-to-block variations. These effects cannot be compensated for solely by stronger ECC or read-retry schemes; they have to be addressed holistically.

Our goal is to make low-cost NAND flash viable in enterprise storage. We present novel flash-management approaches that reduce write amplification, achieve better wear leveling, and enhance endurance without sacrificing performance. We introduce block calibration, a technique that determines the read-threshold voltage levels that minimize error rates, together with novel garbage-collection and data-placement schemes that alleviate the effects of block-health variability. We show how these techniques complement one another and thereby meet enterprise storage requirements.

By combining the proposed schemes, and without resorting to a stronger ECC scheme, we improve endurance by up to 15× over the baseline endurance of the NAND flash. The flash-management algorithms presented herein were designed and implemented in simulators, hardware test platforms, and eventually in the flash controllers of production enterprise all-flash arrays. Their effectiveness has been validated across thousands of customer deployments since 2015.
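
The abstract describes block calibration only at a high level. As a rough illustration of the underlying idea, and not the authors' algorithm, the sketch below models two adjacent cell states as Gaussian threshold-voltage distributions and sweeps candidate read thresholds to find the one with the lowest raw error rate. The Gaussian model, the voltage values, and all function names are assumptions made for this example.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def raw_error_rate(threshold, low, high):
    """Probability of misreading a cell, for two equally likely states.

    `low` and `high` are hypothetical (mean, std) pairs describing the
    threshold-voltage distributions of two adjacent programmed states.
    """
    p_low_misread = 1.0 - gaussian_cdf(threshold, *low)   # low state read as high
    p_high_misread = gaussian_cdf(threshold, *high)       # high state read as low
    return 0.5 * (p_low_misread + p_high_misread)


def calibrate_threshold(low, high, candidates):
    """Return the candidate read threshold with the lowest raw error rate."""
    return min(candidates, key=lambda t: raw_error_rate(t, low, high))


# Illustrative numbers only (arbitrary voltage units, not measured data).
fresh_low, fresh_high = (1.0, 0.2), (3.0, 0.2)
# Assumed wear effect: the upper state has shifted down and widened.
worn_low, worn_high = (1.0, 0.25), (2.4, 0.3)

candidates = [i / 100 for i in range(100, 301)]  # sweep 1.00 .. 3.00

default_threshold = calibrate_threshold(fresh_low, fresh_high, candidates)
calibrated_threshold = calibrate_threshold(worn_low, worn_high, candidates)

error_with_default = raw_error_rate(default_threshold, worn_low, worn_high)
error_with_calibrated = raw_error_rate(calibrated_threshold, worn_low, worn_high)

print(f"default threshold:    {default_threshold:.2f}")
print(f"calibrated threshold: {calibrated_threshold:.2f}")
print(f"error on worn block with default threshold:    {error_with_default:.4f}")
print(f"error on worn block with calibrated threshold: {error_with_calibrated:.4f}")
```

In this toy model, the threshold tuned for a fresh block gives a noticeably higher error rate once the distributions have shifted, while re-sweeping the threshold on the worn block recovers most of the margin. A real controller would presumably measure error counts from the ECC decoder rather than evaluate a closed-form model.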
