Abstract
As solid-state drives (SSDs) increasingly replace hard disk drives, the reliability of storage systems depends on the failure modes of SSDs and on the ability of the file system layered on top to handle them. While the classic paper on IRON File Systems provides a thorough study of the failure policies of three file systems common at the time, we argue that, 13 years later, it is time to revisit file system reliability with SSDs and their reliability characteristics in mind, using modern file systems that incorporate journaling, copy-on-write, and log-structured approaches and are optimized for flash. This article presents a detailed study spanning ext4, Btrfs, and F2FS and covering a number of different SSD error modes. We develop our own fault injection framework and explore over 1,000 error cases. Our results indicate that 16% of these cases result in a file system that cannot be mounted or even repaired by its file system checker. We also identify the key file system metadata structures that can cause such failures and, finally, recommend design guidelines for file systems deployed on top of SSDs.
The Reliability of Modern File Systems in the face of SSD Errors