Abstract
Data corruption is the most common consequence of file-system bugs. When such corruption occurs, offline check and recovery tools must be used, but they are error prone and cause significant downtime. Previously we showed that a runtime checker for the Ext3 file system can verify that metadata updates are consistent, helping detect corruption in metadata blocks at transaction commit time. However, corruption can still occur when a bug in the file system’s transactional mechanism loses, misdirects, or corrupts writes. We show that a runtime checker must enforce the atomicity and durability properties of the file system on every write, in addition to checking transactions at commit time, to provide the strong guarantee that every block write will maintain file system consistency.
We identify the invariants that need to be enforced on journaling and shadow paging file systems to preserve the integrity of committed transactions. We also describe the key properties that make it feasible to check these invariants for a file system. Based on this characterization, we have implemented runtime checkers for Ext3 and Btrfs. Our evaluation shows that both checkers detect data corruption effectively, and they can be used during normal operation with low overhead.
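To make the idea of write-time checking concrete, the following is a minimal sketch of a checker for a toy journaling model. It is not the authors' Ext3 or Btrfs checker. The class `JournalChecker`, its method names, and the two invariants it enforces are assumptions chosen for illustration: a metadata block may be written in place only with content that matches its committed journal copy, and journal space may be released only after every committed block has been checkpointed.

```python
class InvariantViolation(Exception):
    """Raised when a block write would break an assumed invariant."""


class JournalChecker:
    """Toy write-time checker for a journaling file system (illustrative only)."""

    def __init__(self):
        self.pending = {}       # txid -> {block_no: data}, logged but not committed
        self.committed = {}     # block_no -> data of the latest committed copy
        self.unflushed = set()  # committed blocks not yet written in place

    def journal_write(self, txid, block_no, data):
        # Record a metadata block logged to the journal under transaction txid.
        self.pending.setdefault(txid, {})[block_no] = data

    def commit(self, txid):
        # After commit, the logged copies become the expected in-place content.
        for block_no, data in self.pending.pop(txid, {}).items():
            self.committed[block_no] = data
            self.unflushed.add(block_no)

    def in_place_write(self, block_no, data):
        # Assumed atomicity invariant: only committed content may reach
        # a metadata block's final location.
        if block_no not in self.committed:
            raise InvariantViolation(f"block {block_no} written before commit")
        if self.committed[block_no] != data:
            raise InvariantViolation(f"block {block_no} differs from committed copy")
        self.unflushed.discard(block_no)

    def release_journal(self):
        # Assumed durability invariant: journal space may be reused only
        # once every committed block has been checkpointed.
        if self.unflushed:
            raise InvariantViolation(f"unflushed blocks: {sorted(self.unflushed)}")
```

In this model, a lost write shows up when `release_journal` is called with blocks still unflushed, and a misdirected or corrupted write shows up in `in_place_write`. A real checker would also have to interpret on-disk formats and distinguish metadata from data blocks, which this sketch ignores.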