Abstract
Many file-system applications, such as defragmentation tools, file-system checkers, and data recovery tools, operate at the storage layer. Today, developers of these file-system aware storage applications need detailed knowledge of the file-system format, which takes significant time to learn, often by trial and error, because the format is insufficiently documented or specified. Furthermore, these applications process file-system metadata in an ad hoc manner, which leads to bugs and vulnerabilities.
We propose Spiffy, an annotation language for specifying the on-disk format of a file system. File-system developers annotate the data structures of a file system, and we use these annotations to generate a library that allows identifying, parsing, and traversing file-system metadata, providing support for both offline and online storage applications. This approach simplifies the development of storage applications that work across different file systems because it reduces the amount of file-system-specific code that needs to be written.
We have written annotations for the Linux Ext4, Btrfs, and F2FS file systems, and developed several applications for these file systems, including a type-specific metadata corruptor, a file-system converter, an online storage layer cache that preferentially caches files for certain users, and a runtime file-system checker. Our experiments show that applications built with the Spiffy library for accessing file-system metadata can achieve good performance and are robust against file-system corruption errors.
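The idea of driving parsing and traversal from annotated data structures can be illustrated with a small sketch. This is not Spiffy's annotation syntax or its generated library. The `Field` class, the `points_to` annotation, the `LAYOUTS` table, and the toy two-structure "file system" are all hypothetical names invented for this illustration. The sketch only shows how a declarative description of on-disk fields, including which fields are pointers and what type they point to, lets one generic routine parse and walk metadata without format-specific code.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Tuple

BLOCK_SIZE = 64  # toy block size, chosen only for this example


@dataclass(frozen=True)
class Field:
    """One on-disk field. `points_to` is a hypothetical pointer annotation:
    it names the metadata type stored in the block this field refers to."""
    name: str
    fmt: str                       # struct format code, e.g. "I" = uint32
    points_to: Optional[str] = None


# Hypothetical layouts for a toy format (not Ext4, Btrfs, or F2FS).
LAYOUTS: Dict[str, List[Field]] = {
    "superblock": [
        Field("magic", "I"),
        Field("block_size", "I"),
        Field("root_inode_block", "I", points_to="inode"),
    ],
    "inode": [
        Field("size", "I"),
        Field("data_block", "I"),  # points to file data, so not traversed
    ],
}


def parse(type_name: str, image: bytes, block_no: int) -> Dict[str, int]:
    """Decode one structure of the given type from a block of the image."""
    fields = LAYOUTS[type_name]
    fmt = "<" + "".join(f.fmt for f in fields)  # little-endian, no padding
    values = struct.unpack_from(fmt, image, block_no * BLOCK_SIZE)
    return dict(zip((f.name for f in fields), values))


def traverse(type_name: str, image: bytes,
             block_no: int) -> Iterator[Tuple[str, int, Dict[str, int]]]:
    """Yield (type, block number, parsed fields) for all reachable metadata,
    following only fields annotated as pointers."""
    seen = set()
    stack = [(type_name, block_no)]
    while stack:
        tname, bno = stack.pop()
        if (tname, bno) in seen:
            continue                       # guard against pointer cycles
        if (bno + 1) * BLOCK_SIZE > len(image):
            continue                       # skip out-of-range (corrupt) pointers
        seen.add((tname, bno))
        record = parse(tname, image, bno)
        yield tname, bno, record
        for f in LAYOUTS[tname]:
            if f.points_to is not None:
                stack.append((f.points_to, record[f.name]))


def build_toy_image(root_inode_block: int = 2) -> bytes:
    """Create a 4-block image: superblock in block 0, inode in block 2."""
    image = bytearray(BLOCK_SIZE * 4)
    struct.pack_into("<III", image, 0, 0x53504659, BLOCK_SIZE, root_inode_block)
    struct.pack_into("<II", image, 2 * BLOCK_SIZE, 4096, 3)
    return bytes(image)


if __name__ == "__main__":
    for tname, bno, record in traverse("superblock", build_toy_image(), 0):
        print(tname, bno, record)
```

In this sketch, supporting another format would mean adding entries to `LAYOUTS` rather than writing new parsing code, which is the kind of reduction in format-specific code the abstract describes. The bounds check in `traverse` is a simplified stand-in for the robustness against corrupted pointers that the abstract mentions.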