Abstract
File-system snapshots have been a key component of enterprise storage management since their inception. Creating and managing them efficiently, while maintaining flexibility and low overhead, has been a constant struggle. Although the current state-of-the-art mechanism, hierarchical reference counting, performs reasonably well for traditional small-file workloads, these workloads are increasingly vanishing from the enterprise data center, replaced by virtual machine and database workloads. These workloads center on a few very large files, violating the assumptions that allow hierarchical reference counting to operate efficiently. To better cope with these workloads, we introduce Generational Chain Trees (GCTrees), a novel method of space management that tracks block lineage across snapshots rather than maintaining explicit reference counts. As a proof of concept, we create a prototype file system, gcext4, a modified version of ext4 that uses GCTrees as the basis for snapshots and copy-on-write. In evaluating this prototype empirically, we find that although GCTrees incur somewhat higher overhead for traditional workloads, they have dramatically lower overhead than hierarchical reference counting for large-file workloads, improving by a factor of 34 or more in some cases. Furthermore, gcext4 performs comparably to ext4 across all workloads, showing that GCTrees impose only a minor cost for their benefits.
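To illustrate the contrast the abstract draws, the sketch below models the lineage idea in miniature: taking a snapshot only bumps a generation counter, and a copy-on-write records which older block version it shadowed, rather than updating reference counts on every shared block. This is a hypothetical toy, not the paper's actual GCTree structure or the gcext4 implementation; the names `Block` and `Volume` and all method signatures are our own.

```python
# Toy model of snapshot block lineage, loosely inspired by the GCTrees idea:
# copy-on-write records descent from the shadowed block instead of keeping
# per-block reference counts. Illustrative only; not the paper's design.

class Block:
    def __init__(self, data, generation, parent=None):
        self.data = data                # block contents
        self.generation = generation    # generation that wrote this version
        self.parent = parent            # lineage link to the shadowed version

class Volume:
    def __init__(self):
        self.generation = 0             # current writable generation
        self.blocks = {}                # block number -> newest Block version

    def snapshot(self):
        """Taking a snapshot is O(1): bump the generation counter.
        No reference counts on shared blocks need to be touched."""
        self.generation += 1

    def write(self, blkno, data):
        old = self.blocks.get(blkno)
        if old is not None and old.generation < self.generation:
            # Copy-on-write: shadow the old version, remembering lineage.
            self.blocks[blkno] = Block(data, self.generation, parent=old)
        elif old is not None:
            old.data = data             # same generation: overwrite in place
        else:
            self.blocks[blkno] = Block(data, self.generation)

    def lineage(self, blkno):
        """Versions of a block, newest to oldest, via parent links."""
        chain, b = [], self.blocks.get(blkno)
        while b is not None:
            chain.append((b.generation, b.data))
            b = b.parent
        return chain
```

For example, writing block 7, snapshotting, and writing it again yields a two-link chain from `lineage(7)`: the new version in generation 1 pointing back to the snapshotted version in generation 0. A collector can then walk such chains to decide which versions a deleted snapshot still shares with its neighbors, which is the kind of work reference counting would otherwise front-load onto every write.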
Index Terms
GCTrees: Garbage Collecting Snapshots