GCTrees: Garbage Collecting Snapshots

Published: 28 January 2016

Abstract

File-system snapshots have been a key component of enterprise storage management since their inception. Creating and managing them efficiently, while maintaining flexibility and low overhead, has been a constant struggle. Although the current state-of-the-art mechanism—hierarchical reference counting—performs reasonably well for traditional small-file workloads, these workloads are increasingly vanishing from the enterprise data center, replaced instead with virtual machine and database workloads. These workloads center around a few very large files, violating the assumptions that allow hierarchical reference counting to operate efficiently. To better cope with these workloads, we introduce Generational Chain Trees (GCTrees), a novel method of space management that uses concepts of block lineage across snapshots rather than explicit reference counting. As a proof of concept, we create a prototype file system—gcext4, a modified version of ext4 that uses GCTrees as a basis for snapshots and copy-on-write. In evaluating this prototype empirically, we find that although they have a somewhat higher overhead for traditional workloads, GCTrees have dramatically lower overhead than hierarchical reference counting for large-file workloads, improving by a factor of 34 or more in some cases. Furthermore, gcext4 performs comparably to ext4 across all workloads, showing that GCTrees impose minor cost for their benefits.

Published in

ACM Transactions on Storage, Volume 12, Issue 1
Special Issue on Massive Storage Systems and Technologies (MSST 2015)
February 2016
108 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/2875132

Copyright © 2016 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 28 January 2016
• Revised: 1 December 2015
• Accepted: 1 December 2015
• Received: 1 October 2015

Qualifiers

• research-article
• Research
• Refereed