The Composite-File File System: Decoupling One-to-One Mapping of Files and Metadata for Better Performance

Published: 02 March 2020

Abstract

The design and implementation of traditional file systems typically use the one-to-one mapping of logical files to their physical metadata representations. File system optimizations generally follow this rigid mapping and miss opportunities for an entire class of optimizations.

We designed, implemented, and evaluated a composite-file file system, which allows many-to-one mappings of files to metadata. Through exploring different mapping strategies, our empirical evaluation shows up to a 27% performance improvement under web server and software development workloads, for both disks and SSDs. This result demonstrates that our approach of relaxing file-to-metadata mapping is promising.
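
As a rough illustration of what a many-to-one mapping of files to metadata could look like, the toy sketch below lets several logical files share a single metadata record. It is not the authors' implementation: the names `CompositeInode`, `ToyNamespace`, and `metadata_fetches`, the in-memory layout, and the simple fetch-counting cache are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class CompositeInode:
    """Hypothetical metadata record shared by several logical files."""
    inode_no: int
    data: bytearray = field(default_factory=bytearray)
    # logical file name -> (offset, length) within the shared data region
    subfiles: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def add(self, name: str, content: bytes) -> None:
        """Append a logical file's bytes and record where they live."""
        offset = len(self.data)
        self.data.extend(content)
        self.subfiles[name] = (offset, len(content))

    def read(self, name: str) -> bytes:
        """Return the bytes of one logical file from the shared region."""
        offset, length = self.subfiles[name]
        return bytes(self.data[offset:offset + length])


class ToyNamespace:
    """Maps many paths to one CompositeInode and counts metadata fetches."""

    def __init__(self) -> None:
        self.lookup: Dict[str, CompositeInode] = {}
        self.metadata_fetches = 0
        self._cached: Set[int] = set()  # inode numbers already in memory

    def group(self, inode: CompositeInode, files: Dict[str, bytes]) -> None:
        """Place several logical files under one shared metadata record."""
        for path, content in files.items():
            inode.add(path, content)
            self.lookup[path] = inode

    def open_and_read(self, path: str) -> bytes:
        inode = self.lookup[path]
        # Metadata is fetched once per record, not once per logical file.
        if inode.inode_no not in self._cached:
            self.metadata_fetches += 1
            self._cached.add(inode.inode_no)
        return inode.read(path)


if __name__ == "__main__":
    ns = ToyNamespace()
    ns.group(CompositeInode(inode_no=1), {
        "index.html": b"<html>",
        "style.css": b"body{}",
        "app.js": b"run()",
    })
    for path in ("index.html", "style.css", "app.js"):
        ns.open_and_read(path)
    print(ns.metadata_fetches)  # 1
```

In this sketch, three files that tend to be accessed together cost one metadata fetch instead of three. Which files to group together is a matter of mapping strategy, and the sketch does not attempt to model it.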


    • Published in

      ACM Transactions on Storage, Volume 16, Issue 1
      ATC 2019 Special Section and Regular Papers
      February 2020
      155 pages
      ISSN:1553-3077
      EISSN:1553-3093
      DOI:10.1145/3386184
      • Editor: Sam H. Noh

      Copyright © 2020 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 2 March 2020
      • Accepted: 1 October 2019
      • Revised: 1 July 2019
      • Received: 1 April 2019

      Qualifiers

      • short-paper
      • Research
      • Refereed
