Abstract
This article presents Frog, a framework for Context-Based File Systems (CBFSs) that aims to simplify the development of context-based file systems and applications. Unlike existing informed-based context-aware systems, Frog is a unifying informed-based framework that abstracts context-specific solutions as views, allowing applications to select views according to their own behavior. The framework not only eliminates the overhead of traditional context analysis but also simplifies the interaction between context-based file systems and applications. Rather than propagating data through solution-specific interfaces, applications select views in Frog by inserting view names into file path strings. With Frog in place, programmers can migrate an application from one solution to another by switching views rather than changing programming interfaces. Because the framework enforces data consistency automatically, file-system developers can focus on context-specific solutions. We implement two prototypes to demonstrate the strengths and overheads of our design. Motivated by the observation that more than 50% of the files in a file system are small (<4KB), we create the Bi-context Archiving Virtual File System (BAVFS), which uses conservative and aggressive prefetching for the random-read and sequential-read contexts, respectively. To improve the performance of random read-and-write operations, the Bi-context Hybrid Virtual File System (BHVFS) combines update-in-place and update-out-of-place solutions for read-intensive and write-intensive contexts, respectively. Our experimental results show that the benefits of Frog-based CBFSs outweigh the overheads introduced by integrating multiple context-specific solutions.
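The idea of selecting a view by embedding its name in a file path can be illustrated with a minimal sketch. The abstract does not give Frog's actual path syntax or view names, so everything here is an assumption made for illustration: the `@` marker, the view names `seq` and `rand`, and the helper `resolve_view` are all hypothetical.

```python
from typing import NamedTuple, Optional

# Hypothetical view names; the abstract does not list Frog's real identifiers.
KNOWN_VIEWS = {"seq", "rand"}
# Assumed marker that flags a path component as a view name (not from the paper).
VIEW_PREFIX = "@"


class Resolved(NamedTuple):
    view: Optional[str]  # selected view, or None if the path names no view
    path: str            # path with the view component removed


def resolve_view(path: str) -> Resolved:
    """Split a path into the selected view and the underlying file path."""
    view = None
    kept = []
    for part in path.split("/"):
        # A component such as "@seq" is a candidate view name.
        name = part[len(VIEW_PREFIX):] if part.startswith(VIEW_PREFIX) else None
        if view is None and name in KNOWN_VIEWS:
            view = name        # the first recognized view wins
        else:
            kept.append(part)  # ordinary path component, kept as-is
    return Resolved(view, "/".join(kept))


if __name__ == "__main__":
    # Switching views changes only the path string, not the calling code.
    print(resolve_view("/data/@seq/logs/a.txt"))
    print(resolve_view("/data/@rand/logs/a.txt"))
```

Under these assumptions, moving an application from one context-specific solution to another amounts to changing the view component of the path, while the calls that open and read the file stay the same.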
Index Terms
Frog: A Framework for Context-Based File Systems