Abstract
A filestore is a structured collection of data files housed in a conventional hierarchical file system. Many applications use filestores as a poor man's database, and the correct execution of these applications requires that the collection of files, directories, and symbolic links stored on disk satisfy a variety of precise invariants. Moreover, all of these structures must have acceptable ownership, permission, and timestamp attributes. Unfortunately, current programming languages do not provide support for documenting assumptions about filestores, detecting errors in them, or safely loading from and storing to them.
This paper describes the design, implementation, and semantics of Forest, a new domain-specific language for describing filestores. The language uses a type-based metaphor to specify the expected structure, attributes, and invariants of filestores. Forest generates loading and storing functions that make it easy to connect data on disk to an isomorphic representation in memory that can be manipulated as if it were any other data structure. Forest also generates metadata that describes the degree to which the structures on the disk conform to the specification, making error detection easy. In a nutshell, Forest extends the rigorous discipline of typed programming languages to the untyped world of file systems.
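The idea of a load function that returns both a representation and conformance metadata can be illustrated with a small hand-written sketch. This is not Forest syntax and not code that Forest generates: the `Node` file system model, the `Student` type, the `MD` metadata type, and the functions `loadStudent` and `storeStudent` are all hypothetical names chosen for illustration, and the file system is modeled in memory rather than read from disk.

```haskell
-- A toy, in-memory model of a file system node.
-- (Assumption for illustration; Forest itself works on real disks.)
data Node
  = File String            -- a file and its contents
  | Dir [(String, Node)]   -- a directory and its named children
  deriving (Eq, Show)

-- In-memory representation of a hypothetical "student" directory that
-- is expected to contain two files, "name" and "grade".
data Student = Student { name :: String, grade :: Int }
  deriving (Eq, Show)

-- Metadata listing every way the stored data deviated from the
-- expected structure. An empty list means full conformance.
newtype MD = MD { errors :: [String] }
  deriving (Eq, Show)

-- Load always produces a representation (using defaults where data is
-- missing or malformed) together with metadata describing the errors.
loadStudent :: Node -> (Student, MD)
loadStudent (File _) = (Student "" 0, MD ["expected a directory"])
loadStudent (Dir kids) = (Student n g, MD (nErrs ++ gErrs))
  where
    (n, nErrs) = case lookup "name" kids of
      Just (File s) -> (s, [])
      Just (Dir _)  -> ("", ["name: expected a file"])
      Nothing       -> ("", ["name: missing"])
    (g, gErrs) = case lookup "grade" kids of
      Just (File s) -> case reads s of
        [(v, "")] -> (v, [])
        _         -> (0, ["grade: not an integer"])
      Just (Dir _)  -> (0, ["grade: expected a file"])
      Nothing       -> (0, ["grade: missing"])

-- Store writes the representation back into the toy model.
storeStudent :: Student -> Node
storeStudent (Student n g) =
  Dir [("name", File n), ("grade", File (show g))]
```

Under these assumptions, loading `Dir [("name", File "Ada"), ("grade", File "A+")]` still yields a usable `Student` value, while the metadata reports that the grade is not an integer. Returning metadata alongside the data, instead of failing outright, is what lets a caller decide how much nonconformance to tolerate.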
We have implemented Forest as an embedded domain-specific language in Haskell. In addition to generating infrastructure for reading, writing, and checking file systems, our implementation generates type class instances that make it easy to build generic tools that operate over arbitrary filestores. We illustrate the utility of this infrastructure by building a file system visualizer, a file access checker, a generic query interface, description-directed variants of several standard UNIX shell tools, and (circularly) a simple Forest description inference engine. Finally, we formalize a core fragment of Forest in a semantics inspired by classical tree logics and prove round-tripping laws showing that the loading and storing functions behave sensibly.
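To give a feel for what round-tripping laws of this kind assert, here is a minimal sketch over a toy model. The precise laws proved in the paper are not reproduced here. The flat `FS` model, the `Counter` type, and the functions `load`, `store`, `storeThenLoad`, and `loadThenStore` are assumptions made for illustration only.

```haskell
-- Toy file system: a flat directory mapping file names to contents.
-- (Assumption for illustration, not Forest's formal model.)
type FS = [(String, String)]

-- In-memory representation for a hypothetical description that expects
-- a file named "count" holding an integer.
newtype Counter = Counter Int
  deriving (Eq, Show)

-- Load the described part of the file system, if it conforms.
load :: FS -> Maybe Counter
load fs = case lookup "count" fs of
  Just s -> case reads s of
    [(n, "")] -> Just (Counter n)
    _         -> Nothing
  Nothing -> Nothing

-- Store updates only the file the description mentions and keeps
-- every other file as it was.
store :: Counter -> FS -> FS
store (Counter n) fs = ("count", show n) : others fs

-- The files not covered by the description.
others :: FS -> FS
others = filter ((/= "count") . fst)

-- Law shape 1: loading right after storing returns the stored value.
storeThenLoad :: Counter -> FS -> Bool
storeThenLoad c fs = load (store c fs) == Just c

-- Law shape 2: storing what was just loaded keeps the loaded value and
-- leaves all unrelated files untouched.
loadThenStore :: FS -> Bool
loadThenStore fs = case load fs of
  Nothing -> True
  Just c  -> load (store c fs) == Just c
             && others (store c fs) == others fs
```

Note that the second property is stated in terms of the loaded value, not byte-for-byte equality of the file. In this toy model a file containing `042` loads as 42 and is stored back as `42`, so an exact on-disk round trip would only hold for canonically formatted input.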
Forest: a language and toolkit for programming with filestores
ICFP '11: Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming