Forest: a language and toolkit for programming with filestores

Published: 19 September 2011

Abstract

A filestore is a structured collection of data files housed in a conventional hierarchical file system. Many applications use filestores as a poor-man's database, and the correct execution of these applications requires that the collection of files, directories, and symbolic links stored on disk satisfy a variety of precise invariants. Moreover, all of these structures must have acceptable ownership, permission, and timestamp attributes. Unfortunately, current programming languages do not provide support for documenting assumptions about filestores, detecting errors in them, or safely loading from and storing to them.

This paper describes the design, implementation, and semantics of Forest, a new domain-specific language for describing filestores. The language uses a type-based metaphor to specify the expected structure, attributes, and invariants of filestores. Forest generates loading and storing functions that make it easy to connect data on disk to an isomorphic representation in memory that can be manipulated as if it were any other data structure. Forest also generates metadata that describes the degree to which the structures on the disk conform to the specification, making error detection easy. In a nutshell, Forest extends the rigorous discipline of typed programming languages to the untyped world of file systems.
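The idea of a load function that returns both an in-memory representation and conformance metadata can be sketched in a few lines of plain Haskell. This is an illustrative model only, not Forest's generated code or its description syntax: the file system is approximated by an association list, and the names `FS`, `Project`, `ProjectMD`, and `loadProject` are hypothetical.

```haskell
import Data.Maybe (fromMaybe, isNothing)

-- Hypothetical stand-in for a file system: paths mapped to file contents.
type FS = [(FilePath, String)]

-- In-memory representation of an assumed two-file filestore.
data Project = Project { readme :: String, license :: String }
  deriving (Show, Eq)

-- Metadata: which expected paths were absent on "disk".
newtype ProjectMD = ProjectMD { missing :: [FilePath] }
  deriving (Show, Eq)

-- Loading never fails outright: absent files load as empty strings
-- and are reported in the metadata instead.
loadProject :: FS -> (Project, ProjectMD)
loadProject fs = (Project (get "README") (get "LICENSE"), ProjectMD absent)
  where
    get p  = fromMaybe "" (lookup p fs)
    absent = [p | p <- ["README", "LICENSE"], isNothing (lookup p fs)]

main :: IO ()
main = print (loadProject [("README", "hello")])
```

Returning the metadata alongside the value, rather than raising an error, lets a caller decide how much nonconformance to tolerate.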

We have implemented Forest as an embedded domain-specific language in Haskell. In addition to generating infrastructure for reading, writing and checking file systems, our implementation generates type class instances that make it easy to build generic tools that operate over arbitrary filestores. We illustrate the utility of this infrastructure by building a file system visualizer, a file access checker, a generic query interface, description-directed variants of several standard UNIX shell tools and (circularly) a simple Forest description inference engine. Finally, we formalize a core fragment of Forest in a semantics inspired by classical tree logics and prove round-tripping laws showing that the loading and storing functions behave sensibly.
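To give a feel for what a round-tripping law says, the following sketch states two simplified laws over a toy model. It is an assumption-laden illustration rather than the paper's formal statement: the file system is an association list, and `Config`, `load`, `store`, `storeLoad`, and `loadStore` are hypothetical names, not part of Forest.

```haskell
-- Hypothetical stand-in for a file system: paths mapped to file contents.
type FS = [(FilePath, String)]

-- In-memory representation of an assumed two-file filestore.
data Config = Config { host :: String, port :: String }
  deriving (Show, Eq)

-- Load succeeds only when both expected files are present.
load :: FS -> Maybe Config
load fs = Config <$> lookup "host" fs <*> lookup "port" fs

-- Store writes exactly the files that the description covers.
store :: Config -> FS
store c = [("host", host c), ("port", port c)]

-- Law 1: storing a value and loading it back yields the same value.
storeLoad :: Config -> Bool
storeLoad c = load (store c) == Just c

-- Law 2: loading and then storing reproduces the covered files unchanged.
-- Filestores that fail to load are not constrained by this law.
loadStore :: FS -> Bool
loadStore fs = case load fs of
  Nothing -> True
  Just c  -> all (\(p, v) -> lookup p fs == Just v) (store c)

main :: IO ()
main = do
  print (storeLoad (Config "localhost" "8080"))
  print (loadStore [("host", "example.org"), ("port", "443"), ("notes", "unrelated")])
```

Both checks hold for the sample inputs. Note that the second law only speaks about files the description mentions, so the unrelated `notes` entry is ignored.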

