Abstract
Most modern applications interact with external services and access data in structured formats such as XML, JSON and CSV. Static type systems do not understand such formats, often making data access more cumbersome. Should we give up and leave the messy world of external data to dynamic typing and runtime checks? Of course, not! We present F# Data, a library that integrates external structured data into F#. As most real-world data does not come with an explicit schema, we develop a shape inference algorithm that infers a shape from representative sample documents. We then integrate the inferred shape into the F# type system using type providers. We formalize the process and prove a relative type soundness theorem. Our library significantly reduces the amount of data access code and it provides additional safety guarantees when contrasted with the widely used weakly typed techniques.
- L. Cardelli and J. C. Mitchell. Operations on Records. In Mathematical Foundations of Programming Semantics, pages 22–52. Springer, 1990. Google Scholar
Digital Library
- A. Chlipala. Ur: Statically-typed Metaprogramming with Type-level Record Computation. In ACM SIGPLAN Notices, volume 45, pages 122–133. ACM, 2010. Google Scholar
Digital Library
- D. R. Christiansen. Dependent Type Providers. In Proceedings of Workshop on Generic Programming, WGP ’13, pages 25–34, 2013. ISBN 978-1-4503-2389-5. Google Scholar
Digital Library
- D. Colazzo, G. Ghelli, and C. Sartiani. Typing Massive JSON Datasets. In International Workshop on Cross-model Language Design and Implementation, XLDI ’12, 2012.Google Scholar
- E. Cooper, S. Lindley, P. Wadler, and J. Yallop. Links: Web Programming without Tiers. In Formal Methods for Components and Objects, pages 266–296. Springer, 2007. Google Scholar
Digital Library
- J. Donham and N. Pouillard. Camlp4 and Template Haskell. In Commercial Users of Functional Programming, 2010. Google Scholar
Digital Library
- K. Fisher and R. Gruber. PADS: A Domain-specific Language for Processing Ad Hoc Data. ACM SIGPLAN Notices, 40(6): 295–304, 2005. Google Scholar
Digital Library
- K. Fisher, D. Walker, K. Q. Zhu, and P. White. From Dirt to Shovels: Fully Automatic Tool Generation from Ad Hoc Data. In Proceedings of ACM Symposium on Principles of Programming Languages, POPL ’08, pages 421–434, 2008. ISBN 978-1-59593-689-9. Google Scholar
Digital Library
- H. Hosoya and B. C. Pierce. XDuce: A Statically Typed XML Processing Language. Transactions on Internet Technology, 3 (2):117–148, 2003. Google Scholar
Digital Library
- A. Igarashi, B. Pierce, and P. Wadler. Featherweight Java: A Minimal Core Calculus for Java and GJ. In ACM SIGPLAN Notices, volume 34, pages 132–146. ACM, 1999. Google Scholar
Digital Library
- Y. Mandelbaum, K. Fisher, D. Walker, M. Fernandez, and A. Gleyzer. PADS/ML: A Functional Data Description Language. In ACM SIGPLAN Notices, volume 42, pages 77–83. ACM, 2007. Google Scholar
Digital Library
- H. Mehnert and D. Christiansen. Tool Demonstration: An IDE for Programming and Proving in Idris. In Proceedings of Vienna Summer of Logic, VSL’14, 2014.Google Scholar
- E. Meijer, W. Schulte, and G. Bierman. Unifying Tables, Objects, and Documents. In Workshop on Declarative Programming in the Context of Object-Oriented Languages, pages 145–166, 2003.Google Scholar
- E. Meijer, B. Beckman, and G. Bierman. LINQ: Reconciling Object, Relations and XML in the .NET Framework. In Proceedings of the International Conference on Management of Data, SIGMOD ’06, pages 706–706, 2006. Google Scholar
Digital Library
- R. Milner. The Definition of Standard ML: Revised. MIT press, 1997. Google Scholar
Digital Library
- T. Petricek and D. Syme. In the Age of Web: Typed Functional-first Programming Revisited. Post-Proceedings of ML Workshop, 2015.Google Scholar
- D. Rémy. Type Inference for Records in a Natural Extension of ML. Theoretical Aspects Of Object-Oriented Programming. Types, Semantics and Language Design. MIT Press, 1993.Google Scholar
- S. Scheglmann, R. Lämmel, M. Leinberger, S. Staab, M. Thimm, and E. Viegas. IDE Integrated RDF Exploration, Access and RDF–Based Code Typing with LITEQ. In The Semantic Web: ESWC 2014 Satellite Events, pages 505–510. Springer, 2014.Google Scholar
- T. Sheard and S. P. Jones. Template Meta-programming for Haskell. In Proceedings of the ACM Workshop on Haskell, pages 1–16. ACM, 2002. Google Scholar
Digital Library
- J. G. Siek and W. Taha. Gradual Typing for Functional Languages. In Scheme and Functional Programming Workshop, pages 81–92, 2006.Google Scholar
- M. Sulzmann and K. Z. M. Lu. A Type-safe Embedding of XDuce into ML. Electr. Notes in Theoretical Comp. Sci., 148 (2):239–264, 2006. Google Scholar
Digital Library
- N. Swamy, C. Fournet, A. Rastogi, K. Bhargavan, J. Chen, P.-Y. Strub, and G. Bierman. Gradual Typing Embedded Securely in JavaScript. In ACM SIGPLAN Notices, volume 49, pages 425–437. ACM, 2014. Google Scholar
Digital Library
- D. Syme, K. Battocchi, K. Takeda, D. Malayeri, J. Fisher, J. Hu, T. Liu, B. McNamara, D. Quirk, M. Taveggia, W. Chae, U. Matsveyeu, and T. Petricek. Strongly-typed Language Support for Internet-scale Information Sources. Technical Report MSR-TR-2012-101, Microsoft Research, September 2012.Google Scholar
- D. Syme, K. Battocchi, K. Takeda, D. Malayeri, and T. Petricek. Themes in Information-rich Functional Programming for Internet-scale Data Sources. In Proceedings of the Workshop on Data Driven Functional Programming, DDFP’13, pages 1–4, 2013. Google Scholar
Digital Library
- W. Taha and T. Sheard. Multi-stage Programming with Explicit Annotations. ACM SIGPLAN Notices, 32(12):203–217, 1997. ISSN 0362-1340. Google Scholar
Digital Library
- A. K. Wright and M. Felleisen. A Syntactic Approach to Type Soundness. Information and computation, 115(1):38– 94, 1994. Google Scholar
Digital Library
Index Terms
Types from data: making structured data first-class citizens in F#
Recommendations
Types from data: making structured data first-class citizens in F#
PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and ImplementationMost modern applications interact with external services and access data in structured formats such as XML, JSON and CSV. Static type systems do not understand such formats, often making data access more cumbersome. Should we give up and leave the ...
Refactoring legacy AJAX applications to improve the efficiency of the data exchange component
The AJAX paradigm encodes data exchange XML formats. Recently, JSON has also become a popular data exchange format. XML has numerous benefits including human-readable structures and self-describing data. However, JSON provides significant performance ...
Translating JSON Data into Relational Data Using Schema-oblivious Approaches
ACM SE '19: Proceedings of the 2019 ACM Southeast ConferenceJSON (JavaScript Object Notation) has become popular as the data exchange standard over the Web. JSON has been gaining more popularity over XML due to its simplicity, compactness and ability to fit into the object types of programming languages. The ...









Comments