Abstract
Bidirectional transformations between different data representations occur frequently in modern software systems. They appear as serializers and deserializers, as parsers and pretty printers, as database views and view updaters, and as many kinds of ad hoc data converters. Building bidirectional transformations manually, by writing two separate functions that are intended to be inverses, is tedious and error-prone. A better approach is to use a domain-specific language in which both directions can be written as a single expression. However, these domain-specific languages can be difficult to program in, requiring programmers to manage fiddly details while working in a complex type system.
We present an alternative approach. Instead of coding transformations manually, we synthesize them from declarative format descriptions and examples. Specifically, we present Optician, a tool for type-directed synthesis of bijective string transformers. The inputs to Optician are a pair of ordinary regular expressions representing two data formats and a few concrete examples for disambiguation. The output is a well-typed program in Boomerang (a bidirectional language based on the theory of lenses). The main technical challenge is navigating the vast program search space efficiently. In particular, and unlike most prior work on type-directed synthesis, our system operates in the context of a language with a rich equivalence relation on types (the theory of regular expressions). Consequently, program synthesis requires search in two dimensions: first, our synthesis algorithm must find a pair of "syntactically compatible types," and second, using the structure of those types, it must find a type- and example-compliant term. Our key insight is that it is possible to reduce the size of this search space without losing any computational power by defining a new language of lenses designed specifically for synthesis. The new language is free from arbitrary function composition and operates only over types and terms in a new disjunctive normal form. We prove (1) that our new language is just as powerful as a more natural, compositional, and declarative language and (2) that our synthesis algorithm is sound and complete with respect to the new language. We also demonstrate empirically that our new language changes the synthesis problem from an intractable one into one that admits highly efficient solutions, allowing Optician to synthesize intricate lenses between complex file formats in seconds.
We evaluate Optician on a benchmark suite of 39 examples that includes both microbenchmarks and realistic examples derived from other data management systems, including Flash Fill, a tool for synthesizing string transformations in spreadsheets, and Augeas, a tool for bidirectional processing of Linux system configuration files.
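To make the shape of the synthesis problem concrete, the following is a minimal sketch of what a bijective string lens between two regular-expression formats looks like, together with an input/output example that it must satisfy. It is written in Python rather than Boomerang, the lens is written by hand rather than synthesized, and every name in it (`Lens`, `split`, `ident`, `const`, `concat`, `swap`) is a hypothetical helper introduced for illustration; it is not Optician's algorithm or Boomerang's API.

```python
import re
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Lens:
    """A bijection between strings matching `src` and strings matching `tgt`."""
    src: str                    # regular expression for the source format
    tgt: str                    # regular expression for the target format
    fwd: Callable[[str], str]   # source -> target
    bwd: Callable[[str], str]   # target -> source

    def get(self, s: str) -> str:
        if re.fullmatch(self.src, s) is None:
            raise ValueError(f"{s!r} is not in the source format")
        return self.fwd(s)

    def put(self, t: str) -> str:
        if re.fullmatch(self.tgt, t) is None:
            raise ValueError(f"{t!r} is not in the target format")
        return self.bwd(t)


def split(r1: str, r2: str, s: str) -> Tuple[str, str]:
    """Cut s into a prefix matching r1 and a suffix matching r2.

    The cut must be unique; otherwise the concatenation is ambiguous.
    """
    cuts = [i for i in range(len(s) + 1)
            if re.fullmatch(r1, s[:i]) and re.fullmatch(r2, s[i:])]
    if len(cuts) != 1:
        raise ValueError(f"no unique split of {s!r}")
    return s[:cuts[0]], s[cuts[0]:]


def seq(r1: str, r2: str) -> str:
    """Regular expression for r1 followed by r2."""
    return f"(?:{r1})(?:{r2})"


def ident(regex: str) -> Lens:
    """Copy any string matching `regex` unchanged."""
    return Lens(regex, regex, lambda s: s, lambda t: t)


def const(a: str, b: str) -> Lens:
    """Map the literal `a` to the literal `b`, and back."""
    return Lens(re.escape(a), re.escape(b), lambda _s: b, lambda _t: a)


def concat(l1: Lens, l2: Lens) -> Lens:
    """Apply l1 to the first part and l2 to the second, keeping the order."""
    def fwd(s: str) -> str:
        a, b = split(l1.src, l2.src, s)
        return l1.get(a) + l2.get(b)

    def bwd(t: str) -> str:
        a, b = split(l1.tgt, l2.tgt, t)
        return l1.put(a) + l2.put(b)

    return Lens(seq(l1.src, l2.src), seq(l1.tgt, l2.tgt), fwd, bwd)


def swap(l1: Lens, l2: Lens) -> Lens:
    """Like concat, but the two parts appear in reverse order in the target."""
    def fwd(s: str) -> str:
        a, b = split(l1.src, l2.src, s)
        return l2.get(b) + l1.get(a)

    def bwd(t: str) -> str:
        b, a = split(l2.tgt, l1.tgt, t)
        return l1.put(a) + l2.put(b)

    return Lens(seq(l1.src, l2.src), seq(l2.tgt, l1.tgt), fwd, bwd)


# Source format: "Last, First".  Target format: "First Last".
name = ident(r"[A-Z][a-z]*")
name_lens = swap(name, concat(const(", ", ""), concat(name, const("", " "))))

# The example rules out the other type-correct candidate, which keeps the
# two names in their original order.
examples = [("Doe, Jane", "Jane Doe")]
for s, t in examples:
    assert name_lens.get(s) == t and name_lens.put(t) == s

print(name_lens.get("Lovelace, Ada"))  # Ada Lovelace
```

Both formats here contain two name fields, so the regular expressions alone do not say whether the fields should be copied in order or swapped. That is the role the abstract assigns to the concrete examples: they disambiguate among the terms that are compatible with the two types.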
Supplemental Material
The auxiliary material contains Optician both as a standalone tool and as a tool integrated into Boomerang. README files in the auxiliary material describe the steps needed for installation.