Abstract
Many problem domains, including program synthesis and rewrite-based optimization, require searching astronomically large spaces of programs. Existing approaches often rely on building specialized data structures—version-space algebras, finite tree automata, or e-graphs—to compactly represent such spaces. At their core, all these data structures exploit independence of subterms; as a result, they cannot efficiently represent more complex program spaces, where the choices of subterms are entangled.
We introduce equality-constrained tree automata (ECTAs), a new data structure designed to compactly represent large spaces of programs with entangled subterms. We present efficient algorithms for extracting programs from ECTAs, implemented in a performant Haskell library, ecta. Using the ecta library, we construct Hectare, a type-driven program synthesizer for Haskell. Hectare significantly outperforms the state-of-the-art synthesizer Hoogle+, with an average speedup of 8×, despite an implementation that is an order of magnitude smaller.
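The entanglement claim in the abstract can be made concrete with a toy size model. The sketch below is illustrative only: it is not the ecta library's API, and every name in it (`subterm_space`, `entangled_terms`, the size functions) is hypothetical. It assumes a space of terms f(t1, t2) in which each child is a subterm g(c1, ..., ck) with independently chosen slots, and compares how much must be stored when the constraint t1 == t2 can or cannot be expressed directly.

```python
from itertools import product


def subterm_space(choices, slots):
    """All subterms g(c1, ..., ck); each slot is chosen independently."""
    return [("g",) + combo for combo in product(choices, repeat=slots)]


def entangled_terms(choices, slots):
    """Terms f(t1, t2) under the equality constraint t1 == t2."""
    return [("f", t, t) for t in subterm_space(choices, slots)]


def factored_size(choices, slots):
    """Toy cost of one child when its slots are independent:
    one list of choices per slot."""
    return len(choices) * slots


def size_without_constraint(choices, slots):
    """If the structure can only combine children independently, each
    agreeing pair (t, t) has to be listed as its own alternative."""
    return len(entangled_terms(choices, slots))


def size_with_constraint(choices, slots):
    """Two factored children plus a single equality constraint
    (a toy model of the idea, not a measurement of ECTAs)."""
    return 2 * factored_size(choices, slots) + 1


if __name__ == "__main__":
    choices = ["a", "b", "c"]
    for slots in (1, 2, 3, 4):
        print(slots,
              size_without_constraint(choices, slots),
              size_with_constraint(choices, slots))
```

Under these assumptions, the explicit listing grows as 3^k in the number of slots k, while the constrained form grows as 6k + 1. This is the kind of gap that motivates attaching equality constraints to the automaton rather than enumerating every agreeing pair.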