
babble: Learning Better Abstractions with E-Graphs and Anti-unification

Published: 11 January 2023

Abstract

Library learning compresses a given corpus of programs by extracting common structure from the corpus into reusable library functions. Prior work on library learning suffers from two limitations that prevent it from scaling to larger, more complex inputs. First, it explores too many candidate library functions that are not useful for compression. Second, it is not robust to syntactic variation in the input.
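As a minimal illustration of the compression objective described above (a sketch only: the AST-node-count size metric, the term encoding, and the name `f0` are our own illustrative choices, not babble's):

```python
# Toy corpus: two expressions sharing the structure add(mul(_, 2), 1).
# Terms are ("op", child, ...) tuples; leaves are plain strings.
corpus = [
    ("add", ("mul", "a", "2"), "1"),
    ("add", ("mul", "b", "2"), "1"),
]

def size(term):
    """AST-node count: 1 for a leaf, 1 plus the children for an operator."""
    if not isinstance(term, tuple):
        return 1
    return 1 + sum(size(child) for child in term[1:])

# A learned library function f0(x) = add(mul(x, 2), 1) lets each
# expression be rewritten as a call, compressing the corpus overall:
# the function body is paid for once, and each use site shrinks.
library_fn = ("add", ("mul", "x", "2"), "1")
rewritten = [("f0", "a"), ("f0", "b")]

before = sum(size(t) for t in corpus)                       # 10 nodes
after = size(library_fn) + sum(size(t) for t in rewritten)  # 5 + 4 = 9 nodes
assert after < before
```

The savings grow with the number of use sites, which is why exploring candidate functions that are never reused (the first limitation above) wastes search effort.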

We propose library learning modulo theory (LLMT), a new library learning algorithm that additionally takes as input an equational theory for a given problem domain. LLMT uses e-graphs and equality saturation to compactly represent the space of programs equivalent modulo the theory, and uses a novel e-graph anti-unification technique to find common patterns in the corpus more directly and efficiently.
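To make the anti-unification step concrete, here is a sketch of plain first-order (syntactic) anti-unification on terms, which computes the least general generalization of two expressions; babble's contribution is lifting this idea to e-classes rather than single terms, and the `?x0`-style variable names and term encoding below are our own conventions:

```python
def anti_unify(t1, t2, subst=None, counter=None):
    """Least general generalization of two terms.

    Terms are ("op", child, ...) tuples or string leaves. Wherever the
    two terms disagree, a pattern variable is introduced; the same
    mismatched pair always maps to the same variable, which is what
    makes the result *least* general.
    """
    if subst is None:
        subst, counter = {}, [0]
    if t1 == t2:
        return t1
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        # Same operator and arity: recurse pointwise on the children.
        return (t1[0],) + tuple(
            anti_unify(a, b, subst, counter) for a, b in zip(t1[1:], t2[1:]))
    # Mismatch: reuse an existing variable for this pair, or mint a new one.
    key = (t1, t2)
    if key not in subst:
        subst[key] = f"?x{counter[0]}"
        counter[0] += 1
    return subst[key]
```

For example, anti-unifying `add(mul(a, 2), 1)` with `add(mul(b, 2), 1)` yields the pattern `add(mul(?x0, 2), 1)`, which is exactly the body of a candidate library function with one parameter. Running this over e-classes instead of concrete terms lets the common pattern be found modulo the equational theory.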

We implemented LLMT in a tool named babble. Our evaluation shows that babble achieves better compression orders of magnitude faster than the state of the art. We also provide a qualitative evaluation showing that babble learns reusable functions on inputs previously out of reach for library learning.

