Research Article · Open Access

Neurosymbolic repair for low-code formula languages

Published: 31 October 2022

Abstract

Most users of low-code platforms, such as Excel and PowerApps, write programs in domain-specific formula languages to carry out nontrivial tasks. Often users can write most of the program they want, but introduce small mistakes that yield broken formulas. These mistakes, which can be both syntactic and semantic, are hard for low-code users to identify and fix, even though they can be resolved with just a few edits. We formalize the problem of producing such edits as the last-mile repair problem. To address this problem, we developed LaMirage, a LAst-MIle RepAir-engine GEnerator that combines symbolic and neural techniques to perform last-mile repair in low-code formula languages. LaMirage takes a grammar and a set of domain-specific constraints/rules, which jointly approximate the target language, and uses these to generate a repair engine that can fix formulas in that language. To tackle the challenges of localizing errors and ranking candidate repairs, LaMirage leverages neural techniques, whereas it relies on symbolic methods to generate candidate edits. This combination allows LaMirage to find repairs that satisfy the provided grammar and constraints, and then pick the most natural repair. We compare LaMirage to state-of-the-art neural and symbolic approaches on 400 real Excel and Power Fx formulas, where LaMirage outperforms all baselines. We release these benchmarks to encourage subsequent work in low-code domains.
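The division of labor described in the abstract — symbolic enumeration of candidate edits filtered against a grammar, followed by a ranking step that picks the most natural repair — can be sketched in miniature. Everything below is a hypothetical stand-in, not LaMirage itself: the toy grammar check, the function vocabulary, the edit set, and the edit-distance "ranker" (used here in place of the neural components) are illustrative assumptions only.

```python
import re

# Toy function vocabulary; a real engine is generated from the full
# grammar and domain-specific constraints of the target formula language.
KNOWN_FUNCS = {"SUM", "IF", "VLOOKUP", "CONCAT"}

def parses(formula: str) -> bool:
    """Rough well-formedness check standing in for the grammar/constraints."""
    if not formula.startswith("="):
        return False
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    if depth != 0:
        return False
    # Every identifier applied to arguments must be a known function.
    return all(name.upper() in KNOWN_FUNCS
               for name in re.findall(r"([A-Za-z]+)\(", formula))

def candidate_edits(formula: str):
    """Symbolic step: enumerate small, last-mile candidate edits."""
    yield formula + ")"                      # close a dangling parenthesis
    yield "=" + formula.lstrip("=")          # restore a missing leading '='
    for name in re.findall(r"([A-Za-z]+)\(", formula):
        if name.upper() not in KNOWN_FUNCS:  # swap unknown names for known ones
            for known in KNOWN_FUNCS:
                yield formula.replace(name, known)

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: a crude stand-in for a learned naturalness score."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def repair(formula: str):
    """Return the best-ranked candidate that satisfies the grammar, if any."""
    if parses(formula):
        return formula
    valid = [c for c in candidate_edits(formula) if parses(c)]
    return min(valid, key=lambda c: edit_distance(formula, c)) if valid else None
```

On this toy model, `repair("=SUN(A1:A5)")` yields `"=SUM(A1:A5)"` and `repair("=SUM(A1:A5")` yields `"=SUM(A1:A5)"`: the symbolic step guarantees every candidate is grammatical, and the ranker only chooses among valid repairs — the same separation of concerns the paper attributes to its symbolic and neural components.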




Published in

Proceedings of the ACM on Programming Languages, Volume 6, Issue OOPSLA2 (October 2022), 1932 pages.
EISSN: 2475-1421
DOI: 10.1145/3554307

Copyright © 2022 Owner/Author.
Publisher: Association for Computing Machinery, New York, NY, United States.

