
A structural model for contextual code changes

Published: 13 November 2020

Abstract

We address the problem of predicting edit completions based on a learned model that was trained on past edits. Given a code snippet that is partially edited, our goal is to predict a completion of the edit for the rest of the snippet. We refer to this task as the EditCompletion task and present a novel approach for tackling it. The main idea is to directly represent structural edits. This allows us to model the likelihood of the edit itself, rather than learning the likelihood of the edited code. We represent an edit operation as a path in the program's Abstract Syntax Tree (AST), leading from the source of the edit to its target. Using this representation, we present a powerful and lightweight neural model for the EditCompletion task.
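To make the path representation concrete, here is a minimal, hypothetical sketch in Python (the paper's artifact targets C#; the function names and the node-type path format below are illustrative assumptions, not the paper's implementation). It represents the relation between two AST nodes as the sequence of node types climbing from the source node to their lowest common ancestor and descending to the target node:

```python
import ast

def path_to_node(tree, target):
    """Return the chain of nodes from the root down to `target`."""
    def walk(node, trail):
        trail = trail + [node]
        if node is target:
            return trail
        for child in ast.iter_child_nodes(node):
            found = walk(child, trail)
            if found:
                return found
        return None
    return walk(tree, [])

def ast_path(tree, src, dst):
    """Node-type path from `src` up to the lowest common ancestor, then down to `dst`."""
    up = path_to_node(tree, src)
    down = path_to_node(tree, dst)
    # Find where the two root-to-node chains diverge.
    i = 0
    while i < min(len(up), len(down)) and up[i] is down[i]:
        i += 1
    lca = up[i - 1]  # lowest common ancestor
    ups = [type(n).__name__ for n in reversed(up[i:])]
    downs = [type(n).__name__ for n in down[i:]]
    return ups + [type(lca).__name__] + downs

tree = ast.parse("x = f(y)")
assign = tree.body[0]
src = assign.targets[0]      # the Name node for `x`
dst = assign.value.args[0]   # the Name node for `y`
print(" -> ".join(ast_path(tree, src, dst)))  # Name -> Assign -> Call -> Name
```

The path abstracts away concrete identifiers, so the same structural edit (e.g. "move an argument from the left-hand side into a call") yields the same path across different code snippets, which is what lets the model learn the likelihood of the edit itself.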

We conduct a thorough evaluation, comparing our approach to a variety of representation and modeling approaches that are driven by multiple strong models such as LSTMs, Transformers, and neural CRFs. Our experiments show that our model achieves a 28% relative gain over state-of-the-art sequential models and 2× higher accuracy than syntactic models that learn to generate the edited code, as opposed to modeling the edits directly.

Our code, dataset, and trained models are publicly available at https://github.com/tech-srl/c3po/.


Supplemental Material

Auxiliary Presentation Video

A video presentation for the paper "A Structural Model for Contextual Code Changes" by Shaked Brody, Uri Alon, and Eran Yahav.

