research-article
Open Access
Distinguished Paper

Learning semantic program embeddings with graph interval neural network

Published: 13 November 2020

Abstract

Learning distributed representations of source code has been a challenging task for machine learning models. Earlier works treated programs as text so that natural language methods could be readily applied. Unfortunately, such approaches do not capitalize on the rich structural information possessed by source code. More recently, Graph Neural Networks (GNNs) were proposed to learn embeddings of programs from their graph representations. Because their message-passing procedure is homogeneous (i.e., it does not take advantage of program-specific graph characteristics) and expensive (i.e., it requires heavy information exchange among nodes in the graph), GNNs can suffer from precision issues, especially when dealing with programs rendered into large graphs. In this paper, we present a new graph neural architecture, called Graph Interval Neural Network (GINN), to tackle the weaknesses of existing GNNs. Unlike the standard GNN, GINN generalizes from a curated graph representation obtained through an abstraction method designed to help models learn. In particular, GINN focuses exclusively on intervals (generally manifested in looping constructs) for mining the feature representation of a program. Furthermore, GINN operates on a hierarchy of intervals to scale the learning to large graphs.
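
To make the notion of an interval concrete, the following is a minimal sketch of the classic interval partition of a control flow graph, in the spirit of Allen's control flow analysis. It is not the authors' implementation and says nothing about GINN's message passing; the function name `interval_partition`, the adjacency-dict input format, and the example graph are all assumptions made for illustration.

```python
from collections import deque


def interval_partition(succ, entry):
    """Partition a control flow graph into intervals (illustrative sketch).

    succ  : dict mapping every node to a list of its successors
            (hypothetical input format chosen for this example).
    entry : the entry node of the graph.
    Returns a dict mapping each interval header to the ordered list of
    nodes in its interval.
    """
    # Build predecessor sets from the successor lists.
    preds = {n: set() for n in succ}
    for n, outs in succ.items():
        for m in outs:
            preds[m].add(n)

    intervals = {}
    assigned = set()            # nodes already placed in some interval
    headers = deque([entry])    # headers waiting to be processed
    seen_headers = {entry}

    while headers:
        h = headers.popleft()
        members = [h]
        in_interval = {h}
        assigned.add(h)

        # Grow the interval: absorb any node whose predecessors all
        # lie inside the interval, until nothing changes.
        changed = True
        while changed:
            changed = False
            for n in list(members):
                for m in succ[n]:
                    if m not in assigned and preds[m] <= in_interval:
                        members.append(m)
                        in_interval.add(m)
                        assigned.add(m)
                        changed = True

        # Any remaining successor of the interval starts a new interval.
        for n in members:
            for m in succ[n]:
                if m not in assigned and m not in seen_headers:
                    seen_headers.add(m)
                    headers.append(m)

        intervals[h] = members

    return intervals


# Example: node 0 enters a loop headed by node 1; node 2 is the loop
# body (with a back edge to 1) and node 3 is the loop exit.
cfg = {0: [1], 1: [2, 3], 2: [1], 3: []}
result = interval_partition(cfg, 0)
print(result)
```

In this example the loop header (node 1) and the nodes reachable only through it end up in one interval, which matches the remark that intervals generally show up as looping constructs. Repeating the partition on a graph whose nodes are the intervals themselves is one conventional way to obtain a hierarchy of intervals; whether GINN builds its hierarchy in exactly this way is not stated in the abstract.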

We evaluate GINN on two popular downstream applications: variable misuse prediction and method name prediction. Results show that in both cases GINN outperforms the state-of-the-art models by a comfortable margin. We have also created a neural bug detector based on GINN to catch null pointer dereference bugs in Java code. While learning from the same 9,000 methods extracted from 64 projects, the GINN-based bug detector significantly outperforms the GNN-based bug detector on 13 unseen test projects. Next, we deploy our trained GINN-based bug detector and Facebook Infer, arguably the state-of-the-art static analysis tool, to scan the codebases of 20 highly starred projects on GitHub. Through manual inspection, we confirm 38 bugs out of 102 warnings raised by the GINN-based bug detector, compared to 34 bugs out of 129 warnings for Facebook Infer. We have reported the 38 bugs GINN caught to developers, among which 11 have been fixed and 12 have been confirmed (fix pending). GINN has been shown to be a general, powerful deep neural network for learning precise, semantic program embeddings.
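
The warning counts above translate directly into precision figures. The short calculation below uses only the numbers reported in the abstract; the helper name `precision` and the variable names are assumptions introduced for this example.

```python
def precision(confirmed_bugs, warnings_raised):
    """Fraction of raised warnings that were confirmed as real bugs."""
    return confirmed_bugs / warnings_raised


# Counts taken from the abstract.
ginn_precision = precision(38, 102)    # GINN-based bug detector
infer_precision = precision(34, 129)   # Facebook Infer

# Of the 38 reported bugs, 11 were fixed and 12 confirmed (fix pending).
acknowledged = 11 + 12

print(f"GINN precision:  {ginn_precision:.1%}")   # 37.3%
print(f"Infer precision: {infer_precision:.1%}")  # 26.4%
print(f"Acknowledged by developers: {acknowledged} of 38")
```

These figures cover only the warnings that were manually inspected on the 20 scanned projects, so they describe precision on that sample rather than recall or overall detection rates.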


Supplemental Material

Auxiliary Presentation Video

This is a presentation video of my talk at OOPSLA 2020 on our paper accepted in the research track.

