research-article
Open Access

Abridging source code

Published: 12 October 2017

Abstract

In this paper, we consider the problem of source code abridgment, where the goal is to remove statements from source code so that it can be displayed in a small space while leaving the "important" parts intact, so that an engineer can read the code and quickly understand its purpose. To this end, we develop an algorithm that examines a number of human-created source code abridgments and learns how to remove lines from code so as to mimic the human abridger. The learning algorithm takes into account syntactic features of the code, as well as semantic features such as control flow and data dependencies. Through a comprehensive user study, we show that the abridgments our system produces can decrease the time a user must spend looking at code in order to understand its functionality, and increase the accuracy of that assessment, while displaying the code in a greatly reduced area.
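
To make the idea of learning line removal from human abridgments concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not specify the learner or the feature set, so this sketch swaps in a plain perceptron and a handful of hypothetical syntactic features, and it omits the semantic features (control flow, data dependencies) that the abstract mentions. The names `features`, `train`, and `abridge` are illustrative assumptions.

```python
import re

# Hypothetical syntactic cue: line starts with a control-flow keyword.
CONTROL = re.compile(r"^\s*(if|elif|else|for|while|return|try|except)\b")


def features(line):
    """Illustrative per-line features (an assumption, not the paper's feature set)."""
    stripped = line.strip()
    return [
        1.0,                                       # bias term
        1.0 if CONTROL.match(line) else 0.0,       # control-flow keyword
        1.0 if stripped.startswith("#") else 0.0,  # comment-only line
        1.0 if "(" in stripped else 0.0,           # call-like parenthesis
    ]


def score(weights, line):
    """Linear score; higher means the line is more worth keeping."""
    return sum(w * f for w, f in zip(weights, features(line)))


def train(examples, epochs=20, lr=0.1):
    """Perceptron training on (lines, kept) pairs, where kept[i] is True
    if the human abridger kept lines[i]."""
    weights = [0.0] * len(features(""))
    for _ in range(epochs):
        for lines, kept in examples:
            for line, keep in zip(lines, kept):
                predicted = score(weights, line) > 0
                if predicted != keep:
                    sign = 1.0 if keep else -1.0
                    weights = [w + lr * sign * f
                               for w, f in zip(weights, features(line))]
    return weights


def abridge(weights, lines, budget):
    """Keep the `budget` highest-scoring lines, preserving source order."""
    ranked = sorted(range(len(lines)),
                    key=lambda i: score(weights, lines[i]),
                    reverse=True)
    return [lines[i] for i in sorted(ranked[:budget])]


if __name__ == "__main__":
    # One made-up human abridgment: the abridger kept only the last two lines.
    human = (["# helper", "x = 1", "if x > 0:", "    return x"],
             [False, False, True, True])
    w = train([human])
    for kept_line in abridge(w, ["# compute", "total = 0",
                                 "for v in values:", "    total += v",
                                 "return total"], budget=2):
        print(kept_line)
```

The `budget` parameter stands in for the "small space" constraint: the display area fixes how many lines may survive, and the learned scores decide which ones.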

Published in

Proceedings of the ACM on Programming Languages, Volume 1, Issue OOPSLA
October 2017
1786 pages
EISSN: 2475-1421
DOI: 10.1145/3152284

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
