Abstract
In this paper, we consider the problem of source code abridgment, where the goal is to remove statements from source code so that it can be displayed in a small space while leaving the "important" parts intact, so that an engineer can read the code and quickly understand its purpose. To this end, we develop an algorithm that examines a number of human-created source code abridgments and learns how to remove lines from code so as to mimic the human abridger. The learning algorithm takes into account syntactic features of the code, as well as semantic features such as control flow and data dependencies. Through a comprehensive user study, we show that the abridgments our system produces can decrease the time a user must spend looking at code to understand its functionality, and increase the accuracy of that assessment, while displaying the code in a greatly reduced area.
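As a rough illustration of the line-removal idea described above, the following minimal sketch scores each line with a linear model over a few crude syntactic features and keeps the highest-scoring lines within a display budget. It is not the paper's algorithm: the feature names, the hand-set values in `WEIGHTS`, and the functions `line_features` and `abridge` are all hypothetical, the weights are not learned from human abridgments, and no semantic features (control flow, data dependencies) are modeled.

```python
import re

# Hypothetical hand-set weights. The paper learns how to remove lines from
# human-created abridgments; this sketch does not attempt any learning.
WEIGHTS = {"control": 2.0, "returns": 3.0, "call": 1.0, "noise": -5.0}


def line_features(line):
    """Return crude, purely syntactic indicator features for one line."""
    s = line.strip()
    return {
        "control": float(bool(re.match(r"(if|for|while|else|elif)\b", s))),
        "returns": float(s.startswith("return")),
        "call": float(bool(re.search(r"\w+\(", s))),
        "noise": float(s == "" or s.startswith("#")),
    }


def abridge(lines, budget):
    """Keep the `budget` highest-scoring lines, in their original order."""
    scored = []
    for i, line in enumerate(lines):
        feats = line_features(line)
        score = sum(WEIGHTS[name] * value for name, value in feats.items())
        scored.append((score, i))
    # Rank by score (highest first), breaking ties in favor of earlier lines.
    ranked = sorted(scored, key=lambda t: (-t[0], t[1]))[:budget]
    # Restore source order so the abridgment still reads top to bottom.
    kept = sorted(i for _, i in ranked)
    return [lines[i] for i in kept]


if __name__ == "__main__":
    example = [
        "def total(xs):",
        "    # sum positive values",
        "    acc = 0",
        "    for x in xs:",
        "        if x > 0:",
        "            acc += x",
        "    log(acc)",
        "    return acc",
    ]
    print("\n".join(abridge(example, budget=4)))
```

With a budget of four lines, the sketch keeps the signature, the loop header, the condition, and the return statement. A learned model in the spirit of the abstract would replace the fixed weights with parameters fit to example abridgments and would add features derived from program analysis.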