research-article
Open Access
Artifacts Available
Artifacts Evaluated & Functional

Automatic diagnosis and correction of logical errors for functional programming assignments

Published: 24 October 2018

Abstract

We present FixML, a system for automatically generating feedback on logical errors in functional programming assignments. As functional languages have been gaining popularity, the number of students enrolling in functional programming courses has increased significantly. However, the quality of feedback, in particular for logical errors, is hardly satisfactory. To provide personalized feedback on logical errors, we present a new error-correction algorithm for functional languages, which combines statistical error localization with type-directed program synthesis, enhanced with components reduction and search space pruning using symbolic execution. We implemented our algorithm in a tool, called FixML, and evaluated it on 497 student submissions from 13 exercises, including not only introductory but also more advanced problems. Our experimental results show that our tool effectively corrects diverse and complex errors: it fixed 43% of the 497 submissions in 5.4 seconds on average and managed to fix a hard-to-find error in a large submission consisting of 154 lines. We also performed a user study with 18 undergraduate students and confirmed that our system actually helps students better understand their programming errors.
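
The abstract does not spell out how the statistical error localization is computed. As a rough illustration only, the sketch below ranks expressions with a Tarantula-style suspiciousness ratio, a common spectrum-based choice that is assumed here and may differ from what FixML actually uses. The function name `rank_suspicious`, the expression ids, and the shape of the input are all hypothetical.

```python
from collections import Counter

def rank_suspicious(runs):
    """Rank expression ids from most to least suspicious.

    runs: list of (covered_ids, passed) pairs, one per test case, where
    covered_ids is the set of expressions evaluated by that test.
    The scoring formula is an assumption (Tarantula-style), not
    necessarily the one used by FixML.
    """
    total_pass = sum(1 for _, ok in runs if ok)
    total_fail = len(runs) - total_pass

    # Count how many passing / failing tests touch each expression.
    pass_hits, fail_hits = Counter(), Counter()
    for covered, ok in runs:
        (pass_hits if ok else fail_hits).update(set(covered))

    def score(expr):
        fail_ratio = fail_hits[expr] / total_fail if total_fail else 0.0
        pass_ratio = pass_hits[expr] / total_pass if total_pass else 0.0
        denom = fail_ratio + pass_ratio
        return fail_ratio / denom if denom else 0.0

    exprs = set(pass_hits) | set(fail_hits)
    # Highest score first; ties broken by id for a deterministic order.
    return sorted(exprs, key=lambda e: (-score(e), e))

# Hypothetical coverage data: e3 is evaluated only by failing tests.
runs = [
    ({"e1", "e2"}, True),
    ({"e1", "e3"}, False),
    ({"e1", "e2", "e3"}, False),
]
print(rank_suspicious(runs))
```

In a pipeline like the one the abstract describes, the top-ranked expressions would be the first candidates to replace with a hole for the synthesizer to fill; that hand-off is not shown here.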


Supplemental Material

a158-lee.webm

