By the Community & For the Community: A Deep Learning Approach to Assist Collaborative Editing in Q&A Sites

Published: 06 December 2017

Abstract

Community edits to questions and answers (called post edits) play an important role in improving content quality in Stack Overflow. Our study of post edits in Stack Overflow shows that a large number of edits are about formatting, grammar, and spelling. These post edits usually involve small-scale sentence edits, and our survey of trusted contributors suggests that most of them care much or very much about such small sentence edits. To assist users in making small sentence edits, we develop an edit-assistance tool that identifies minor textual issues in posts and recommends sentence edits to correct them. We formulate the sentence editing task as a machine translation problem, in which an original sentence is "translated" into an edited sentence. Our tool implements a character-level Recurrent Neural Network (RNN) encoder-decoder model, trained with about 6.8 million original-edited sentence pairs from Stack Overflow post edits. We evaluate our edit-assistance tool using large-scale archival post edits, a field study of assisting a novice post editor, and a survey of trusted contributors. Our evaluation demonstrates the feasibility of training a deep learning model with post edits by the community and then using the trained model to assist post editing for the community.
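To make the "translation" framing concrete, the following is a minimal sketch of how an original-edited sentence pair might be turned into character-level input and target sequences for an encoder-decoder model. It is an illustration under stated assumptions, not the paper's implementation: the special tokens, the helper names (`build_char_vocab`, `encode`, `decode`), and the example sentence pair are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical special tokens; the abstract does not specify any.
PAD, SOS, EOS, UNK = "<pad>", "<sos>", "<eos>", "<unk>"


def build_char_vocab(pairs: List[Tuple[str, str]]) -> Dict[str, int]:
    """Assign an integer id to every character seen in the sentence pairs."""
    vocab = {PAD: 0, SOS: 1, EOS: 2, UNK: 3}
    for original, edited in pairs:
        for ch in original + edited:
            if ch not in vocab:
                vocab[ch] = len(vocab)
    return vocab


def encode(sentence: str, vocab: Dict[str, int]) -> List[int]:
    """Convert a sentence to character ids wrapped in start/end markers."""
    ids = [vocab.get(ch, vocab[UNK]) for ch in sentence]
    return [vocab[SOS]] + ids + [vocab[EOS]]


def decode(ids: List[int], vocab: Dict[str, int]) -> str:
    """Convert character ids back to text, dropping pad/start/end markers."""
    inverse = {i: ch for ch, i in vocab.items()}
    special = {vocab[PAD], vocab[SOS], vocab[EOS]}
    return "".join(inverse[i] for i in ids if i not in special)


# Invented example pair (original sentence, edited sentence).
pairs = [("i have a problem with java script",
          "I have a problem with JavaScript")]

vocab = build_char_vocab(pairs)
source_ids = encode(pairs[0][0], vocab)  # what an encoder would read
target_ids = encode(pairs[0][1], vocab)  # what a decoder would learn to emit
```

Working at the character level keeps the vocabulary small and lets casing, spacing, and spelling changes be represented directly, which is the kind of small edit the abstract describes.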
