Grounded Copilot: How Programmers Interact with Code-Generating Models

Published: 06 April 2023

Abstract

Powered by recent advances in code-generating models, AI assistants like GitHub Copilot promise to change the face of programming forever. But what is this new face of programming? We present the first grounded theory analysis of how programmers interact with Copilot, based on observing 20 participants (with a range of prior experience using the assistant) as they solve diverse programming tasks across four languages. Our main finding is that interactions with programming assistants are bimodal: in acceleration mode, the programmer knows what to do next and uses Copilot to get there faster; in exploration mode, the programmer is unsure how to proceed and uses Copilot to explore their options. Based on our theory, we provide recommendations for improving the usability of future AI programming assistants.
