Abstract
Powered by recent advances in code-generating models, AI assistants like GitHub Copilot promise to change the face of programming forever. But what is this new face of programming? We present the first grounded theory analysis of how programmers interact with Copilot, based on observing 20 participants (with a range of prior experience using the assistant) as they solve diverse programming tasks across four languages. Our main finding is that interactions with programming assistants are bimodal: in acceleration mode, the programmer knows what to do next and uses Copilot to get there faster; in exploration mode, the programmer is unsure how to proceed and uses Copilot to explore their options. Based on our theory, we provide recommendations for improving the usability of future AI programming assistants.
Index Terms
Grounded Copilot: How Programmers Interact with Code-Generating Models