Abstract

Similar to the iterative process used in graphical user interface design, conversation designers often define a conversation flow, test it with users, review the user data, and improve the design. While existing chatbot prototyping tools make such iteration possible, challenges remain in recruiting participants on demand and collecting structured feedback on specific conversational components. These limitations hinder designers from running rapid iterations and making informed design decisions. We posit that involving a crowd in the conversation design process can address these challenges, and introduce ProtoChat, a crowd-powered chatbot design tool built to support the iterative process of conversation design. ProtoChat makes it easy to recruit crowd workers to test the current conversation within the design tool. Its crowd-testing interface allows workers to provide concrete, practical feedback and suggest improvements to specific parts of the conversation. With the data collected from crowd-testing, ProtoChat provides multiple types of visualizations to help designers analyze and revise their design. Through a three-day study with eight designers, we found that ProtoChat enabled an iterative chatbot design process. Designers improved their designs not only by modifying the conversation flow itself, but also by adjusting the chatbot's persona and deriving UI design implications beyond the conversation. The crowd responses helped designers explore user needs, contexts, and diverse response formats. With ProtoChat, designers can collect concrete evidence from the crowd and make informed decisions to iteratively improve their conversation design.
ProtoChat: Supporting the Conversation Design Process with Crowd Feedback