
The Role of Conversational Grounding in Supporting Symbiosis Between People and Digital Assistants

Published: 29 May 2020
Abstract

In "smart speaker" digital assistant systems such as Google Home, there is no visual user interface, so people must learn about the system's capabilities and limitations by experimenting with different questions and commands. However, many new users give up quickly and limit their use to a few simple tasks. This is a problem for both the user and the system: users who stop trying new things cannot learn about new features and functionality, and the system receives less data on which to base future improvements. Symbiosis (a mutually beneficial relationship) between people and AI systems such as digital assistants is an important aspect of developing systems that are partners to humans and not just tools. To better understand the requirements for symbiosis, we investigated the relationship between the types of digital assistant responses and users' subsequent questions, focusing on identifying interactions that discouraged users when speaking with a digital assistant. We conducted a user study with 20 participants who completed a series of information-seeking tasks using the Google Home, and analyzed the transcripts using a method based on applied conversation analysis. We found that the most common response from the Google Home, a version of "Sorry, I'm not sure how to help", provided no feedback for participants to build on when forming their next question. However, responses that provided somewhat strange but tangentially related answers were actually more helpful for conversational grounding, which extended the interaction. We discuss the connection between grounding and symbiosis, and present recommendations for requirements for forming partnerships with digital assistants.
