DOI: 10.1145/3025453.3026033 (CHI Conference Proceedings)
Research Article
Public Access
Honorable Mention

Looking Coordinated: Bidirectional Gaze Mechanisms for Collaborative Interaction with Virtual Characters

ABSTRACT

Successful collaboration relies on the coordination and alignment of communicative cues. In this paper, we present mechanisms of bidirectional gaze, the coordinated production and detection of gaze cues, by which a virtual character can align its gaze with that of its human user. We implement these mechanisms in a hybrid stochastic/heuristic model synthesized from data collected in human-human interactions. In three lab studies in which a virtual character instructs participants in a sandwich-making task, we demonstrate how bidirectional gaze can lead to positive outcomes in error rate, completion time, and the agent's ability to produce quick, effective nonverbal references. In the first study, participants wore eye-tracking glasses while interacting with an on-screen agent. The second study demonstrates that these positive outcomes can be achieved using head-pose estimation in place of full eye tracking. The third study demonstrates that these effects also transfer to virtual-reality interactions.

Supplemental Material

pn4791-file3.mp4
