Robust Unsupervised Gaze Calibration Using Conversation and Manipulation Attention Priors

Published: 27 January 2022

Abstract

Gaze estimation is a difficult task, even for humans. As humans, however, we are good at understanding a situation and using it to infer where people are expected to look, and we routinely rely on this expectation to recover their gaze. In this article, we propose to leverage such situation-based expectations about people’s visual focus of attention to collect weakly labeled gaze samples and to perform person-specific calibration of gaze estimators in an unsupervised, online fashion. In this context, our contributions are the following: (i) we show how task-contextual attention priors can be used to gather reference gaze samples, an otherwise cumbersome process; (ii) we propose a robust estimation framework that exploits these weak labels to estimate the calibration model parameters; and (iii) we demonstrate the applicability of this approach in two human-human and human-robot interaction settings, namely conversation and manipulation. Experiments on three datasets validate our approach, providing insights into the effectiveness of the priors and the impact of different calibration models, in particular the usefulness of taking head pose into account.
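The abstract does not detail the robust estimation framework. As a minimal illustrative sketch only (not the authors' actual method), the core idea of fitting a person-specific linear calibration from weakly labeled samples, while rejecting samples whose attention-prior label does not match the true gaze, can be emulated with a RANSAC-style consensus loop; all function and variable names below are hypothetical.

```python
import numpy as np

def fit_calibration(gaze_est, gaze_ref, n_iters=200, inlier_thresh=0.1, seed=0):
    """Robustly fit a linear calibration g_cal = A @ g_est + b from weakly
    labeled (estimate, reference) pairs. A RANSAC-style loop rejects samples
    whose prior-derived reference label disagrees with the fitted model."""
    rng = np.random.default_rng(seed)
    X = np.hstack([gaze_est, np.ones((len(gaze_est), 1))])  # homogeneous coords
    d = gaze_est.shape[1]
    best_inliers = None
    for _ in range(n_iters):
        idx = rng.choice(len(X), size=d + 1, replace=False)  # minimal sample
        W, *_ = np.linalg.lstsq(X[idx], gaze_ref[idx], rcond=None)
        resid = np.linalg.norm(X @ W - gaze_ref, axis=1)
        inliers = resid < inlier_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # refit on the largest consensus set of inliers
    W, *_ = np.linalg.lstsq(X[best_inliers], gaze_ref[best_inliers], rcond=None)
    return W  # shape (d+1, d): A stacked over the bias b

# Usage: simulate a biased estimator with 10% mislabeled prior samples.
rng = np.random.default_rng(1)
true_gaze = rng.uniform(-0.5, 0.5, size=(100, 2))   # prior-derived weak labels
est = 0.8 * true_gaze + np.array([0.05, -0.03])     # person-specific bias
est[:10] += rng.uniform(0.5, 1.0, size=(10, 2))     # outliers: wrong priors
W = fit_calibration(est, true_gaze)
corrected = np.hstack([est, np.ones((100, 1))]) @ W
```

The robust loop matters because attention priors only hold on average: during conversation or manipulation, a fraction of frames will carry wrong reference labels, and a plain least-squares fit would be pulled toward them.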



Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 1
January 2022, 517 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3505205


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 November 2020
• Revised: 1 May 2021
• Accepted: 1 June 2021
• Published: 27 January 2022

            Qualifiers

            • research-article
            • Refereed