Look at Me! Correcting Eye Gaze in Live Video Communication

Published: 05 June 2019

Abstract

Although live video communication is widely used, it is generally less engaging than face-to-face communication because social, emotional, and haptic feedback is limited. One such limitation is missing eye contact, which is caused by the physical offset between the screen and the camera on a device. Manipulating video frames to correct eye gaze is one solution to this problem. In this article, we introduce a system that rotates the eyeball of a local participant before the video frame is sent to the remote side. The system adopts a warping-based convolutional neural network to relocate pixels in the eye regions. To improve visual quality, we minimize the L2 distance between the warped eyes and the ground truths. We also present several newly designed loss functions that help network training by preserving the shape of eye structures and minimizing color changes around the periphery of the eye regions. To evaluate the network and loss functions, we compared, both objectively and subjectively, the results generated by our system with those of the state-of-the-art method, DeepWarp, on two datasets. The experimental results demonstrated the effectiveness of our system. In addition, we showed that our system can perform eye-gaze correction in real time on a consumer-level laptop. Given its quality and efficiency, gaze correction by postprocessing with this system is a feasible solution to the problem of missing eye contact in video communication.
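The warping step and the L2 objective named in the abstract can be illustrated with a small sketch. This is not the authors' network: the flow field here is set by hand rather than predicted by a convolutional neural network, the image is a tiny grayscale grid, and the names `bilinear_sample`, `warp`, and `l2_loss` are hypothetical helpers introduced only for illustration. The sketch assumes backward warping, in which each output pixel is pulled from a displaced location in the input.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def bilinear_sample(img: Image, x: float, y: float) -> float:
    """Sample img at a fractional (x, y), clamping coordinates to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img: Image, flow_x: Image, flow_y: Image) -> Image:
    """Pull each output pixel from (x + flow_x, y + flow_y) in the input."""
    h, w = len(img), len(img[0])
    return [[bilinear_sample(img, x + flow_x[y][x], y + flow_y[y][x])
             for x in range(w)] for y in range(h)]


def l2_loss(pred: Image, target: Image) -> float:
    """Mean squared difference between two images of the same size."""
    h, w = len(pred), len(pred[0])
    total = sum((pred[y][x] - target[y][x]) ** 2
                for y in range(h) for x in range(w))
    return total / (h * w)


# Toy "eye": a dark pupil (0.0) on a bright background (1.0).
source = [[1.0, 1.0, 1.0, 1.0, 1.0],
          [1.0, 1.0, 1.0, 0.0, 1.0],
          [1.0, 1.0, 1.0, 1.0, 1.0]]
# Hypothetical ground truth: the same pupil moved one pixel to the left.
target = [[1.0, 1.0, 1.0, 1.0, 1.0],
          [1.0, 1.0, 0.0, 1.0, 1.0],
          [1.0, 1.0, 1.0, 1.0, 1.0]]

zeros = [[0.0] * 5 for _ in range(3)]
shift = [[1.0] * 5 for _ in range(3)]  # sample one pixel to the right

print(l2_loss(warp(source, zeros, zeros), target))  # loss with no warping
print(l2_loss(warp(source, shift, zeros), target))  # loss with the hand-set flow
```

In a trained system, a network would predict `flow_x` and `flow_y` per pixel, and the L2 term would be one part of the training objective alongside the additional loss functions the abstract mentions. Because the output is assembled from existing input pixels, the warp moves content around rather than synthesizing new colors.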

