research-article

Visual Attention Analysis and Prediction on Human Faces for Children with Autism Spectrum Disorder

Published: 15 October 2019

Abstract

This article analyzes and predicts the visual attention of children with Autism Spectrum Disorder (ASD) when they look at human faces. Social difficulties are a hallmark feature of ASD and lead, to varying degrees, to atypical visual attention toward many kinds of stimuli, especially human faces. Understanding the visual attention of children with ASD could contribute to related research in medical science, psychology, and education. We first construct the Visual Attention on Faces for Autism Spectrum Disorder (VAFA) database, which consists of 300 natural scene images containing human faces and the corresponding eye movement data collected from 13 children with ASD. By comparing these data with those of matched typically developing (TD) controls, we quantify atypical visual attention on human faces in ASD. The statistics show that high-level factors such as face size, facial features, face pose, and facial emotions have different impacts on the visual attention of children with ASD. By combining the feature maps extracted from state-of-the-art saliency models, we obtain a visual attention model on human faces for individuals with ASD. The proposed model achieves the best performance among all compared models. With the help of the proposed model, researchers in related fields could design specialized educational content containing human faces for children with ASD, or build dedicated models for rapidly screening for ASD from eye movement data.
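The abstract states that the model combines feature maps extracted from existing saliency models but does not say how. As a rough illustration only, the sketch below assumes the simplest possible scheme: a least-squares linear weighting of the feature maps against a ground-truth fixation density map. The function names, the linear form, and the synthetic data are all assumptions made for this example, not the method described in the article.

```python
import numpy as np


def fit_combination_weights(feature_maps, target):
    """Fit linear weights (plus a bias) that map feature maps to a target map.

    feature_maps: array of shape (k, h, w), one map per saliency feature.
    target: array of shape (h, w), e.g. a fixation density map.
    Returns an array of shape (k + 1,); the last entry is the bias.
    NOTE: linear least squares is an assumption made for illustration.
    """
    k = feature_maps.shape[0]
    # Each pixel is one training sample; each feature map is one column.
    X = feature_maps.reshape(k, -1).T
    X = np.hstack([X, np.ones((X.shape[0], 1))])
    y = target.reshape(-1)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def combine_maps(feature_maps, weights):
    """Apply fitted weights and normalize the result to sum to 1."""
    k = feature_maps.shape[0]
    combined = np.tensordot(weights[:k], feature_maps, axes=1) + weights[k]
    combined = np.clip(combined, 0.0, None)  # saliency is non-negative
    total = combined.sum()
    return combined / total if total > 0 else combined


# Synthetic stand-ins for real feature maps and fixation data (hypothetical).
rng = np.random.default_rng(0)
maps = rng.random((3, 8, 8))
true_w = np.array([0.5, 0.3, 0.2])
target = np.tensordot(true_w, maps, axes=1)

weights = fit_combination_weights(maps, target)
prediction = combine_maps(maps, weights)
```

In practice the weights would be fitted on training images and applied to held-out images, and a nonlinear learner could replace the linear fit; the sketch only shows the general idea of learning how much each feature map contributes.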



    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 15, Issue 3s
      Special Issue on Face Analysis for Applications and Special Issue on Affective Computing for Large-Scale Heterogeneous Multimedia Data
      November 2019
      304 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/3368027

      Copyright © 2019 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 15 October 2019
      • Accepted: 1 May 2019
      • Revised: 1 March 2019
      • Received: 1 November 2018


      Qualifiers

      • research-article
      • Research
      • Refereed
