
Human and DNN Classification Performance on Images With Quality Distortions: A Comparative Study

Published: 16 March 2019

Abstract

Image quality is an important practical challenge that is often overlooked in the design of machine vision systems. Machine vision systems are commonly trained and tested on high-quality image datasets, yet in practical applications the input images cannot be assumed to be of high quality. Modern deep neural networks (DNNs) have been shown to perform poorly on images affected by blur or noise distortions. In this work, we investigate whether human subjects also perform poorly on distorted stimuli and provide a direct comparison with the performance of DNNs. Specifically, we study the effect of Gaussian blur and additive Gaussian noise on human and DNN classification performance. We perform two experiments: a crowd-sourced experiment with unlimited stimulus display time and a lab experiment with a 100 ms display time. In both cases, we find that humans outperform neural networks on distorted stimuli, even when the networks are retrained with distorted data.
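The two distortion types the abstract names, Gaussian blur and additive Gaussian noise, can be sketched as follows. This is an illustrative reconstruction, not the authors' experimental code: the helper names, the 224x224 image size, and the sigma values are assumptions chosen for the demo, not the distortion levels used in the study.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_distortion(image, sigma):
    """Gaussian blur: convolve each color channel with a Gaussian
    kernel of standard deviation `sigma` (in pixels)."""
    # sigma=0 on the last axis leaves the channel dimension untouched.
    return gaussian_filter(image.astype(np.float64), sigma=(sigma, sigma, 0))

def noise_distortion(image, sigma, seed=None):
    """Additive Gaussian noise: i.i.d. N(0, sigma^2) samples added to
    each pixel, then clipped back to the valid [0, 255] intensity range."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 255.0)

# Demo on a synthetic 224x224 RGB image (a common DNN input size).
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(224, 224, 3)).astype(np.float64)
blurred = blur_distortion(clean, sigma=3.0)   # sigma chosen for illustration
noisy = noise_distortion(clean, sigma=20.0, seed=1)
```

In a comparison like the one described, stimuli distorted at several such sigma levels would be shown to human subjects and fed to the networks, and classification accuracy tracked as a function of distortion severity.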



• Published in

  ACM Transactions on Applied Perception, Volume 16, Issue 2
  April 2019
  94 pages
  ISSN: 1544-3558
  EISSN: 1544-3965
  DOI: 10.1145/3320114

  Copyright © 2019 ACM

  Publisher

  Association for Computing Machinery, New York, NY, United States

  Publication History

  • Received: 1 June 2018
  • Revised: 1 December 2018
  • Accepted: 1 December 2018
  • Published: 16 March 2019

          Qualifiers

          • research-article
          • Research
          • Refereed
