Abstract
We study explainable AI (XAI) for face recognition, and for face verification in particular. Face verification has become a crucial task and is deployed in many applications, such as access control, surveillance, and automatic personal log-on on mobile devices. With the increasing amount of data, deep convolutional neural networks can achieve very high accuracy on face verification. Beyond strong performance, however, deep face verification models need to be more interpretable so that we can trust the results they produce. In this article, we propose a novel similarity metric, called explainable cosine (xCos), that comes with a learnable module that can be plugged into most verification models to provide meaningful explanations. With xCos, we can see which parts of the two input faces are similar, where the model directs its attention, and how the local similarities are weighted to form the output xCos score. We demonstrate the effectiveness of the proposed method on LFW and various competitive benchmarks: it provides novel and desirable interpretability for face verification while preserving accuracy when plugged into existing face recognition models.
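The weighting idea described above can be illustrated with a minimal sketch. The abstract does not give the formulation, so everything here is an assumption made for illustration: the function names `patch_cosine_map` and `weighted_similarity` are hypothetical, the two faces are assumed to be represented as grid-shaped feature maps of shape (H, W, C), and the attention weights are assumed to be softmax-normalized over the grid. It is not the authors' implementation.

```python
import numpy as np

def patch_cosine_map(feat_a, feat_b, eps=1e-8):
    """Cosine similarity between matching grid cells of two (H, W, C) feature maps.

    Assumption: each face is encoded as an H x W grid of C-dimensional local features.
    """
    dot = np.sum(feat_a * feat_b, axis=-1)
    norms = np.linalg.norm(feat_a, axis=-1) * np.linalg.norm(feat_b, axis=-1)
    return dot / (norms + eps)  # shape (H, W), values in [-1, 1]

def weighted_similarity(cos_map, attention_logits):
    """Combine local similarities into one score with attention weights.

    Assumption: the weights are a softmax over all grid cells, so they sum to 1
    and the score stays in [-1, 1].
    """
    shifted = attention_logits - attention_logits.max()  # numerical stability
    weights = np.exp(shifted)
    weights = weights / weights.sum()
    score = float(np.sum(weights * cos_map))
    return score, weights

# Usage with random stand-in features (hypothetical 7 x 7 grid, 32 channels).
rng = np.random.default_rng(0)
face_a = rng.normal(size=(7, 7, 32))
face_b = rng.normal(size=(7, 7, 32))
logits = rng.normal(size=(7, 7))

cos_map = patch_cosine_map(face_a, face_b)
score, weights = weighted_similarity(cos_map, logits)
print("similarity score:", score)
```

In a sketch like this, `cos_map` indicates which grid cells of the two faces are similar, `weights` indicates where attention is placed, and their elementwise product shows how each cell contributes to the final score.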
xCos: An Explainable Cosine Metric for Face Verification Task