
Detection of AI-Manipulated Fake Faces via Mining Generalized Features

Published: 04 March 2022

Abstract

AI-manipulated face techniques have developed rapidly in recent years, raising new security concerns in society. Although existing detection methods cover different categories of fake faces, their performance on fake faces produced by "unseen" manipulation techniques remains poor because of the distribution bias across manipulation techniques. To address this problem, we propose a novel framework that mines intrinsic features and eliminates this distribution bias to improve generalization ability. First, we mine intrinsic clues in the channel difference image (CDI) and the spectrum image (SI), two views derived from different but fundamental aspects: the camera imaging process and an indispensable step in the AI manipulation process. Then, we introduce Octave Convolution and an attention-based fusion module to mine intrinsic features from the CDI and SI views effectively and adaptively. Finally, we design an alignment module that eliminates the bias among manipulation techniques, yielding a more generalized detection framework. We evaluate the proposed framework on four categories of fake-face datasets built with popular and state-of-the-art manipulation techniques, achieving very competitive performance. We further conduct cross-manipulation experiments, and the results demonstrate the superior generalization ability of our method.
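The abstract does not define the CDI and SI views precisely, but both admit simple illustrative constructions: a channel difference image can be formed from pairwise differences of color channels (motivated by the camera imaging pipeline), and a spectrum image from the log-magnitude of the 2-D DFT (motivated by the upsampling step common to AI manipulation). The sketch below is an assumption-based approximation, not the paper's exact definitions:

```python
import numpy as np

def channel_difference_image(rgb):
    """Illustrative channel difference image (CDI): pairwise
    differences between color channels, stacked as a 3-channel map.
    The paper's exact CDI construction may differ."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([r - g, g - b, b - r], axis=-1)

def spectrum_image(gray):
    """Illustrative spectrum image (SI): log-magnitude of the 2-D
    DFT, with the zero-frequency component shifted to the center."""
    f = np.fft.fftshift(np.fft.fft2(gray))
    return np.log1p(np.abs(f))

# Toy example on a random "face crop" standing in for a real image.
img = np.random.rand(64, 64, 3).astype(np.float32)
cdi = channel_difference_image(img)
si = spectrum_image(img.mean(axis=-1))
print(cdi.shape, si.shape)  # (64, 64, 3) (64, 64)
```

In the framework described by the abstract, maps like these would be fed to the Octave Convolution branches and fused by the attention module before alignment.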



Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 4
November 2022, 497 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3514185
Editor: Abdulmotaleb El Saddik


Publisher

Association for Computing Machinery, New York, NY, United States

          Publication History

          • Published: 4 March 2022
          • Accepted: 1 November 2021
          • Revised: 1 September 2021
          • Received: 1 April 2021
Published in TOMM Volume 18, Issue 4


          Qualifiers

          • research-article
          • Refereed
