Research Article

Exploring the Effect of High-frequency Components in GANs Training

Published: 16 March 2023

Abstract

Generative Adversarial Networks (GANs) can generate images that are visually indistinguishable from real images. However, recent studies have revealed that generated and real images differ significantly in the frequency domain. In this article, we argue that this frequency gap is caused by the high-frequency sensitivity of the discriminator. According to our observations, during the training of most GANs, severe high-frequency differences make the discriminator focus excessively on high-frequency components, which hinders the generator from fitting the low-frequency components that are important for learning image content. We then propose two simple yet effective image pre-processing operations in the frequency domain that eliminate the side effects of high-frequency differences in GAN training: High-frequency Confusion (HFC) and High-frequency Filter (HFF). The proposed operations are general and can be applied to most existing GANs at a fraction of the cost. Their effectiveness is verified across multiple loss functions, network architectures, and datasets. Specifically, the proposed HFF achieves significant FID improvements of 42.5% on CelebA (128×128) unconditional generation based on SNGAN, 30.2% on CelebA unconditional generation based on SSGAN, and 69.3% on CelebA unconditional generation based on InfoMaxGAN. Furthermore, we adopt HFF as the first attempt at data augmentation in the frequency domain for contrastive learning, achieving state-of-the-art performance on unconditional generation. Code is available at https://github.com/iceli1007/HFC-and-HFF.
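The two operations named in the abstract can be illustrated with a minimal NumPy sketch: a low-pass "High-frequency Filter" that zeroes spectral components beyond a cutoff, and a "High-frequency Confusion" that replaces one image's high-frequency band with another's. The `radius_ratio` cutoff and the ideal (hard-mask) filter shape here are assumptions for illustration, not necessarily the paper's exact settings.

```python
import numpy as np

def high_frequency_filter(img, radius_ratio=0.5):
    """Low-pass an image: keep only frequencies within a circular band
    around the spectrum centre (an ideal filter; the paper's exact
    filter design may differ)."""
    h, w = img.shape
    spectrum = np.fft.fftshift(np.fft.fft2(img))      # centre the low frequencies
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = dist <= radius_ratio * min(h, w) / 2       # hard circular cutoff
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

def high_frequency_confusion(img_a, img_b, radius_ratio=0.5):
    """Combine the low-frequency band of img_a with the high-frequency
    band of img_b, so the discriminator cannot rely on img_a's
    high frequencies."""
    low_a = high_frequency_filter(img_a, radius_ratio)
    high_b = img_b - high_frequency_filter(img_b, radius_ratio)
    return low_a + high_b
```

Because both operations are plain image-to-image maps, they can be applied to real and generated batches alike before the discriminator sees them, which is what makes them cheap to bolt onto an existing GAN pipeline.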



Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 5
September 2023, 262 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3585398
Editor: Abdulmotaleb El Saddik


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 16 March 2023
• Online AM: 29 December 2022
• Accepted: 24 December 2022
• Revised: 19 November 2022
• Received: 2 June 2022
