Abstract
Generative adversarial networks (GANs) can now produce images of remarkable realism. Two concerns remain: whether the distribution learned by a state-of-the-art GAN still suffers from mode collapse, and what to do if it does. Existing diversity tests of GAN samples are usually conducted qualitatively on a small scale and/or depend on access to the original training data as well as the trained model parameters. This article explores GAN intra-mode collapse and calibrates it in a novel black-box setting, in which access to neither the training data nor the trained model parameters is assumed. The new setting is practically demanded yet rarely explored, and it is significantly more challenging. As a first stab, we devise a set of sampling-based statistical tools that can visualize, quantify, and rectify intra-mode collapse. We demonstrate the effectiveness of the proposed diagnosis and calibration techniques, via extensive simulations and experiments, on unconditional GAN image generation (e.g., faces and vehicles). Our study reveals that intra-mode collapse is still a prevailing problem in state-of-the-art GANs and that mode collapse is diagnosable and calibratable in black-box settings. Our code is available at https://github.com/VITA-Group/BlackBoxGANCollapse.
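As a rough illustration of what a sampling-based, black-box diagnosis could look like, the sketch below draws samples, counts how many other samples fall within a fixed radius of each one in a feature space, and flags samples whose neighborhoods are unusually crowded. This is not the authors' implementation (see the repository above for that); the function names, the Euclidean metric, the radius, the threshold, and the toy sampler standing in for "generator plus feature extractor" are all assumptions made for the example.

```python
import numpy as np

def neighbor_counts(embeddings, radius):
    """Count, for each sample, the other samples within `radius` (Euclidean)."""
    sq = np.sum(embeddings ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * embeddings @ embeddings.T
    within = np.maximum(d2, 0.0) <= radius ** 2
    np.fill_diagonal(within, False)  # a sample is not its own neighbor
    return within.sum(axis=1)

def flag_dense_samples(embeddings, radius, z_thresh=3.0):
    """Flag samples whose neighbor count is a high outlier (robust z-score)."""
    counts = neighbor_counts(embeddings, radius)
    med = np.median(counts)
    mad = np.median(np.abs(counts - med))
    scale = 1.4826 * mad if mad > 0 else 1.0  # fall back when MAD is zero
    flagged = np.flatnonzero((counts - med) / scale > z_thresh)
    return flagged, counts

def toy_black_box_sampler(n_diverse=950, n_collapsed=50, dim=8, seed=0):
    """Hypothetical stand-in for sampling a GAN and embedding its outputs.

    Most samples are spread out; a small group is packed tightly around one
    point, mimicking an over-represented region of the output space.
    """
    rng = np.random.default_rng(seed)
    diverse = rng.standard_normal((n_diverse, dim))
    center = np.full(dim, 2.0)
    collapsed = center + 0.01 * rng.standard_normal((n_collapsed, dim))
    return np.vstack([diverse, collapsed])

if __name__ == "__main__":
    emb = toy_black_box_sampler()
    flagged, counts = flag_dense_samples(emb, radius=0.5)
    print("flagged samples:", flagged.size, "max neighbor count:", counts.max())
```

Only generated samples are needed here, with no training data or model weights, which is what makes the check black-box. For a real generator, the choice of feature extractor and radius would largely determine what counts as "too similar".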
Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study