Abstract
To cope with inadequate training data, many person re-identification (re-id) methods use generative adversarial networks (GANs) for data augmentation, but the GAN is typically trained independently of the re-id model. This ignores the coupling between the two, which could improve re-id performance. In this work, we propose a general framework, JoT-GAN, that jointly trains the GAN and the re-id model. It can reach the optima of both the generator and the re-id model simultaneously, with each guiding the other's training through a discriminator. The re-id model benefits in two ways: (1) the adversarial training encourages it to fool the discriminator, and (2) the generated samples augment its training data. Extensive results on benchmark datasets show that, for the re-id model trained with the identification loss as well as the triplet loss, the proposed joint training framework outperforms existing methods that train the two separately and achieves state-of-the-art re-id performance.
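The three-player setup described above can be written, as a rough sketch only, as a single min-max objective. The abstract does not give the actual loss, so everything below is an assumption modeled on a generic three-player adversarial game: the symbols G (generator), R (re-id model), D (discriminator), the noise z, the identity label y, the weight \lambda, and the exact form of each term are all hypothetical.

```latex
% Hypothetical notation, NOT taken from the paper:
%   G = generator, R = re-id model, D = discriminator,
%   (x, y) = real image and its identity label, z = noise,
%   R(x) = the re-id model's output for x, \lambda = assumed weight.
\begin{aligned}
\min_{G,\,R}\ \max_{D}\quad
  & \mathbb{E}_{(x,y)\sim p_{\mathrm{data}}}\big[\log D(x,y)\big] \\
  & + \mathbb{E}_{x\sim p_{\mathrm{data}}}\big[\log\big(1 - D(x, R(x))\big)\big] \\
  & + \mathbb{E}_{z\sim p_z,\ y\sim p_y}\big[\log\big(1 - D(G(z,y), y)\big)\big] \\
  & + \lambda\,\mathcal{L}_{\mathrm{reid}}\big(R;\ \mathcal{X}_{\mathrm{real}} \cup \mathcal{X}_{\mathrm{gen}}\big)
\end{aligned}
```

Under these assumptions, the second term corresponds to reason (1), since R lowers the objective by making D accept its outputs, and the last term corresponds to reason (2), since the re-id loss (identification or triplet) is computed over both real and generated samples.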
JoT-GAN: A Framework for Jointly Training GAN and Person Re-Identification Model