Abstract
Person re-identification aims to match a person of interest across non-overlapping camera views, so generating a robust and discriminative representation is crucial. Mining local cues from human body parts to describe pedestrians has been studied extensively. However, existing methods locate body parts only coarsely and do not consider the relations among different local parts. To address these problems, we propose Part-based Structured Representation Learning (PSRL), which better exploits local cues to improve the person representation. Our architecture has two main modules: Local Semantic Feature Extraction and Structured Person Representation Learning. The Local Semantic Feature Extraction module extracts local features from the semantic regions of the human body. The Structured Person Representation Learning module then fuses these local features according to the person structure, using a graph convolutional network to capture the relations among the semantic regions. The resulting structured feature encodes the underlying person structure, while the local semantic features address the misalignment caused by pose variations during feature matching. Combining the two improves the descriptive ability of the generated representation. Extensive evaluations on four standard benchmarks show that the proposed method achieves competitive performance against state-of-the-art methods.
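The two modules described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that part masks come from some external human parser, that local features are obtained by masked average pooling, that the graph convolution uses the common symmetric normalization, and that the final representation concatenates the local and structured features. All function names, shapes, the three-part chain graph, and the random weights are hypothetical.

```python
import numpy as np

def masked_average_pool(feature_map, part_masks):
    """Pool one local feature per semantic region.

    feature_map: (C, H, W) backbone output; part_masks: (P, H, W) region masks.
    Returns an array of shape (P, C).
    """
    c = feature_map.shape[0]
    p = part_masks.shape[0]
    flat_feat = feature_map.reshape(c, -1)       # (C, H*W)
    flat_mask = part_masks.reshape(p, -1)        # (P, H*W)
    area = flat_mask.sum(axis=1, keepdims=True)  # (P, 1)
    # Guard against empty regions (e.g. an occluded part).
    return flat_mask @ flat_feat.T / np.maximum(area, 1e-6)

def normalize_adjacency(adj):
    """Add self-loops, then apply D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(node_feats, adj_norm, weight):
    """One graph convolution followed by ReLU: relu(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ node_feats @ weight, 0.0)

def build_representation(feature_map, part_masks, adj, weight):
    """Concatenate local part features with a mean-pooled structured feature."""
    local = masked_average_pool(feature_map, part_masks)           # (P, C)
    structured = gcn_layer(local, normalize_adjacency(adj), weight)  # (P, D)
    return np.concatenate([local.ravel(), structured.mean(axis=0)])

# Toy input: 8-channel feature map, three horizontal bands standing in for
# head / upper body / lower body masks (an assumption, not parser output).
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 6, 4))
masks = np.zeros((3, 6, 4))
masks[0, 0:2] = 1.0
masks[1, 2:4] = 1.0
masks[2, 4:6] = 1.0
# Hypothetical body graph: head - upper body - lower body.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
weight = rng.standard_normal((8, 16))

rep = build_representation(feature_map, masks, adj, weight)
print(rep.shape)  # (40,) = 3 parts * 8 channels + 16 structured dims
```

In a trained model the weight matrix would be learned and the adjacency could be fixed from body topology or learned as well. The abstract does not say which, so the sketch hard-codes a chain graph.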
Part-based Structured Representation Learning for Person Re-identification