research-article

Beyond the Parts: Learning Coarse-to-Fine Adaptive Alignment Representation for Person Search

Published: 25 February 2023

Abstract

Person search is a time-consuming computer vision task that entails locating and recognizing query people in scene images. Body parts are commonly mismatched during matching because of position variation, occlusion, and partially absent body parts, which leads to unsatisfactory person search results. Existing approaches that extract local features of the human body from keypoint information cannot handle the search task when distinct body parts are misaligned, and they fail to exploit multiple granularities, which is crucial in the person search process. Moreover, alignment learning methods learn body part features with fixed and equal weights, ignoring beneficial contextual information (e.g., an umbrella carried by the pedestrian) that supplies compelling clues for identifying the person. In this paper, we propose a Coarse-to-Fine Adaptive Alignment Representation (CFA2R) network that learns multi-granularity features for misaligned person search from a coarse-to-fine perspective. To exploit more beneficial body parts and the related context of the cropped pedestrians, we design a Part-Attentional Progressive Module (PAPM) that guides the network to focus on informative body parts and beneficial accessory regions. In addition, we propose a Re-weighting Alignment Module (RAM) that emphasizes the more contributive parts instead of treating all parts equally. Specifically, because different parts contribute unequally during image matching, a Re-weighting Reconstruction module reconstructs part features with adaptive rather than fixed weights. Extensive experiments on the CUHK-SYSU and PRW datasets demonstrate the competitive performance of the proposed method.
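
The contrast the abstract draws between fixed, equal part weights and adaptive re-weighting can be illustrated with a small sketch. This is not the authors' RAM implementation: the softmax weighting, the per-part reliability scores, the toy two-dimensional features, and all function names below are assumptions chosen only to show why down-weighting an unreliable part can change a matching score.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def softmax(scores):
    """Turn arbitrary per-part scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def part_similarity(query_parts, gallery_parts, weights):
    """Weighted sum of per-part cosine similarities."""
    return sum(
        w * cosine(q, g)
        for w, q, g in zip(weights, query_parts, gallery_parts)
    )


# Toy features for three body parts (hypothetical values).
query = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# The third gallery part does not match, e.g. because it is occluded.
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]

# Hypothetical per-part reliability scores; in a real model these would be
# predicted by the network rather than written by hand.
scores = [2.0, 2.0, -2.0]

equal = [1.0 / len(query)] * len(query)  # fixed, equal weights
adaptive = softmax(scores)               # adaptive weights from the scores

sim_equal = part_similarity(query, gallery, equal)
sim_adaptive = part_similarity(query, gallery, adaptive)

print(f"equal weights:    {sim_equal:.4f}")
print(f"adaptive weights: {sim_adaptive:.4f}")
```

With equal weights the mismatched third part pulls the overall similarity down; with the adaptive weights its influence is small, so the two well-aligned parts dominate the score.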


      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 19, Issue 3
        May 2023
        514 pages
        ISSN: 1551-6857
        EISSN: 1551-6865
        DOI: 10.1145/3582886
        • Editor: Abdulmotaleb El Saddik

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 25 February 2023
        • Online AM: 8 October 2022
        • Accepted: 26 September 2022
        • Revised: 30 August 2022
        • Received: 29 December 2021