Seeing Crucial Parts: Vehicle Model Verification via a Discriminative Representation Model

Published: 25 January 2022
Abstract

Widely deployed surveillance cameras have produced large amounts of street-scene data, which contain one important but long-neglected object: the vehicle. Here we focus on the challenging problem of vehicle model verification. Most previous works employ global features (e.g., fully connected features) to perform vehicle-level deep metric learning (e.g., with a triplet-based network). However, we argue that it is worthwhile to investigate the distinctiveness of local features and to consider vehicle-part-level metric learning that reduces the intra-class variance as much as possible. In this article, we introduce a simple yet powerful deep model, the enforced intra-class alignment network (EIA-Net), which learns a more discriminative image representation by localizing key vehicle parts and jointly incorporating two distance metrics: a vehicle-level embedding and a vehicle-part-sensitive embedding. For feature learning, we propose an effective feature extraction module composed of two components: a region proposal network (RPN)-based network and a part-based CNN. The RPN is used to locate key vehicle regions and aggregate local features over them, whereas the part-based CNN offers supplementary global features for the RPN-based network. The fused features learned by the feature extraction module are passed to the deep metric learning module. In particular, we derive an enforced intra-class alignment loss that re-utilizes key vehicle part information to further reduce intra-class variance. Furthermore, we modify the coupled cluster loss to model the vehicle-level embedding by enlarging the inter-class variance while shrinking the intra-class variance. Extensive experiments on the benchmark datasets VehicleID and CompCars show that the proposed EIA-Net significantly outperforms state-of-the-art approaches for vehicle model verification. We also conduct comprehensive experiments on vehicle re-identification datasets (i.e., VehicleID and VeRi776) to validate the generalization ability of the proposed method.
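The coupled-cluster-style objective described in the abstract (enlarging inter-class variance while shrinking intra-class variance around a cluster center) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function name `coupled_cluster_style_loss`, the centroid-based formulation, and the hinge margin are assumptions chosen to show the general idea on toy embeddings.

```python
import numpy as np

def coupled_cluster_style_loss(pos, neg, margin=1.0):
    """Hedged sketch of a coupled-cluster-style metric loss.

    Pulls same-class (positive) embeddings toward their centroid while
    requiring every positive to be closer to that centroid than the
    hardest (nearest) negative embedding by at least `margin`.
    """
    center = pos.mean(axis=0)                        # positive-cluster centroid
    intra = np.linalg.norm(pos - center, axis=1)     # intra-class distances
    inter = np.linalg.norm(neg - center, axis=1)     # negative-to-centroid distances
    # Hinge term: penalize positives that are not separated from the
    # closest negative by the margin.
    return np.maximum(0.0, intra - inter.min() + margin).mean()

rng = np.random.default_rng(0)
pos = rng.normal(0.0, 0.1, size=(4, 8))              # tight same-model cluster
neg = rng.normal(3.0, 0.1, size=(4, 8))              # distant other-model cluster
loss_separated = coupled_cluster_style_loss(pos, neg)
loss_overlapping = coupled_cluster_style_loss(pos, pos + 0.05)
```

With well-separated clusters the hinge is inactive and the loss is small; when negatives overlap the positive cluster, the loss grows, which is the behavior a training signal for reducing intra-class variance needs.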



Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 1s
February 2022, 352 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3505206


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 25 January 2022
      • Accepted: 1 July 2021
      • Revised: 1 May 2021
      • Received: 1 December 2020


      Qualifiers

      • research-article
      • Refereed
