Robust Long-Term Tracking via Localizing Occluders

Published: 17 February 2023

Abstract

Occlusion is one of the most challenging factors in long-term tracking because of its unpredictable shape. Existing works have been devoted to the design of loss functions, training strategies, or model architectures, which we argue do not directly address the root of the problem. Instead, we start from a direct and natural idea: discard the things that cover the target. We propose a novel occluder-aware representation learning framework to realize this idea. First, we design a local occluder detection module (LODM) to localize the occluders; it discriminates the non-noumenal (occluding) parts from the target based on general knowledge of the target's category. An extra dataset and a clustering strategy are proposed to supply this general knowledge. Second, we devise a feature reconstruction module to guide the occluder-aware representation learning. With these components, our localizing-occluders tracker, called LOTracker, learns an occluder-free representation and improves tracking performance in occlusion scenarios. Extensive experiments show that LOTracker achieves state-of-the-art performance on multiple benchmarks, including LaSOT, VOTLT2018, VOTLT2019, and OxUvALT.
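At a high level, the pipeline the abstract describes can be sketched as two steps: localize which parts of the search-region features belong to an occluder, then reconstruct those parts so the tracker sees an occluder-free representation. The following is a minimal, illustrative sketch, not the authors' implementation: `localize_occluders` stands in for the LODM (here reduced to thresholding a per-cell occlusion score), and `reconstruct_features` stands in for the feature reconstruction module (here reduced to copying in occluder-free template features). All function names, the thresholding heuristic, and the toy single-channel feature maps are assumptions for illustration.

```python
import numpy as np

def localize_occluders(occlusion_scores, threshold=0.5):
    """Stand-in for the LODM: mark feature cells whose occlusion
    score exceeds a threshold as occluded (True)."""
    return occlusion_scores > threshold

def reconstruct_features(search_features, template_features, occluder_mask):
    """Stand-in for the feature reconstruction module: replace
    occluded cells with occluder-free template features so the
    downstream tracker sees an (approximately) occluder-free map."""
    reconstructed = search_features.copy()
    reconstructed[occluder_mask] = template_features[occluder_mask]
    return reconstructed

# Toy example: a 4x4 single-channel feature map.
search = np.ones((4, 4))           # features of the current search region
template = np.full((4, 4), 2.0)    # occluder-free features of the target
scores = np.zeros((4, 4))
scores[1:3, 1:3] = 0.9             # an occluder covers the central 2x2 cells

mask = localize_occluders(scores)
clean = reconstruct_features(search, template, mask)
```

In the real framework both steps are learned modules operating on deep features; this sketch only fixes the data flow between them.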



Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 2s
April 2023, 545 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3572861
Editor: Abdulmotaleb El Saddik


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 17 February 2023
• Online AM: 18 August 2022
• Accepted: 12 August 2022
• Revised: 4 July 2022
• Received: 14 September 2021
