research-article

Mask or Non-Mask? Robust Face Mask Detector via Triplet-Consistency Representation Learning

Published: 25 January 2022

Abstract

In the absence of vaccines or medicines to stop COVID-19, wearing a face mask is one of the effective ways to slow the spread of the coronavirus and reduce the overloading of healthcare. Mandating face masks or coverings in public areas, however, requires additional human resources, and the monitoring work is tedious and attention-intensive. A promising way to automate the monitoring process is to leverage existing object detection models to detect faces with or without masks. Security officers then do not have to stare at monitoring devices or crowds; they only have to deal with the alerts triggered when a face without a mask is detected. Existing object detection models usually focus on designing CNN-based network architectures for extracting discriminative features. However, the training datasets for face mask detection are small, and the difference between faces with and without masks is subtle. Therefore, in this article, we propose a face mask detection framework that uses a context attention module to enable effective attention in the feed-forward convolutional neural network by adapting the feature refinement of its attention maps. We further propose an anchor-free detector with Triplet-Consistency Representation Learning, which integrates a consistency loss and a triplet loss to deal with the small-scale training data and the similarity between masks and occlusions. Extensive experimental results show that our method outperforms other state-of-the-art methods. The source code is publicly available, to improve public health, at https://github.com/wei-1006/MaskFaceDetection.
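To make the idea of integrating a triplet loss with a consistency loss concrete, the following minimal sketch combines the two terms in their common textbook forms. It is an illustration under stated assumptions, not the formulation used in the article: the Euclidean distance, the mean-squared consistency term, the `margin` and `weight` values, and all function names are hypothetical choices made for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Textbook triplet loss: pull the anchor toward the positive sample
    and push it at least `margin` farther from the negative sample."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


def consistency_loss(pred_a, pred_b):
    """Mean squared difference between predictions for two views of the
    same image (assumed form; other divergences could be used instead)."""
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)


def combined_loss(anchor, positive, negative, pred_a, pred_b,
                  margin=1.0, weight=0.5):
    """Weighted sum of the two terms; `weight` is a hypothetical
    balancing hyperparameter, not a value taken from the article."""
    return (triplet_loss(anchor, positive, negative, margin)
            + weight * consistency_loss(pred_a, pred_b))


if __name__ == "__main__":
    # Toy 2-D embeddings: anchor and positive stand for masked faces,
    # negative stands for a face occluded by something other than a mask.
    anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [0.0, 0.5]
    # Toy class scores for an image and an augmented view of it.
    original_view, augmented_view = [0.9, 0.1], [0.7, 0.3]
    print(combined_loss(anchor, positive, negative,
                        original_view, augmented_view))
```

In this toy setting the negative sample lies closer to the anchor than the positive one does, so the triplet term is non-zero, which mirrors the hard case the abstract describes where an occlusion resembles a mask. The consistency term penalizes disagreement between the two views, one common way to extract more training signal from a small dataset.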



    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 1s
      February 2022
      352 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3505206


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 25 January 2022
      • Accepted: 1 June 2021
      • Revised: 1 May 2021
      • Received: 1 December 2020

      Qualifiers

      • research-article
      • Refereed
