PMAL: A Proxy Model Active Learning Approach for Vision Based Industrial Applications

Published: 6 October 2022

Abstract

Deep learning models' performance correlates strongly with the availability of annotated data; however, massive data labeling is laborious, expensive, and error-prone when performed by human experts. Active learning (AL) handles this challenge effectively by selecting uncertain samples from the unlabeled data collection, but existing AL approaches involve repetitive human feedback for labeling the uncertain samples, rendering these techniques infeasible for deployment in industrial real-world applications. The proposed Proxy Model based Active Learning technique (PMAL) addresses this issue by replacing the human oracle with a deep learning model, reducing human expertise to labeling only two small subsets of data: one for training the proxy model and one for initializing the AL loop. In PMAL, the proxy model is first trained on a small subset of labeled data, after which it acts as an oracle for annotating uncertain samples. Second, training of the active model, extraction of uncertain samples via uncertainty sampling, and annotation through the proxy model are repeated for a predefined number of iterations to accumulate labeled data and improve accuracy. Finally, the active model is evaluated on testing data to verify the effectiveness of the technique for practical applications. Correct annotations by the proxy model are ensured by exploiting explainable artificial intelligence, and an emerging vision transformer is used as the active model to maximize accuracy. Experimental results reveal that the proposed method outperforms the state-of-the-art in terms of minimal labeled-data usage and improves accuracy by 2.2%, 2.6%, and 1.35% on the Caltech-101, Caltech-256, and CIFAR-10 datasets, respectively. Since the proposed technique offers a highly practical solution for exploiting huge multimedia data, it can be widely adopted in different evolving industrial domains.
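To make the workflow concrete, below is a minimal runnable sketch of the three steps the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: scikit-learn logistic-regression classifiers stand in for the paper's proxy model and vision-transformer active model, entropy over predicted class probabilities is used as one common uncertainty-sampling criterion (the paper's exact criterion may differ), the XAI-based verification of proxy annotations is omitted, and all subset sizes, the query budget, and the iteration count are placeholder values.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy feature vectors standing in for an image-classification dataset.
X, y = make_classification(n_samples=2000, n_features=64, n_informative=32,
                           n_classes=4, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=0.25,
                                                  random_state=0)

# Two small "human-labeled" subsets: one trains the proxy model,
# the other seeds the active-learning loop.
all_idx = np.arange(len(X_pool))
proxy_idx = rng.choice(all_idx, size=300, replace=False)
rest = np.setdiff1d(all_idx, proxy_idx)
seed_idx = rng.choice(rest, size=100, replace=False)
unlabeled = list(np.setdiff1d(rest, seed_idx))

# Step 1: train the proxy model; it replaces the human oracle from here on.
proxy = LogisticRegression(max_iter=1000).fit(X_pool[proxy_idx], y_pool[proxy_idx])

labeled_X = list(X_pool[seed_idx])
labeled_y = list(y_pool[seed_idx])

# Step 2: AL loop: train the active model, pick the most uncertain
# unlabeled samples, and let the proxy model annotate them.
for _ in range(5):                                    # predefined iterations
    active = LogisticRegression(max_iter=1000).fit(labeled_X, labeled_y)
    probs = active.predict_proba(X_pool[unlabeled])
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    picked = [unlabeled[i] for i in np.argsort(entropy)[-50:]]  # query budget
    labeled_X.extend(X_pool[picked])
    labeled_y.extend(proxy.predict(X_pool[picked]))   # proxy acts as oracle
    unlabeled = [i for i in unlabeled if i not in set(picked)]

# Step 3: evaluate the final active model on held-out test data.
print("active model test accuracy:", active.score(X_test, y_test))

In the paper's setting, the two classifiers would be deep vision models trained on images, and the proxy model's annotations would additionally be screened with explainability maps before being added to the labeled pool.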


Published in
ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 2s (June 2022), 383 pages.
ISSN: 1551-6857; EISSN: 1551-6865; Issue DOI: 10.1145/3561949
Editor: Abdulmotaleb El Saddik


Publisher: Association for Computing Machinery, New York, NY, United States

Publication History
• Received: 31 December 2021
• Revised: 12 April 2022
• Accepted: 29 April 2022
• Online AM: 21 June 2022
• Published: 6 October 2022
