research-article

Towards optimizing human labeling for interactive image tagging

Published: 19 August 2013

Abstract

Interactive tagging combines human and computer effort to assign descriptive keywords to image content in a semi-automatic way. By striking a compromise between tagging performance and manual cost, it avoids the problems of both fully automatic tagging and purely manual tagging. However, conventional research on interactive tagging focuses mainly on sample selection and on models for tag prediction. In this work, we investigate interactive tagging from a different angle. We introduce an interactive image tagging framework that makes fuller use of human labeling effort: it can reach a specified tagging performance with less manual labeling, or achieve better tagging performance at a specified labeling cost. In the framework, hashing enables quick clustering of image regions, and a dynamic multiscale clustering labeling strategy is proposed so that users can label a large group of similar regions at a time. We also employ a tag refinement method so that several inappropriate tags can be automatically corrected. Experiments on a large dataset demonstrate the effectiveness of our approach.
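
The abstract does not specify the hashing scheme or the clustering procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes random-hyperplane locality-sensitive hashing over region feature vectors, treats regions that share a hash code as one group to be labeled together, and approximates a coarser "scale" by keeping only a prefix of the hash bits. All function names, parameters, and the example features are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (Gaussian normals) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def bucket_regions(features, hyperplanes, num_bits=None):
    """Group region ids by hash code.

    Using only the first num_bits hyperplanes gives fewer, larger groups
    (an assumed stand-in for a coarser clustering scale).
    """
    planes = hyperplanes[:num_bits]
    buckets = defaultdict(list)
    for region_id, vector in features.items():
        buckets[hash_code(vector, planes)].append(region_id)
    return dict(buckets)


def propagate_labels(buckets, bucket_labels):
    """Copy the tag a user gave to a bucket onto every region in that bucket."""
    tags = {}
    for code, region_ids in buckets.items():
        if code in bucket_labels:
            for region_id in region_ids:
                tags[region_id] = bucket_labels[code]
    return tags


if __name__ == "__main__":
    # Hypothetical 3-dimensional region features.
    features = {
        "region_a": [1.0, 2.0, -0.5],
        "region_b": [2.0, 4.0, -1.0],   # same direction as region_a
        "region_c": [-3.0, 0.5, 2.0],
    }
    planes = make_hyperplanes(num_bits=8, dim=3, seed=1)
    fine = bucket_regions(features, planes)
    # One user action labels the whole bucket that contains region_a.
    labels = {hash_code(features["region_a"], planes): "sky"}
    print(propagate_labels(fine, labels))
```

In this sketch a single labeling action tags every region in a bucket, which is the sense in which grouping can stretch a fixed labeling budget. A real system would still need a way to let users split or correct impure groups, which this example does not attempt.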


      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 9, Issue 4
        August 2013
        168 pages
        ISSN: 1551-6857
        EISSN: 1551-6865
        DOI: 10.1145/2501643

        Copyright © 2013 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 19 August 2013
        • Accepted: 1 March 2013
        • Revised: 1 April 2012
        • Received: 1 July 2011

        Qualifiers

        • research-article
        • Research
        • Refereed
