DOI: 10.1145/1631272.1631292
research-article

Enhancing semantic and geographic annotation of web images via logistic canonical correlation regression

Published: 19 October 2009

ABSTRACT

Photo community sites such as Flickr and Picasa Web Albums host a massive number of personal photos, with millions of new photos uploaded every month. These photos constitute an overwhelming source of images that requires effective management. There is an increasingly pressing need for semantic annotation of these web images. This paper addresses the problem by considering two kinds of annotation: semantic annotation and geographic annotation. Both are useful for image search and retrieval and for facilitating communities and social networks. This paper proposes a novel method, Logistic Canonical Correlation Regression (LCCR), for the annotation task. The model exploits the canonical correlation between heterogeneous features and an annotation lexicon of interest, and builds a generalized annotation engine on these canonical correlations to produce enhanced annotations for web images. We validate the effectiveness of our algorithm on a dataset of over 380,000 images tagged with GPS coordinates.
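The abstract does not give the LCCR formulation, so the following is only a minimal sketch of the general idea it describes: project image features onto directions that are canonically correlated with an annotation lexicon, then fit a logistic model on those projections. It assumes plain regularized linear CCA and a gradient-descent logistic regression on synthetic data. The function names (`fit_cca`, `fit_logistic`, `predict_proba`), the data, and every parameter value are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the paper's dataset).
n = 400
z = rng.normal(size=n)                                   # shared latent factor
loadings = np.array([1.0, -1.0, 0.5, 0.8, -0.6])
X = z[:, None] * loadings + 0.3 * rng.normal(size=(n, 5))  # image features
Y = np.column_stack([
    z > 0.0,                                # tag driven by the latent factor
    z + 0.5 * rng.normal(size=n) > 0.5,     # noisier, related tag
    rng.random(n) > 0.5,                    # unrelated tag
]).astype(float)                            # binary tag-indicator matrix


def fit_cca(X, Y, k, reg=1e-3):
    """Regularized linear CCA; returns X mean, X projection, correlations."""
    mean_x = X.mean(axis=0)
    Xc, Yc = X - mean_x, Y - Y.mean(axis=0)
    m = X.shape[0]
    Cxx = Xc.T @ Xc / m + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / m + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / m
    # Whiten both views with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy)            # Lx^{-1} Cxy
    T = np.linalg.solve(Ly, T.T).T          # Lx^{-1} Cxy Ly^{-T}
    U, s, _ = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])    # map back: Lx^{-T} U
    return mean_x, Wx, s[:k]


def fit_logistic(Z, y, lr=0.5, steps=500):
    """Logistic regression with a bias term, fitted by gradient descent."""
    Zb = np.hstack([Z, np.ones((Z.shape[0], 1))])
    w = np.zeros(Zb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Zb @ w)))
        w -= lr * Zb.T @ (p - y) / len(y)
    return w


def predict_proba(Z, w):
    """Probability that the tag applies, given canonical projections Z."""
    Zb = np.hstack([Z, np.ones((Z.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-(Zb @ w)))


mean_x, Wx, corrs = fit_cca(X, Y, k=2)
Z = (X - mean_x) @ Wx                       # canonical projections of features
w = fit_logistic(Z, Y[:, 0])                # one logistic model per tag
acc = float(np.mean((predict_proba(Z, w) > 0.5) == (Y[:, 0] > 0.5)))
print("canonical correlations:", np.round(corrs, 3))
print("training accuracy for tag 0:", round(acc, 3))
```

In this sketch one logistic model is fitted per tag on the shared canonical projections. A geographic label, such as a discretized location cell, could be treated as one more column of `Y`, though that is an assumption about how the two annotation types might share a model rather than something the abstract states.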


Published in

MM '09: Proceedings of the 17th ACM international conference on Multimedia
October 2009
1202 pages
ISBN: 9781605586083
DOI: 10.1145/1631272

Copyright © 2009 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

MM '09 Paper Acceptance Rate: 50 of 305 submissions, 16%
Overall Acceptance Rate: 2,077 of 8,139 submissions, 26%
