ABSTRACT
Photo community sites such as Flickr and Picasa Web Album host a massive number of personal photos, with millions of new photos uploaded every month. These photos constitute an overwhelming source of images that requires effective management, and the need to annotate these web images is increasingly pressing. This paper addresses the problem by considering two kinds of annotation: semantic annotation and geographic annotation. Both are useful for image search and retrieval and for facilitating communities and social networks. We propose a novel method, Logistic Canonical Correlation Regression (LCCR), for the annotation task. The model exploits the canonical correlation between heterogeneous features and an annotation lexicon of interest, and builds a generalized annotation engine on these canonical correlations to produce enhanced annotation for web images. We validate the effectiveness of our algorithm using a dataset of over 380,000 images tagged with GPS coordinates.
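The abstract does not spell out the LCCR formulation, so the following is only a rough illustration of the general idea it names: find canonical correlations between image features and a tag lexicon, then fit a logistic model on the canonical projections. Everything here is an assumption made for illustration, not the authors' algorithm: the synthetic data, the helper names `fit_cca` and `fit_logistic`, the regularization constant, and the use of plain gradient descent.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Regularized linear CCA (illustrative). Returns the mean of X, the
    projection matrix for X, and the top-k canonical correlations."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whiten both views with Cholesky factors, then take the SVD of the
    # whitened cross-covariance; singular values are the correlations.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, s, _ = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])  # map back to the original feature space
    return mx, Wx, s[:k]

def fit_logistic(Z, Y, lr=0.5, steps=500):
    """One independent logistic model per tag, trained by gradient descent."""
    Zb = np.hstack([Z, np.ones((Z.shape[0], 1))])  # append a bias column
    W = np.zeros((Zb.shape[1], Y.shape[1]))
    for _ in range(steps):
        P = 1.0 / (1.0 + np.exp(-Zb @ W))
        W -= lr * Zb.T @ (P - Y) / Zb.shape[0]
    return W

def predict(Z, W):
    Zb = np.hstack([Z, np.ones((Z.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-Zb @ W))

# Synthetic stand-in for (image features, tag indicators): both views are
# driven by a shared 2-D latent factor. This is hypothetical data.
rng = np.random.default_rng(0)
h = rng.standard_normal((400, 2))
X = h @ rng.standard_normal((2, 10)) + 0.1 * rng.standard_normal((400, 10))
Y = (h @ rng.standard_normal((2, 3)) > 0).astype(float)

mx, Wx, corrs = fit_cca(X, Y, k=2)
Z = (X - mx) @ Wx                       # canonical projections of the features
W = fit_logistic(Z, Y)
acc = float(((predict(Z, W) > 0.5) == (Y > 0.5)).mean())  # training accuracy
```

The design point this sketch tries to convey is that the logistic model is fit in the low-dimensional space where features and tags are maximally correlated, rather than on the raw features directly.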
Index Terms
Enhancing semantic and geographic annotation of web images via logistic canonical correlation regression