Abstract
In recent years, the number of geotagged photos on community Web sites such as Flickr has grown rapidly. Discovering tourist landmarks from these photos can help us make better sense of our visual world. In this article, we report our work on mining landmarks from geotagged Flickr photos for city scene summarization and tourist recommendation. We begin by exploring the geographical and visual statistics of how Web users take photos, based on which we conduct landmark mining in two steps. First, we partition each city into geographical regions by spectral clustering over the geotags of Flickr photos. Second, within each landmark region, we present a representative photo mining scheme based on sparse representation. Our main idea is to treat landmark mining as the problem of finding photos whose visual signatures can be reconstructed from other photos of the same landmark region with minimal coding length. This sparse reconstruction scheme offers a general perspective on mining representative photos: by simplifying the data correlation constraints in our scheme, several previous approaches to representative photo discovery and landmark mining can be derived. Finally, we introduce a Hyperlink-Induced Topic Search (HITS) model to refine our landmark ranking, which incorporates community knowledge by casting landmark ranking as a dynamic page ranking problem. We have deployed the proposed landmark mining framework in a city scene summarization and navigation system that operates on one million geotagged Flickr photos from twenty metropolises worldwide. We have also quantitatively compared our scheme with several state-of-the-art methods.
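To make the Hyperlink-Induced Topic Search step more concrete, the following is a minimal sketch of the generic hub/authority iteration, not the authors' exact model. It assumes a hypothetical bipartite setup in which users act as hubs and landmarks as authorities, with a binary matrix `visits` recording which user photographed which landmark; the abstract does not specify the graph construction, so the function name, the matrix, and the toy data are illustrative only.

```python
import numpy as np


def hits(adjacency, iterations=50, tol=1e-9):
    """Generic HITS power iteration on a (hubs x authorities) matrix.

    Assumption for illustration: rows are users (hubs) and columns are
    landmarks (authorities). This is not taken from the paper.
    """
    a = np.asarray(adjacency, dtype=float)
    hubs = np.ones(a.shape[0])
    auths = np.ones(a.shape[1])
    for _ in range(iterations):
        # A landmark is authoritative if strong hubs photographed it.
        new_auths = a.T @ hubs
        # A user is a strong hub if they photographed authoritative landmarks.
        new_hubs = a @ new_auths
        # Normalize to unit length so the scores stay bounded.
        new_auths /= np.linalg.norm(new_auths) or 1.0
        new_hubs /= np.linalg.norm(new_hubs) or 1.0
        change = np.abs(new_auths - auths).sum() + np.abs(new_hubs - hubs).sum()
        hubs, auths = new_hubs, new_auths
        if change < tol:
            break
    return hubs, auths


# Toy data (hypothetical): 3 users x 3 landmarks, 1 = user photographed landmark.
visits = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
]
hubs, auths = hits(visits)
print("landmark authority scores:", np.round(auths, 3))
print("user hub scores:", np.round(hubs, 3))
```

In this toy example the first landmark, photographed by every user, receives the highest authority score. A real system would presumably weight the matrix by photo counts or visual representativeness rather than use binary visits.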
Index Terms
Mining flickr landmarks by modeling reconstruction sparsity