DOI: 10.1145/2509230.2509238
research-article

A novel fusion method for integrating multiple modalities and knowledge for multimodal location estimation

Published: 21 October 2013

ABSTRACT

This article describes a novel fusion approach that uses multiple modalities and knowledge sources to improve the accuracy of multimodal location estimation algorithms. The problem of "multimodal location estimation", or "placing", involves associating geo-locations with consumer-produced multimedia data, such as videos or photos, that have not been tagged using GPS. Our algorithm integrates data from the visual and textual modalities with external geographical knowledge bases by building a hierarchical model that combines data-driven and semantic methods to group visual and textual features together within geographical regions. We evaluate our algorithm on the MediaEval 2010 Placing Task dataset and show that our system significantly outperforms other state-of-the-art approaches, successfully locating about 40% of the videos to within a radius of 100 m.
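
The "within a radius of 100 m" figure is a distance-threshold accuracy: the share of test items whose estimated coordinates fall within a given distance of the true location. The sketch below shows one common way such a score could be computed. It is not the authors' evaluation code. It assumes a spherical Earth and the haversine great-circle formula, and the function names and sample coordinates are hypothetical.

```python
import math

# Assumption: spherical Earth with the mean radius in metres.
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def fraction_within(estimates, ground_truth, radius_m):
    """Share of estimates that lie within radius_m of the true location.

    Both arguments are equal-length sequences of (lat, lon) pairs in degrees.
    """
    if len(estimates) != len(ground_truth):
        raise ValueError("estimates and ground_truth must have equal length")
    if not estimates:
        return 0.0
    hits = sum(
        1
        for (elat, elon), (tlat, tlon) in zip(estimates, ground_truth)
        if haversine_m(elat, elon, tlat, tlon) <= radius_m
    )
    return hits / len(estimates)


if __name__ == "__main__":
    # Hypothetical sample: the first estimate is a few tens of metres off,
    # the second is in a different city.
    estimates = [(37.8716, -122.2727), (48.8584, 2.2945)]
    truth = [(37.8719, -122.2730), (51.5007, -0.1246)]
    print(fraction_within(estimates, truth, radius_m=100.0))  # prints 0.5
```

Reporting the same fraction at several radii (for example 100 m, 1 km, 10 km) gives a fuller picture than a single threshold, since it separates near misses from estimates placed in the wrong region entirely.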

Published in

GeoMM '13: Proceedings of the 2nd ACM international workshop on Geotagging and its applications in multimedia
October 2013
42 pages
ISBN: 9781450323918
DOI: 10.1145/2509230

Copyright © 2013 ACM

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

GeoMM '13 Paper Acceptance Rate: 5 of 9 submissions, 56%
Overall Acceptance Rate: 14 of 26 submissions, 54%
