ABSTRACT
This paper presents a strategy to identify the geographic location of videos. First, it relies on a multi-modal cascade pipeline that exploits the available sources of information, namely the user's upload history, the user's social network, and a visual-based matching technique. Second, it introduces a novel divide & conquer strategy to better exploit the tags associated with the input video: it pre-selects one or several geographic areas of interest of higher expected relevance, then performs a deeper analysis inside the selected area(s) to return the coordinates most likely to be related to the input tags. The experiments were conducted as part of the MediaEval 2012 Placing Task. Our approach, which differs significantly from the other submitted techniques, achieves the best results on this benchmark when considering the same amount of external information, i.e. when using neither gazetteers nor any other kind of external information.
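The divide & conquer idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify how areas are defined or scored, so the regular grid, the raw tag-count scoring, the tag-overlap refinement, and all names (`cell_of`, `build_index`, `locate`, `TRAINING`) are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy training set: (latitude, longitude, tags).
TRAINING = [
    (48.858, 2.294, ["paris", "eiffel", "tower"]),
    (48.861, 2.336, ["paris", "louvre", "museum"]),
    (40.689, -74.045, ["newyork", "statue", "liberty"]),
    (51.501, -0.142, ["london", "palace"]),
]

def cell_of(lat, lon, size):
    """Map coordinates to a grid cell index; size is the cell width in degrees."""
    return (int((lat + 90) // size), int((lon + 180) // size))

def build_index(training, size):
    """Count, for every grid cell, how many training items carry each tag."""
    index = defaultdict(Counter)
    for lat, lon, tags in training:
        index[cell_of(lat, lon, size)].update(set(tags))
    return index

def locate(tags, training, coarse=10.0, top_k=1):
    """Return (lat, lon) of the best-matching training item, or None."""
    query = set(tags)

    # Divide: score coarse cells by how often the query tags occur in them
    # (assumed scoring), and keep the top_k cells with a non-zero score.
    index = build_index(training, coarse)
    scores = {cell: sum(counts[t] for t in query) for cell, counts in index.items()}
    scores = {cell: s for cell, s in scores.items() if s > 0}
    if not scores:
        return None  # no tag evidence; a cascade would fall back to another cue
    selected = set(sorted(scores, key=scores.get, reverse=True)[:top_k])

    # Conquer: inside the selected cells only, pick the training item
    # sharing the most tags with the query (assumed refinement step).
    best, best_overlap = None, 0
    for lat, lon, item_tags in training:
        if cell_of(lat, lon, coarse) not in selected:
            continue
        overlap = len(query & set(item_tags))
        if overlap > best_overlap:
            best, best_overlap = (lat, lon), overlap
    return best

print(locate(["louvre", "paris"], TRAINING))
```

Restricting the fine-grained comparison to the pre-selected cells is what keeps the second stage cheap; when no tag matches, the function returns `None`, which is where the other cues named in the abstract (upload history, social network, visual matching) would presumably take over.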
Index Terms
Retrieving geo-location of videos with a divide & conquer hierarchical multimodal approach




