Abstract
While much exciting progress is being made in mobile visual search, one important question has been left unexplored in all current systems: when searching for objects or scenes in the 3D world, which viewing angle is most likely to succeed? In particular, if the first query fails to find the right target, how should the user move the mobile camera to form the second query? In this article, we propose a novel Active Query Sensing system for mobile location search, which actively suggests the best subsequent query view for recognizing the physical location in the mobile environment. The proposed system includes two unique components: (1) an offline process that analyzes the saliency of the different views associated with each geographical location and predicts the location search precision of individual views by modeling their self-retrieval score distributions; and (2) an online process that estimates the view of an unseen query and suggests the best subsequent view change. Specifically, the choice of the optimal viewing angle change for the next query is formulated with an online information-theoretic approach. Using a scalable visual search system implemented over an NYC street view dataset (0.3 million images), we show a performance gain, reducing the failure rate of mobile location search to only 12% after the second query. We have also implemented an end-to-end functional system, including user interfaces on iPhones, client-server communication, and a remote search server. This work may open up an exciting new direction for developing interactive mobile media applications through the innovative exploitation of active sensing and query formulation.
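As a rough illustration of what an information-theoretic choice of the next view could look like, the sketch below picks the candidate view that minimizes the expected entropy of the location posterior. It is not the formulation used in the article: the names `suggest_next_view`, `expected_posterior_entropy`, and `saliency` are hypothetical, and the observation model (a query returns the true location with the view's predicted precision and any other candidate uniformly otherwise) is a simplifying assumption.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_posterior_entropy(prior, precision):
    """Expected entropy of the location posterior after one more query.

    prior[l]     -- current belief that location l is the true one
    precision[l] -- assumed probability that a query taken at this view
                    retrieves location l when l is the true location
    Assumed observation model: a wrong result is spread uniformly over
    the other candidates. Requires at least two candidate locations.
    """
    n = len(prior)
    total = 0.0
    for observed in range(n):
        # Joint probability of (true location l, observed result).
        joint = [
            prior[l] * (precision[l] if l == observed
                        else (1.0 - precision[l]) / (n - 1))
            for l in range(n)
        ]
        p_observed = sum(joint)
        if p_observed > 0:
            posterior = [j / p_observed for j in joint]
            total += p_observed * entropy(posterior)
    return total


def suggest_next_view(prior, saliency):
    """Return the view whose query is expected to leave the least uncertainty.

    saliency maps a view label (hypothetical, e.g. an angle change in
    degrees) to the per-location precision list for that view.
    """
    return min(saliency,
               key=lambda v: expected_posterior_entropy(prior, saliency[v]))


# Two equally likely candidate locations; the 90-degree view is assumed
# to be far more distinctive than the 0-degree view for both of them.
prior = [0.5, 0.5]
saliency = {0: [0.5, 0.5], 90: [0.9, 0.9]}
best_view = suggest_next_view(prior, saliency)
```

In this toy setting the per-view precision values stand in for the output of the offline saliency analysis, and the prior stands in for the belief left after a failed first query; both would come from the search system in practice.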
Index Terms
Active query sensing: Suggesting the best query view for mobile visual search