Abstract
The ubiquity of smartphones makes them ideal platforms for generating in-situ content. At well-attended events, photos captured by attendees offer diverse views, but individual photos may suffer from occlusion or abnormal lighting that obscures the scene. Such unstructured photo collections also contain significant redundancy: a scene that is partially occluded or poorly contrasted in one photo may be captured in another, possibly in greater detail. We propose an application called Autolink that automatically establishes content-based links between sensor-annotated photos in unstructured collections captured with smartphones, so that users can navigate between high-context and high-detail images. The resulting hierarchically structured image collection facilitates the design of applications for navigation and discovery, and for analytics about user photography patterns, user taste, and content/event popularity. Autolink includes a framework that constructs this hierarchy efficiently and with little content-specific training data by combining photo content processing with the associated sensor logs obtained from multiple participants. We evaluated Autolink on two real-world sensor-tagged photo datasets. The results show that Autolink clusters photos into the appropriate hierarchy 20 times faster than candidate algorithms, with at least 70% precision and 37% better recall than those algorithms.
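One way to picture the combination of sensor logs and content processing described above is a cheap sensor-only pre-filter that proposes context-to-detail links before any image content is compared. The following is a minimal illustrative sketch, not Autolink's actual pipeline: the `Photo` fields (including the hypothetical `fov_deg` field of view), the function names, and the distance and heading thresholds are all assumptions made for this example, and the content-matching step that would confirm each candidate link is left out.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    photo_id: str
    lat: float          # degrees, from the GPS log
    lon: float          # degrees, from the GPS log
    heading_deg: float  # compass heading of the camera, 0-360
    fov_deg: float      # hypothetical field of view; smaller means more zoomed in


def distance_m(a, b):
    """Approximate ground distance in meters (equirectangular, for short ranges)."""
    mean_lat = math.radians((a.lat + b.lat) / 2)
    dx = math.radians(b.lon - a.lon) * math.cos(mean_lat)
    dy = math.radians(b.lat - a.lat)
    return 6_371_000 * math.hypot(dx, dy)


def heading_gap_deg(a, b):
    """Smallest absolute angle between two compass headings."""
    gap = abs(a.heading_deg - b.heading_deg) % 360
    return min(gap, 360 - gap)


def candidate_links(photos, max_dist_m=30.0, max_gap_deg=20.0):
    """Return (context_id, detail_id) pairs that pass the sensor-only filter."""
    links = []
    for wide in photos:
        for narrow in photos:
            if narrow.fov_deg >= wide.fov_deg:
                continue  # link only from wider (context) to narrower (detail)
            if distance_m(wide, narrow) > max_dist_m:
                continue  # taken too far apart to show the same scene
            if heading_gap_deg(wide, narrow) > max_gap_deg:
                continue  # cameras were pointing in different directions
            links.append((wide.photo_id, narrow.photo_id))
    return links


photos = [
    Photo("stage_wide", 1.30000, 103.77000, 90.0, 65.0),
    Photo("singer_zoom", 1.30005, 103.77002, 95.0, 20.0),
    Photo("crowd_behind", 1.30000, 103.77000, 270.0, 65.0),
]
links = candidate_links(photos)
print(links)
```

In this sketch only the wide stage shot and the zoomed shot taken nearby in roughly the same direction survive the filter, so an expensive content comparison would run on one pair instead of all six ordered pairs.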
Supplemental Material
Supplemental movie, appendix, image, and software files for Automated Link Generation for Sensor-Enriched Smartphone Images are available for download.