Automated Link Generation for Sensor-Enriched Smartphone Images

Published: 21 October 2015

Abstract

The ubiquity of smartphones makes them ideal platforms for generating in-situ content. At well-attended events, photos captured by attendees offer diverse views, but occlusion and abnormal lighting can obscure the scene. Such unstructured photo collections also have significant redundancy, so a scene that is partially occluded or has poor contrast in one photo may be captured in another, possibly in greater detail. We propose an application called Autolink that automatically establishes content-based links between sensor-annotated photos in unstructured collections captured with smartphones, so that users can navigate between high-context and high-detail images. The resulting hierarchically structured image collection facilitates the design of applications for navigation and discovery and for analytics about user photography patterns, user taste, and content/event popularity. Autolink includes a framework that constructs this hierarchy efficiently and with little content-specific training data by combining photo content processing with associated sensor logs obtained from multiple participants. We evaluated Autolink on two real-world sensor-tagged photo datasets. The results show that Autolink clusters photos into the appropriate hierarchy 20 times faster than candidate algorithms, with at least 70% precision and 37% better recall than those algorithms.
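
The precision and recall figures reported above can be read as set-overlap measures between the links a system predicts and a reference set of correct links. The following is a minimal sketch of that computation, not the evaluation protocol used in the article, which the abstract does not describe. The function name `link_precision_recall`, the representation of a link as a `(context_photo_id, detail_photo_id)` pair, and the photo identifiers are all hypothetical and chosen only for illustration.

```python
def link_precision_recall(predicted, ground_truth):
    """Return (precision, recall) of predicted links against reference links.

    Each link is modeled as a hypothetical (context_photo_id, detail_photo_id)
    pair; this representation is an assumption, not taken from the article.
    """
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    # A predicted link counts as correct only if it also appears in the reference set.
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Made-up example data: three predicted links, four reference links.
predicted_links = {
    ("wide_01", "zoom_07"),
    ("wide_01", "zoom_09"),
    ("wide_02", "zoom_03"),
}
reference_links = {
    ("wide_01", "zoom_07"),
    ("wide_02", "zoom_03"),
    ("wide_02", "zoom_04"),
    ("wide_03", "zoom_05"),
}

precision, recall = link_precision_recall(predicted_links, reference_links)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy data, two of the three predicted links are in the reference set and two of the four reference links are recovered, so precision and recall measure different kinds of error: spurious links lower precision, missed links lower recall.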

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 12, Issue 1s
      Special Issue on Smartphone-Based Interactive Technologies, Systems, and Applications and Special Issue on Extended Best Papers from ACM Multimedia 2014
      October 2015
      317 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/2837676

      Copyright © 2015 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 October 2015
      • Accepted: 1 July 2015
      • Revised: 1 April 2015
      • Received: 1 January 2015
      Qualifiers

      • research-article
      • Research
      • Refereed