Semantic Photo Retargeting Under Noisy Image Labels

Published: 20 May 2016

Abstract

With the popularity of mobile devices, photo retargeting has become a useful technique for adapting a high-resolution photo to a low-resolution screen. Conventional approaches are limited in two respects. First, they de-emphasize semantic content, which is many times more important than low-level features in photo aesthetics. Second, they underplay image spatial modeling: to produce a semantically reasonable retargeted photo, the spatial distribution of objects within an image should be accurately learned. To address these two problems, we propose a new semantically aware photo retargeting method that shrinks a photo according to region semantics. The key technique is a mechanism that transfers the semantics of noisy image labels (inaccurate labels predicted by a learner such as an SVM) into different image regions. In particular, we first project the local aesthetic features (graphlets in this work) onto a semantic space, wherein image labels are selectively encoded according to their noise level. Then, a category-sharing model is proposed to robustly discover the semantics of each image region. The model is motivated by the observation that the semantic distribution of graphlets from images tagged with a common label remains stable in the presence of noisy labels. Thereafter, a spatial pyramid is constructed to hierarchically encode the spatial layout of graphlet semantics. Based on this, a probabilistic model is proposed to enforce that the spatial layout of a retargeted photo is maximally similar to the layouts of the training photos. Experimental results show that (1) noisy image labels predicted by different learners can improve retargeting performance, according to both qualitative and quantitative analysis, and (2) the category-sharing model remains stable even when 32.36% of image labels are incorrectly predicted.
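
The spatial pyramid step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each graphlet has already been reduced to a normalized centroid `(x, y)` in `[0, 1]` and a single integer semantic label, and it substitutes a plain histogram intersection for the paper's probabilistic layout model. The names `spatial_pyramid` and `layout_similarity` are hypothetical.

```python
from typing import List, Sequence, Tuple

# (x, y, label): normalized centroid plus an integer semantic label.
# This representation is an assumption made for illustration only.
Region = Tuple[float, float, int]


def spatial_pyramid(regions: Sequence[Region], num_labels: int,
                    levels: int = 3) -> List[float]:
    """Concatenate per-cell label histograms over a 1x1, 2x2, 4x4, ... grid."""
    feature: List[float] = []
    for level in range(levels):
        cells = 2 ** level  # grid is cells x cells at this level
        hist = [[0.0] * num_labels for _ in range(cells * cells)]
        for x, y, label in regions:
            # Clamp so that x == 1.0 or y == 1.0 falls in the last cell.
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hist[row * cells + col][label] += 1.0
        for cell in hist:
            feature.extend(cell)
    # Normalize by region count so each level sums to 1 (or 0 if empty).
    total = float(len(regions)) or 1.0
    return [v / total for v in feature]


def layout_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Histogram intersection, scaled to [0, 1] by the mass of `a`.

    A simple stand-in for the probabilistic model named in the abstract.
    """
    if len(a) != len(b):
        raise ValueError("pyramids must have the same length")
    mass = sum(a)
    if mass == 0.0:
        return 0.0
    return sum(min(u, v) for u, v in zip(a, b)) / mass
```

Under these assumptions, a retargeting candidate could be scored by computing its pyramid and comparing it against pyramids from training photos; coarse levels capture which semantics are present, while finer levels capture where they sit in the frame.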


  • Published in

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 12, Issue 3
    June 2016
    227 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/2901366

    Copyright © 2016 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 20 May 2016
    • Revised: 1 November 2015
    • Accepted: 1 November 2015
    • Received: 1 March 2015

    Qualifiers

    • research-article
    • Research
    • Refereed
