Abstract
With the popularity of mobile devices, photo retargeting has become a useful technique for adapting a high-resolution photo to a low-resolution screen. Conventional approaches are limited in two respects. First, they de-emphasize semantic content, which is many times more important than low-level features in photo aesthetics. Second, they neglect image spatial modeling: to produce a semantically reasonable retargeted photo, the spatial distribution of objects within an image must be learned accurately. To address these two problems, we propose a new semantically aware photo retargeting method that shrinks a photo according to region semantics. The key technique is a mechanism that transfers the semantics of noisy image labels (inaccurate labels predicted by a learner such as an SVM) into different image regions. In particular, we first project the local aesthetic features (graphlets in this work) onto a semantic space, wherein image labels are selectively encoded according to their noise level. Then, a category-sharing model is proposed to robustly discover the semantics of each image region. The model is motivated by the observation that the semantic distribution of graphlets from images tagged with a common label remains stable in the presence of noisy labels. Thereafter, a spatial pyramid is constructed to hierarchically encode the spatial layout of graphlet semantics. Based on this, a probabilistic model is proposed that enforces the spatial layout of a retargeted photo to be maximally similar to the layouts of the training photos. Experimental results show that (1) noisy image labels predicted by different learners can improve retargeting performance, according to both qualitative and quantitative analysis, and (2) the category-sharing model stays stable even when 32.36% of image labels are incorrectly predicted.
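The spatial pyramid step can be illustrated with a minimal sketch. It assumes that each pixel already carries an integer semantic label id, which simplifies the graphlet semantics described above. The function name `spatial_pyramid_histogram`, the 2^l x 2^l grid at level l, and the normalized per-cell histograms are illustrative assumptions, not the formulation used in the paper.

```python
import numpy as np


def spatial_pyramid_histogram(label_map, num_labels, levels=3):
    """Concatenate per-cell label histograms over a pyramid of grids.

    label_map: 2-D integer array of semantic label ids in [0, num_labels).
    Level l splits the image into 2**l x 2**l cells (an assumed layout).
    """
    label_map = np.asarray(label_map)
    height, width = label_map.shape
    features = []
    for level in range(levels):
        cells = 2 ** level
        # Cell boundaries; linspace keeps cells near-equal for any image size.
        row_edges = np.linspace(0, height, cells + 1).astype(int)
        col_edges = np.linspace(0, width, cells + 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                cell = label_map[row_edges[i]:row_edges[i + 1],
                                 col_edges[j]:col_edges[j + 1]]
                # Count how often each label occurs inside this cell.
                hist = np.bincount(cell.ravel(), minlength=num_labels).astype(float)
                total = hist.sum()
                if total > 0:
                    hist /= total  # normalize so cells of any size are comparable
                features.append(hist)
    return np.concatenate(features)


# Toy layout: label 0 fills the left half, label 1 the right half.
toy_map = np.array([[0, 0, 1, 1]] * 4)
toy_feature = spatial_pyramid_histogram(toy_map, num_labels=2, levels=2)
```

The coarse level records which labels are present overall, while finer levels record where they occur, so two photos with the same objects in different positions receive different descriptors.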