Abstract
Many people are interested in taking astonishing photos and sharing them with others. Advances in hardware and software have made digital photography both ubiquitous and more capable. Because composition matters in photography, researchers have leveraged common composition techniques, such as the rule of thirds and perspective-related techniques, to provide photo-taking assistance. However, the composition techniques developed by professionals are far more diverse than the well-documented techniques can cover. We present a new approach that leverages underexplored photography ideas, which are virtually unlimited, diverse, and correlated. We propose a comprehensive fork-join framework, named CAPTAIN (
Index Terms
CAPTAIN: Comprehensive Composition Assistance for Photo Taking