Abstract
Baselines are the starting point of any quantitative multimedia research, and benchmarks are essential for pushing those baselines further. In this article, we present baselines for the artistic domain together with OmniArt, a new benchmark dataset of over 2 million images with rich structured metadata. OmniArt contains annotations for dozens of attribute types and provides semantic context through concepts, IconClass labels, color information, and (limited) object-level bounding boxes. On this dataset we establish baseline scores for multiple tasks, including artist attribution, creation period estimation, and type, style, and school prediction. In addition to these metadata-related experiments, we explore the color spaces of art across different artwork types and evaluate a transfer learning pipeline for object recognition.
OmniArt: A Large-scale Artistic Benchmark