Building flexible, accurate multimedia analysis and retrieval systems requires massive amounts of annotated data as ground-truth input. The Multimedia COMMONS workshop will lay the groundwork for developing a research community around the Multimedia Commons Project (MMCP), an initiative initially focused on annotation of---and research using---the 99.2 million images and nearly 800,000 videos in the Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset released in 2014. This freely available dataset is already being used as the basis for several derived corpora and related tools that form the Multimedia Commons, including the YLI Corpus, a public-domain corpus of computed audio, visual, and motion features; YLIMED, a set of public-domain annotations specialized for multimedia event detection; and the YFCC100M Browser, a tag-based query tool for visualizing the distribution of YFCC100M data.
We invite all current and potential users of the YFCC100M and associated Multimedia Commons resources (read: everybody!) to attend MMCommons. Researchers currently using the MMCP datasets---or comparing other large datasets---will present their work in paper sessions and demo systems built for the ACM MM Grand Challenge. The papers presented will address the challenges of leveraging massive amounts of multimedia data from a variety of angles, from evaluating the use of tags for training to extracting social meaning from related images.
Much of the workshop will focus on maximizing the long-term value of these new resources to the multimedia research community. In plenary sessions and panel and discussion formats, workshop attendees are invited to actively participate in shaping the future of the Multimedia Commons Project. Topics will include how to leverage the dataset to stimulate new research directions through challenges and benchmarks, and how to extend its impact through applications in other fields (e.g., anthropology or medicine). These explorations will be grounded in a discussion of how we get there: what the needs and priorities of multimedia researchers are with regard to annotation and feature data, and how we can build a collaborative structure for developing and distributing that data.
In laying the groundwork for focused collaborations around an open, shared dataset, we hope to inspire participation from a diverse set of multimedia researchers working on a wide range of tasks. We particularly hope to encourage research that ties together multiple research threads or approaches questions from multiple perspectives.
The publication of the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M), to date the largest open-access collection of photos and videos, has provided a unique opportunity to stimulate new research in multimedia analysis and retrieval. To ...
The public availability of large-scale multimedia collections, such as YFCC, facilitates the evaluation of image retrieval approaches in real-life conditions. However, due to their size, the creation of exhaustive ground truth would require huge ...
This paper proposes direct learning of image classification from image tags in the wild, without filtering. Each wild tag is supplied by the user who shared the image online. Enormous numbers of these tags are freely available, and they give insight ...
Multimedia Event Detection (MED) aims to identify events (also called scenes) in videos, such as a flash mob or a wedding ceremony. Audio content information complements cues such as visual content and text. In this paper, we explore the optimization of ...
With the Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset, a novel dataset was introduced to the computer vision and multimedia research community. To maximize the benefit for the research community and utilize its potential, this dataset ...
We explore what names people use to describe visual concepts and why these names are chosen. Choosing object names has been a topic of interest in cognitive psychology, but a systematic, data-driven approach for naming at the scale of thousands of ...
In this paper, we analyze the association between a social media user's photo content and their interests. Visual content of photos is analyzed using state-of-the-art deep learning based automatic concept recognition. We compute an aggregate visual ...