DOI: 10.1145/3290605.3300522
CHI Conference Proceedings · Research article · Public Access

Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists

Published: 02 May 2019

Abstract

Annotating rich audio data is an essential aspect of training and evaluating machine listening systems. We approach this task in the context of temporally-complex urban soundscapes, which require multiple labels to identify overlapping sound sources. Typically this work is crowdsourced, and previous studies have shown that workers can quickly label audio with binary annotation for single classes. However, this approach can be difficult to scale when multiple passes with different focus classes are required to annotate data with multiple labels. In citizen science, where tasks are often image-based, annotation efforts typically label multiple classes simultaneously in a single pass. This paper describes our data collection on the Zooniverse citizen science platform, comparing the efficiencies of different audio annotation strategies. We compared multiple-pass binary annotation, single-pass multi-label annotation, and a hybrid approach: hierarchical multi-pass multi-label annotation. We discuss our findings, which support using multi-label annotation, with reference to volunteer citizen scientists' motivations.
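
The three strategies compared in the abstract differ mainly in how many annotation passes each clip requires. A minimal sketch of that trade-off follows; the class names, the coarse-to-fine grouping, and the pass-counting model are hypothetical illustrations, not the paper's actual taxonomy or interface.

```python
# Hypothetical urban sound classes and a coarse-to-fine grouping,
# used only to illustrate the pass-count trade-off.
FINE_CLASSES = ["jackhammer", "drill", "car horn", "siren", "dog bark", "voices"]
COARSE_GROUPS = {
    "machinery": ["jackhammer", "drill"],
    "alerts": ["car horn", "siren"],
    "other": ["dog bark", "voices"],
}

def passes_binary_multipass(n_classes: int) -> int:
    # Multiple-pass binary annotation: one yes/no pass per focus class.
    return n_classes

def passes_single_multilabel() -> int:
    # Single-pass multi-label annotation: all classes labeled at once.
    return 1

def passes_hierarchical(present_groups: set) -> int:
    # Hierarchical multi-pass multi-label: one coarse pass, then one
    # fine-grained multi-label pass per group flagged as present.
    return 1 + len(present_groups)

print(passes_binary_multipass(len(FINE_CLASSES)))    # 6
print(passes_single_multilabel())                    # 1
print(passes_hierarchical({"machinery", "alerts"}))  # 3
```

Under this toy model, single-pass multi-label annotation is the cheapest per clip, while the hierarchical approach falls between the two extremes depending on how many coarse groups are actually present in the recording.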

Supplementary Material

MP4 File (paper292.mp4)




Published In

CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
May 2019, 9077 pages
ISBN: 9781450359702
DOI: 10.1145/3290605


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. audio annotation
  2. citizen science
  3. crowdsourcing



Acceptance Rates

CHI '19 paper acceptance rate: 703 of 2,958 submissions (24%)
Overall acceptance rate: 6,199 of 26,314 submissions (24%)


Article Metrics

  • Downloads (Last 12 months)351
  • Downloads (Last 6 weeks)49
Reflects downloads up to 12 Nov 2024


Cited By

  • (2024) Nature Networks: Designing for nature data collection and sharing from local to global. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1439–1452. DOI: 10.1145/3643834.3661520. Online publication date: 1-Jul-2024
  • (2024) Closing the Knowledge Gap in Designing Data Annotation Interfaces for AI-powered Disaster Management Analytic Systems. Proceedings of the 29th International Conference on Intelligent User Interfaces, 405–418. DOI: 10.1145/3640543.3645214. Online publication date: 18-Mar-2024
  • (2024) Citizen science in European research infrastructures. The European Physical Journal Plus 139:5. DOI: 10.1140/epjp/s13360-024-05223-x. Online publication date: 17-May-2024
  • (2024) MindReaD: Enhancing Pedestrian-Vehicle Interaction with Micro-Level Reasoning Data Annotation. International Journal of Human–Computer Interaction, 1–16. DOI: 10.1080/10447318.2024.2406053. Online publication date: 4-Oct-2024
  • (2023) NoisenseDB: An Urban Sound Event Database to Develop Neural Classification Systems for Noise-Monitoring Applications. Applied Sciences 13:16 (9358). DOI: 10.3390/app13169358. Online publication date: 17-Aug-2023
  • (2023) Evaluating Descriptive Quality of AI-Generated Audio Using Image-Schemas. Proceedings of the 28th International Conference on Intelligent User Interfaces, 621–632. DOI: 10.1145/3581641.3584083. Online publication date: 27-Mar-2023
  • (2023) Unlocking the Tacit Knowledge of Data Work in Machine Learning. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1–7. DOI: 10.1145/3544549.3585616. Online publication date: 19-Apr-2023
  • (2023) Interface Design for Crowdsourcing Hierarchical Multi-Label Text Annotations. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1–17. DOI: 10.1145/3544548.3581431. Online publication date: 19-Apr-2023
  • (2023) Strong Labeling of Sound Events Using Crowdsourced Weak Labels and Annotator Competence Estimation. IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 902–914. DOI: 10.1109/TASLP.2022.3233468. Online publication date: 2023
  • (2023) Better quantifying inter-annotator variability: A step towards citizen science in underwater passive acoustics. OCEANS 2023 - Limerick, 1–8. DOI: 10.1109/OCEANSLimerick52467.2023.10244502. Online publication date: 5-Jun-2023
