DOI: 10.1145/3580305.3599825
research-article
Open access

FedMultimodal: A Benchmark for Multimodal Federated Learning

Published: 04 August 2023

Abstract

Over the past few years, Federated Learning (FL) has emerged as a machine learning technique that tackles data privacy challenges through collaborative training. In an FL algorithm, clients submit locally trained models, and the server aggregates their parameters until convergence. Despite significant efforts to apply FL in fields such as computer vision, audio, and natural language processing, FL applications that use multimodal data streams remain largely unexplored. Multimodal learning has broad real-world applications in emotion recognition, healthcare, multimedia, and social media, where user privacy persists as a critical concern. Specifically, there are no existing FL benchmarks targeting multimodal applications or related tasks. To facilitate research in multimodal FL, we introduce FedMultimodal, the first FL benchmark for multimodal learning, covering five representative multimodal applications from ten commonly used datasets with a total of eight unique modalities. FedMultimodal offers a systematic FL pipeline, enabling an end-to-end modeling framework that ranges from data partitioning and feature extraction to FL benchmark algorithms and model evaluation. Unlike existing FL benchmarks, FedMultimodal provides a standardized approach to assessing the robustness of FL against three common data corruptions in real-life multimodal applications: missing modalities, missing labels, and erroneous labels. We hope that FedMultimodal can accelerate numerous future research directions, including designing multimodal FL algorithms for extreme data heterogeneity, robust multimodal FL, and efficient multimodal FL. The datasets and benchmark results can be accessed at: https://github.com/usc-sail/fed-multimodal.
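
The aggregation step described in the abstract (clients submit locally trained models, the server combines their parameters) can be illustrated with a minimal sketch. It assumes a FedAvg-style weighted average, which is one common choice and not necessarily the only rule the benchmark supports. The function name `fedavg` and the dictionary-of-lists parameter layout are hypothetical and are not taken from the FedMultimodal code base.

```python
from typing import Dict, List, Tuple

# Hypothetical parameter layout: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def fedavg(client_updates: List[Tuple[Params, int]]) -> Params:
    """Average client parameters, weighting each client by its sample count.

    Illustrative FedAvg-style rule; assumes every client reports the same
    layer names and shapes.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total = sum(num_samples for _, num_samples in client_updates)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    # Start from zeros with the same structure as the first client's model.
    first_params = client_updates[0][0]
    aggregated: Params = {
        name: [0.0] * len(values) for name, values in first_params.items()
    }

    # Accumulate each client's contribution, scaled by its data share.
    for params, num_samples in client_updates:
        weight = num_samples / total
        for name, values in params.items():
            for i, value in enumerate(values):
                aggregated[name][i] += weight * value
    return aggregated


# One server round: a client with 1 sample and a client with 3 samples.
global_params = fedavg([
    ({"w": [1.0, 2.0]}, 1),
    ({"w": [3.0, 4.0]}, 3),
])
```

In a full training loop the server would broadcast `global_params` back to the clients and repeat the round until convergence.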

Supplementary Material

MP4 File (adfp472-2min-promo.mp4)
Federated Learning is a popular privacy-enhancing learning algorithm that enables machine learning in distributed systems without sharing user data. In this work, we present FedMultimodal, one of the first works benchmarking multimodal federated learning. FedMultimodal provides an end-to-end learning recipe, from data partitioning, feature extraction, and federated learning algorithms to multimodal learning, covering a wide range of applications. Specifically, FedMultimodal makes it possible to compare unimodal and multimodal federated learning and supports comparison across many FL algorithms. Furthermore, FedMultimodal provides tools to study multimodal federated learning with low-quality data, including missing modalities, missing labels, and erroneous labels. We believe that FedMultimodal also opens many opportunities for future multimodal federated learning studies to include broader multidisciplinary applications, novel algorithms, and connections to foundation models.
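
The three kinds of low-quality data named above (missing modalities, missing labels, erroneous labels) can be simulated on a client's local data roughly as follows. This is a sketch under stated assumptions, not the benchmark's actual procedure: the function name `corrupt_samples`, the per-sample dictionary layout, the independent per-sample rates, and the uniform choice of a wrong label are all hypothetical.

```python
import random
from typing import Dict, List

# Hypothetical sample layout: modality name -> features, plus a "label" key.
Sample = Dict[str, object]


def corrupt_samples(
    samples: List[Sample],
    modalities: List[str],
    num_classes: int,
    missing_modality_rate: float = 0.0,
    missing_label_rate: float = 0.0,
    label_error_rate: float = 0.0,
    seed: int = 0,
) -> List[Sample]:
    """Return corrupted copies of the samples; the inputs are left unchanged.

    Assumes integer labels in range(num_classes) with num_classes >= 2.
    """
    rng = random.Random(seed)
    corrupted = []
    for sample in samples:
        item = dict(sample)  # shallow copy so the original sample is preserved
        # Missing modality: drop each modality independently.
        for modality in modalities:
            if rng.random() < missing_modality_rate:
                item[modality] = None
        # Missing label: the sample becomes unlabeled.
        if rng.random() < missing_label_rate:
            item["label"] = None
        # Erroneous label: replace the label with a different class.
        elif rng.random() < label_error_rate:
            wrong = [c for c in range(num_classes) if c != item["label"]]
            item["label"] = rng.choice(wrong)
        corrupted.append(item)
    return corrupted


# Example: an audio-video client with 30% missing modalities and 20% label errors.
client_data = [{"audio": [0.1], "video": [0.2], "label": i % 3} for i in range(20)]
noisy_data = corrupt_samples(
    client_data, ["audio", "video"], num_classes=3,
    missing_modality_rate=0.3, label_error_rate=0.2,
)
```

Fixing the seed keeps the corruption pattern reproducible, so different FL algorithms can be compared on the same degraded data.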

    Published In

    KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2023
    5996 pages
    ISBN:9798400701030
    DOI:10.1145/3580305
    This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. federated learning
    2. multimodal benchmark
    3. multimodal learning

    Qualifiers

    • Research-article

    Funding Sources

    • Meta
    • Intel
    • Konica Minolta
    • USC-Amazon Center

    Conference

    KDD '23

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2024) Artificial Intelligence of Things: A Survey. ACM Transactions on Sensor Networks. DOI: 10.1145/3690639. Online publication date: 30-Aug-2024
    • (2024) FedSea: Federated Learning via Selective Feature Alignment for Non-IID Multimodal Data. IEEE Transactions on Multimedia, Vol. 26, 5807-5822. DOI: 10.1109/TMM.2023.3340109. Online publication date: 2024
    • (2024) FedMFS: Federated Multimodal Fusion Learning with Selective Modality Communication. ICC 2024 - IEEE International Conference on Communications, 287-292. DOI: 10.1109/ICC51166.2024.10622194. Online publication date: 9-Jun-2024
    • (2024) Decentralized and Distributed Learning for AIoT: A Comprehensive Review, Emerging Challenges, and Opportunities. IEEE Access, Vol. 12, 101016-101052. DOI: 10.1109/ACCESS.2024.3422211. Online publication date: 2024
    • (2024) A survey of multimodal federated learning: background, applications, and perspectives. Multimedia Systems, Vol. 30, 4. DOI: 10.1007/s00530-024-01422-9. Online publication date: 29-Jul-2024
    • (2024) Personalized Multimodal Federated Learning for Fingerprint and Finger Vein Recognition. Advanced Intelligent Computing Technology and Applications, 365-376. DOI: 10.1007/978-981-97-5594-3_31. Online publication date: 5-Aug-2024
    • (2023) Multimodal Federated Learning: A Survey. Sensors, Vol. 23, 15, 6986. DOI: 10.3390/s23156986. Online publication date: 6-Aug-2023
    • (2023) PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models. 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII), 1-8. DOI: 10.1109/ACII59096.2023.10388152. Online publication date: 10-Sep-2023
    • (2023) Federated Learning in Computer Vision. IEEE Access, Vol. 11, 94863-94884. DOI: 10.1109/ACCESS.2023.3310400. Online publication date: 2023
