research-article
Open Access

Federated Bandit: A Gossiping Approach

Published: 22 February 2021

Abstract

In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of $N$ agents, who can communicate their local data only with neighbors, as described by a connected graph $G$. Each agent makes a sequence of decisions, selecting an arm from $M$ candidates, yet has access only to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions, while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution in which agents never share their local observations with a central entity and are allowed to share only a private copy of their own information with their neighbors. We first propose a decentralized bandit algorithm, Gossip_UCB, which couples variants of the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that Gossip_UCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves a guaranteed regret of order $O(\max\{\mathrm{poly}(N,M)\log T,\ \mathrm{poly}(N,M)\log_{\lambda_2^{-1}} N\})$ for all $N$ agents, where $\lambda_2 \in (0,1)$ is the second largest eigenvalue of the expected gossip matrix, which is a function of $G$. We then propose Fed_UCB, a differentially private version of Gossip_UCB, in which the agents preserve $\epsilon$-differential privacy of their local data while achieving $O(\max\{\frac{\mathrm{poly}(N,M)}{\epsilon}\log^{2.5} T,\ \mathrm{poly}(N,M)(\log_{\lambda_2^{-1}} N + \log T)\})$ regret.
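
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Gossip_UCB or Fed_UCB: it only shows standard randomized pairwise gossip averaging and the classical UCB1 arm choice, side by side. The 4-agent ring graph, the per-agent biased arm means, and the function names gossip_step, ucb_arm, and demo are all assumptions made for illustration.

```python
import math
import random


def gossip_step(values, edges, rng):
    """Randomized pairwise gossip: pick one edge and average its two endpoints."""
    i, j = rng.choice(edges)
    avg = [(a + b) / 2.0 for a, b in zip(values[i], values[j])]
    values[i] = list(avg)
    values[j] = list(avg)


def ucb_arm(estimates, counts, t):
    """Classical UCB1 choice: pull any untried arm first, else maximize the index."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(estimates)),
        key=lambda a: estimates[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def demo(steps=2000, seed=0):
    """Run gossip on hypothetical local estimates and return the agents' final values."""
    rng = random.Random(seed)
    # Assumed topology: a connected ring of 4 agents.
    edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    # Assumed biased local mean rewards, one row per agent, one column per arm.
    # Agent 0 alone would prefer arm 0, but the network-wide averages favor arm 1.
    local = [[0.9, 0.1], [0.2, 0.6], [0.1, 0.7], [0.4, 0.8]]
    values = [row[:] for row in local]
    for _ in range(steps):
        gossip_step(values, edges, rng)
    return values


if __name__ == "__main__":
    for agent, est in enumerate(demo()):
        print(agent, [round(x, 6) for x in est])
```

Pairwise averaging preserves the network-wide sum, so on a connected graph every agent's estimate approaches the global average without any central entity seeing the raw local values. In this toy setting, an agent that fed the gossiped estimates into ucb_arm would favor arm 1 once both arms have equal pull counts, whereas agent 0's purely local estimates favor arm 0.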
