
Multiagent Reinforcement Social Learning toward Coordination in Cooperative Multiagent Systems

Published: 19 December 2014

Abstract

Most previous work on coordination in cooperative multiagent systems studies how two (or more) players can coordinate on a Pareto-optimal Nash equilibrium through fixed, repeated interactions in the context of cooperative games. In practical complex environments, however, interactions between agents can be sparse, and each agent's interaction partners may change frequently and randomly. To this end, we investigate multiagent coordination problems in cooperative environments under a social learning framework. We consider a large population of agents in which, in each round, each agent interacts with another agent chosen randomly from the population. Each agent thus learns its policy through repeated interactions with the rest of the population via social learning. It is not clear a priori whether all agents can learn a consistent optimal coordination policy in such a setting. We distinguish two types of learners according to the amount of information each agent can perceive: the individual action learner and the joint action learner. We evaluate the learning performance of both types of learners on a number of challenging deterministic and stochastic cooperative games, and we also investigate the influence of the degree of information sharing on learning performance, a key difference from learning frameworks involving repeated interactions among fixed agents.
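The social learning framework described above can be illustrated with a minimal sketch: a population of independent (individual-action) Q-learners is paired uniformly at random each round to play a two-player common-reward matrix game. The payoff matrix, learning rates, and epsilon-greedy exploration below are illustrative assumptions, not the exact games or parameters used in the article.

```python
import random

# Toy common-reward coordination game (an assumption for illustration):
# both agents receive the same payoff; coordinating on action 0 is
# Pareto-optimal, coordinating on action 1 is a suboptimal equilibrium.
PAYOFF = [[10, 0],
          [0, 5]]


class IndividualActionLearner:
    """Stateless Q-learner that perceives only its own action and reward."""

    def __init__(self, n_actions=2, alpha=0.1, epsilon=0.1):
        self.q = [0.0] * n_actions
        self.alpha = alpha      # learning rate
        self.epsilon = epsilon  # exploration probability

    def act(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        # Standard Q-update for a single-state (stateless) game.
        self.q[action] += self.alpha * (reward - self.q[action])


def run_social_learning(n_agents=20, n_rounds=5000, seed=0):
    random.seed(seed)
    agents = [IndividualActionLearner() for _ in range(n_agents)]
    for _ in range(n_rounds):
        # Each round, partners are drawn randomly from the whole
        # population, so an agent's opponent changes across rounds.
        order = random.sample(range(n_agents), n_agents)
        for i in range(0, n_agents - 1, 2):
            a, b = agents[order[i]], agents[order[i + 1]]
            act_a, act_b = a.act(), b.act()
            reward = PAYOFF[act_a][act_b]  # common reward: cooperative game
            a.update(act_a, reward)
            b.update(act_b, reward)
    return agents


agents = run_social_learning()
# Greedy policy of each agent after learning.
policies = [max(range(2), key=lambda k: ag.q[k]) for ag in agents]
print(policies)
```

A joint action learner would differ only in its update: it would also observe the partner's action and maintain Q-values over joint actions, which is the information-sharing distinction the abstract evaluates.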

