Abstract
This work presents a decentralized multi-agent navigation approach in which agents coordinate their motion through local communication. Agents develop their own emergent communication language through an optimization process that simultaneously determines what agents say in response to their spatial observations and how they interpret communication from others to update their motion. We combine our communication approach with the TTC-Forces crowd simulation algorithm (a recent, high-performing, anticipatory collision-avoidance technique) and show a significant decrease in congestion and bottlenecking of agents, especially in scenarios where agents benefit from close coordination. In addition to reaching their goals faster, agents using our approach show coordinated behaviors including greeting, flocking, following, and grouping. Furthermore, we observe that communication strategies optimized for one scenario often continue to provide time-efficient, coordinated motion when applied to different scenarios. This suggests that the agents are learning coordination strategies that generalize across scenarios through their communication "language".
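The "say" and "interpret" halves of such a communication scheme can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear maps `w_say` and `w_hear`, the use of the offset to the goal as the spatial observation, the communication `radius`, and the plain goal-seeking velocity that stands in for TTC-Forces. In an approach like the one described, the entries of `w_say` and `w_hear` would be the quantities tuned jointly by the optimization process, which is not shown here.

```python
import math

def say(observation, w_say):
    """Map a spatial observation to a message with a linear map (hypothetical encoder)."""
    return [sum(w * o for w, o in zip(row, observation)) for row in w_say]

def hear(messages, w_hear):
    """Average the received messages and map the mean to a 2D velocity adjustment."""
    if not messages:
        return (0.0, 0.0)
    n = len(messages)
    mean = [sum(m[k] for m in messages) / n for k in range(len(messages[0]))]
    dx = sum(w * m for w, m in zip(w_hear[0], mean))
    dy = sum(w * m for w, m in zip(w_hear[1], mean))
    return (dx, dy)

def step(positions, goals, w_say, w_hear, radius=2.0, speed=1.0, dt=0.1):
    """Advance all agents by one time step using only local communication."""
    # 1. Every agent broadcasts a message computed from its own observation.
    #    Assumption: the observation is simply the offset from agent to goal.
    messages = [say((g[0] - p[0], g[1] - p[1]), w_say)
                for p, g in zip(positions, goals)]
    new_positions = []
    for i, (p, g) in enumerate(zip(positions, goals)):
        # 2. An agent only hears neighbors within the communication radius.
        inbox = [messages[j] for j, q in enumerate(positions)
                 if j != i and math.dist(p, q) <= radius]
        # 3. Goal-directed velocity (a stand-in for the underlying navigation method).
        gx, gy = g[0] - p[0], g[1] - p[1]
        d = math.hypot(gx, gy)
        vx, vy = (speed * gx / d, speed * gy / d) if d > 1e-9 else (0.0, 0.0)
        # 4. The interpreted messages adjust that velocity.
        ax, ay = hear(inbox, w_hear)
        new_positions.append((p[0] + (vx + ax) * dt, p[1] + (vy + ay) * dt))
    return new_positions
```

With `w_hear` set to all zeros the agents ignore each other and walk straight to their goals, which gives a baseline against which an optimizer could score candidate `w_say` and `w_hear` values, for example by total time to reach all goals.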
Supplemental Material
Supplemental movie, appendix, image, and software files for "Coordinating Multi-Agent Navigation by Learning Communication" are available for download.