
Using Imitation to Build Collaborative Agents

Published: 20 April 2016

Abstract

This article presents an approach for learning collaborative strategies among multiple agents via imitation. Imitation-based learning involves learning from an expert by observing a demonstration of a task and then replicating it; this mechanism makes it convenient for a knowledge engineer to transfer knowledge to a software agent. The article applies imitation to learn not only the strategy of an individual agent but also the collaborative strategy of a team of agents pursuing a common goal. The presented solution learns a weighted naïve Bayes model whose weights are optimized using Artificial Immune Systems, and the learned model is then used by the agents to act autonomously. The applicability of the approach is assessed in the RoboCup Soccer 3D Simulation environment, a promising platform for addressing many complex real-world problems. The performance of the trained agents is benchmarked against other RoboCup Soccer 3D Simulation teams, and the behavioral traits of the imitating team are analyzed to assess how closely it imitates the demonstrating team.
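The abstract does not spell out the model or optimizer details, but the overall pipeline it describes — a weighted naïve Bayes classifier whose per-attribute weights are tuned by an immune-inspired (CLONALG-style clonal selection) search — can be sketched roughly as below. Everything here is an illustrative assumption, not the paper's implementation: the function names, the toy discrete action-selection data, the weight range, and the mutation schedule are all invented for exposition.

```python
import math
import random
from collections import Counter

def train_nb(X, y, alpha=1.0):
    """Estimate log class priors and Laplace-smoothed log conditional
    probabilities P(x_i | c) for discrete attributes."""
    classes = sorted(set(y))
    priors = {c: math.log(y.count(c) / len(y)) for c in classes}
    cond = {}  # (class, attribute index) -> {value: log probability}
    for c in classes:
        rows = [x for x, t in zip(X, y) if t == c]
        for i in range(len(X[0])):
            values = {r[i] for r in X}
            counts = Counter(r[i] for r in rows)
            total = len(rows) + alpha * len(values)
            cond[(c, i)] = {v: math.log((counts[v] + alpha) / total)
                            for v in values}
    return classes, priors, cond

def predict(x, classes, priors, cond, weights):
    """Weighted naive Bayes: argmax_c log P(c) + sum_i w_i * log P(x_i | c)."""
    def score(c):
        return priors[c] + sum(w * cond[(c, i)].get(v, math.log(1e-9))
                               for i, (v, w) in enumerate(zip(x, weights)))
    return max(classes, key=score)

def accuracy(weights, X, y, model):
    return sum(predict(x, *model, weights) == t for x, t in zip(X, y)) / len(y)

def clonal_selection(X, y, model, pop=8, clones=4, gens=30, seed=0):
    """CLONALG-style weight search: clone the fittest antibodies, mutate
    the clones (more strongly when fitness is low), keep the best."""
    rng = random.Random(seed)
    d = len(X[0])
    antibodies = [[rng.uniform(0.0, 2.0) for _ in range(d)] for _ in range(pop)]
    for _ in range(gens):
        ranked = sorted(antibodies, key=lambda w: -accuracy(w, X, y, model))
        new = ranked[:pop // 2]  # elitist survivors
        for w in ranked[:pop // 2]:
            fit = accuracy(w, X, y, model)
            for _ in range(clones):
                new.append([max(0.0, wi + rng.gauss(0, 1.05 - fit))
                            for wi in w])
        antibodies = sorted(new, key=lambda w: -accuracy(w, X, y, model))[:pop]
    return antibodies[0]
```

On a toy demonstration log (e.g. attributes like ball distance and teammate availability, labels like "pass"/"dribble"), `clonal_selection` searches for attribute weights that maximize agreement with the demonstrator's choices; the trained `predict` then drives the imitating agent. The actual article's attribute set, fitness function, and immune operators may differ substantially from this sketch.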

References

  1. P. Abbeel and A. Y. Ng. 2004. Apprenticeship learning via inverse reinforcement learning. In Proceedings of the 21st International Conference on Machine Learning. ACM Press.
  2. H. Akiyama and I. Noda. 2008. Multi-agent positioning mechanism in the dynamic environment. In RoboCup 2007: Robot Soccer World Cup XI, U. Visser, F. Ribeiro, T. Ohashi, and F. Dellaert (Eds.). Springer, Berlin, 377--384.
  3. R. Aler, O. Garcia, and J. M. Valls. 2005. Correcting and improving imitation models of humans for Robosoccer agents. In Proceedings of the 2005 IEEE Congress on Evolutionary Computation, Vol. 3. IEEE, 2402--2409. DOI:http://doi.org/10.1109/CEC.2005.1554994
  4. R. Aler, J. M. Valls, D. Camacho, and A. Lopez. 2009. Programming robosoccer agents by modeling human behavior. Expert Systems with Applications 36, 2, 1850--1859. DOI:http://doi.org/10.1016/j.eswa.2007.12.033
  5. A. Alissandrakis, C. L. Nehaniv, K. Dautenhahn, and J. Saunders. 2006. Evaluation of robot imitation attempts: Comparison of the system's and the human's perspectives. In Proceedings of the 1st ACM/IEEE International Conference on Human-Robot Interaction (HRI'06).
  6. F. Almeida, N. Lau, and L. Reis. 2010. A survey on coordination methodologies for simulated robotic soccer teams. In Proceedings of the RoboCup Symposium.
  7. B. D. Argall, S. Chernova, M. Veloso, and B. Browning. 2009. A survey of robot learning from demonstration. Robotics and Autonomous Systems 57, 5, 469--483. DOI:http://doi.org/10.1016/j.robot.2008.10.024
  8. R. S. Sutton and A. G. Barto. 1998. Reinforcement Learning: An Introduction. MIT Press. Retrieved from http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid==7548.
  9. L. Bull. 1998. Evolutionary computing in multi-agent environments: Operators. In Evolutionary Programming VII, V. W. Porto, N. Saravanan, D. Waagen, and A. E. Eiben (Eds.). Springer, Berlin, 43--52.
  10. S. Calinon and A. Billard. 2007. Incremental learning of gestures by imitation in a humanoid robot. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction. ACM, New York, NY, 255--262. DOI:http://doi.org/10.1145/1228716.1228751
  11. S. Calinon, F. D'halluin, E. L. Sauser, D. G. Caldwell, and A. G. Billard. 2010. Learning and reproduction of gestures by imitation. IEEE Robotics & Automation Magazine 17, 2, 44--54. DOI:http://doi.org/10.1109/MRA.2010.936947
  12. S. Chernova and M. Veloso. 2007. Confidence-based policy learning from demonstration using Gaussian mixture models. In Proceedings of the Joint Conference on Autonomous Agents and Multi-Agent Systems.
  13. S. Chernova and M. Veloso. 2008. Teaching collaborative multi-robot tasks through demonstration. In Proceedings of the 8th IEEE-RAS International Conference on Humanoid Robots. 385--390. DOI:http://doi.org/10.1109/ICHR.2008.4755982
  14. S. Chernova and M. Veloso. 2010. Confidence-based multi-robot learning from demonstration. International Journal of Social Robotics 2, 2, 195--215. DOI:http://doi.org/10.1007/s12369-010-0060-0
  15. C. Claus and C. Boutilier. 1998. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the 15th National/10th Conference on Artificial Intelligence/Innovative Applications of Artificial Intelligence. American Association for Artificial Intelligence, Menlo Park, CA, 746--752.
  16. V. Cutello, G. Narzisi, G. Nicosia, and M. Pavone. 2005. Clonal selection algorithms: A comparative case study using effective mutation potentials, optIA versus CLONALG. In Proceedings of the 4th International Conference on Artificial Immune Systems (ICARIS'05). Banff, Canada.
  17. C. Sammut, S. Hurst, D. Kedzier, and D. Michie. 1992. Learning to fly. In Proceedings of the 9th International Workshop on Machine Learning.
  18. H. T. Dashti, N. Aghaeepour, S. Asadi, M. Bastani, Z. Delafkar, F. M. Disfani, and A. F. Siahpirani. 2006. Dynamic positioning based on voronoi cells (DPVC). In RoboCup 2005: Robot Soccer World Cup IX, A. Bredenfeld, A. Jacoff, I. Noda, and Y. Takahashi (Eds.). Springer, Berlin, 219--229.
  19. M. B. Dias, R. Zlot, N. Kalra, and A. Stentz. 2006. Market-based multirobot coordination: A survey and analysis. Proceedings of the IEEE 94, 7, 1257--1270. DOI:http://doi.org/10.1109/JPROC.2006.876939
  20. F. Duvallet and A. Stentz. 2010. Imitation learning for task allocation. In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'10). 3568--3573. DOI:http://doi.org/10.1109/IROS.2010.5650006
  21. A. P. Engelbrecht. 2003. Computational Intelligence: An Introduction (1st ed.). Wiley.
  22. M. D. Erbas, A. F. Winfield, and L. Bull. 2011. Towards imitation-enhanced reinforcement learning in multi-agent systems. In Proceedings of the 2011 IEEE Symposium on Artificial Life (ALIFE). IEEE, 6--13. DOI:http://doi.org/10.1109/ALIFE.2011.5954652
  23. F. Fernández, D. Borrajo, and L. E. Parker. 2005. A reinforcement learning algorithm in cooperative multi-robot domains. Journal of Intelligent and Robotic Systems 43, 2--4, 161--174. DOI:http://doi.org/10.1007/s10846-005-5137-x
  24. M. W. Floyd, B. Esfandiari, and K. Lam. 2008. A case-based reasoning approach to imitating RoboCup players. In Proceedings of the 21st International FLAIRS Conference.
  25. M. Hall. 2007. A decision tree-based attribute weighting filter for naive Bayes. Knowledge-Based Systems 20, 2 (March 2007), 120--126. DOI:http://doi.org/10.1016/j.knosys.2006.11.008
  26. L. Jiang, H. Zhang, Z. Cai, and D. Wang. 2011. Weighted average of one-dependence estimators. Journal of Experimental and Theoretical Artificial Intelligence 24, 2, 219--230. DOI:http://doi.org/10.1080/0952813X.2011.639092
  27. J. Wu and Z. Cai. 2011. Attribute weighting via differential evolution algorithm for attribute weighted naive Bayes (WNB). Journal of Computational Information Systems 7, 1672--1679.
  28. S. Kapetanakis and D. Kudenko. 2002. Reinforcement learning of coordination in cooperative multi-agent systems. In Proceedings of the 18th National Conference on Artificial Intelligence. American Association for Artificial Intelligence, Menlo Park, CA, 326--331.
  29. I. V. Karpov, V. K. Valsalam, and R. Miikkulainen. 2011. Human-assisted neuroevolution through shaping, advice, and examples. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation. ACM, New York, NY, 371--378. DOI:http://doi.org/10.1145/2001576.2001628
  30. A. Konar. 2005. Computational Intelligence: Principles, Techniques and Applications. Springer.
  31. H. Köse, K. Kaplan, U. Tatlidede, C. Mericli, and H. L. Akin. 2005a. Market-driven multi-agent collaboration in robot soccer domain. In Cutting Edge Robotics, V. Kordic, A. Lazinica, and M. Merdan (Eds.). Pro Literatur Verlag.
  32. H. Köse, U. Tatlidede, Ç. Meriçli, K. Kaplan, and H. L. Akin. 2004. Q-learning based market-driven multi-agent collaboration in robot soccer. In Proceedings of the Turkish Symposium on Artificial Intelligence and Neural Networks. 219--228.
  33. M. Lauer and M. Riedmiller. 2000. An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In Proceedings of the 17th International Conference on Machine Learning. Morgan Kaufmann, 535--542.
  34. L. N. de Castro and J. Timmis. 2002. Artificial Immune Systems: A New Computational Intelligence Approach. Springer.
  35. P. MacAlpine, F. Barrera, and P. Stone. 2013. Positioning to win: A dynamic role assignment and formation positioning system. In RoboCup 2012: Robot Soccer World Cup XVI, X. Chen, P. Stone, L. E. Sucar, and T. van der Zant (Eds.). Springer-Verlag, Berlin.
  36. R. Nakanishi, K. Murakami, and T. Naruse. 2008. Dynamic positioning method based on dominant region diagram to realize successful cooperative play. In RoboCup 2007: Robot Soccer World Cup XI, U. Visser, F. Ribeiro, T. Ohashi, and F. Dellaert (Eds.). Springer, Berlin, 488--495.
  37. L. Panait, S. Luke, and R. P. Wiegand. 2006. Biasing coevolutionary search for optimal multiagent behaviors. IEEE Transactions on Evolutionary Computation 10, 6, 629--645. DOI:http://doi.org/10.1109/TEVC.2006.880330
  38. L. Panait, R. P. Wiegand, and S. Luke. 2003. Improving coevolutionary search for optimal multiagent behaviors. In Proceedings of the 18th International Joint Conference on Artificial Intelligence. Morgan Kaufmann Publishers Inc., San Francisco, CA, 653--658.
  39. B. Price and C. Boutilier. 2001. Imitation and reinforcement learning in agents with heterogeneous actions. In Proceedings of the 14th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence. Springer-Verlag, London, UK, 111--120.
  40. B. Price and C. Boutilier. 2003. A Bayesian approach to imitation in reinforcement learning. In Proceedings of the 18th International Joint Conference on Artificial Intelligence. Morgan Kaufmann Publishers Inc., San Francisco, CA, 712--717.
  41. S. Sariel and T. Balch. 2005a. A framework for multi-robot coordination. In Proceedings of the International Conference on Automated Planning & Scheduling (ICAPS'05).
  42. S. Sariel and T. Balch. 2005b. A framework for multi-robot coordination. In Proceedings of the International Conference on Automated Planning & Scheduling (ICAPS'05), Doctoral Consortium.
  43. S. Natarajan, S. Joshi, P. Tadepalli, K. Kersting, and J. Shavlik. 2011. Imitation learning in relational domains: A functional-gradient boosting approach. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'11).
  44. S. Raza, S. Haider, and M.-A. Williams. 2012. Teaching coordinated strategies to soccer robots via imitation. In Proceedings of the 2012 IEEE International Conference on Robotics and Biomimetics (ROBIO). 1434--1439.
  45. P. Stone. 2000. Layered Learning in Multiagent Systems: A Winning Approach to Robotic Soccer. A Bradford Book.
  46. P. Stone, P. Riley, and M. Veloso. 2000. Defining and using ideal teammate and opponent models. In Proceedings of the 12th Innovative Applications of Artificial Intelligence Conference. 1040--1045.
  47. P. Stone and D. McAllester. 2001. An architecture for action selection in robotic soccer. In Proceedings of the 5th International Conference on Autonomous Agents. ACM, New York, NY, 316--323. DOI:http://doi.org/10.1145/375735.376320
  48. P. Stone and M. M. Veloso. 1999. Task decomposition and dynamic role assignment for real-time strategic teamwork. In Proceedings of the 5th International Workshop on Intelligent Agents V, Agent Theories, Architectures, and Languages. Springer-Verlag, London, UK, 293--308.
  49. A. Wai. 2011. Learning by Imitation using Inductive Logic Programming. Carleton University, Ottawa, Ontario, Canada.
  50. J. Wu, Z. Cai, S. Zeng, and X. Zhu. 2013. Artificial immune system for attribute weighted naive Bayes classification. In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN'13). 1--8. DOI:http://doi.org/10.1109/IJCNN.2013.6706818
  51. N. A. Zaidi, J. Cerquides, M. J. Carman, and G. I. Webb. 2013. Alleviating naive Bayes attribute independence assumption by attribute weighting. Journal of Machine Learning Research 14, 1947--1988.
  52. H. Zhang and S. Sheng. 2004. Learning weighted naive Bayes with accurate ranking. In Proceedings of the 4th IEEE International Conference on Data Mining (ICDM'04). 567--570. DOI:http://doi.org/10.1109/ICDM.2004.10030
  53. B. D. Ziebart, A. Maas, J. A. Bagnell, and A. K. Dey. 2008. Maximum entropy inverse reinforcement learning. In Proceedings of the 23rd National Conference on Artificial Intelligence, Vol. 3. AAAI Press, 1433--1438.
