
A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control

Published: 27 September 2021

Editorial Notes

The authors have requested minor, non-substantive changes to the VoR and, in accordance with ACM policies, a Corrected VoR was published on December 27, 2021. For reference purposes the VoR may still be accessed via the Supplemental Material section on this page.


Abstract

We present a simple and intuitive approach for interactive control of physically simulated characters. Our work builds upon generative adversarial networks (GANs) and reinforcement learning, and introduces an imitation learning framework in which an ensemble of classifiers and an imitation policy are trained in tandem given pre-processed reference clips. The classifiers are trained to discriminate the reference motion from the motion generated by the imitation policy, while the policy is rewarded for fooling the discriminators. Using our GAN-like approach, multiple motor control policies can be trained separately to imitate different behaviors. At runtime, our system can respond to external control signals provided by the user and interactively switch between different policies. Compared to existing methods, our proposed approach has the following attractive properties: 1) it achieves state-of-the-art imitation performance without manually designing and fine-tuning a reward function; 2) it directly controls the character without having to track any target reference pose, either explicitly or implicitly through a phase state; and 3) it supports interactive policy switching without requiring any motion generation or motion matching mechanism. We highlight the applicability of our approach in a range of imitation and interactive control tasks, while also demonstrating its ability to withstand external perturbations and to recover balance. Overall, our approach has low runtime cost and can be easily integrated into interactive applications and games.
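The adversarial scheme described in the abstract — an ensemble of discriminators trained to separate reference transitions from policy-generated ones, with the policy rewarded for fooling them — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the `Discriminator` class, the four-dimensional transition features, and the `-log(1 - D(x))` reward mapping are all assumptions made for the example.

```python
import math
import random

class Discriminator:
    """Tiny logistic classifier: reference transitions -> 1, policy transitions -> 0."""
    def __init__(self, dim, seed):
        rng = random.Random(seed)
        self.w = [rng.gauss(0.0, 0.1) for _ in range(dim)]
        self.b = 0.0

    def score(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, real, fake, lr=0.1):
        # one gradient step of binary cross-entropy on a (real, fake) pair
        for x, y in ((real, 1.0), (fake, 0.0)):
            err = self.score(x) - y
            self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
            self.b -= lr * err

def imitation_reward(ensemble, transition):
    """Policy reward: how strongly the ensemble believes the transition is real."""
    # -log(1 - D(x)) grows as a discriminator is fooled; clip for stability
    return sum(-math.log(max(1e-6, 1.0 - d.score(transition)))
               for d in ensemble) / len(ensemble)

ensemble = [Discriminator(dim=4, seed=k) for k in range(3)]
ref = [1.0, 0.5, -0.2, 0.3]   # feature vector of a reference-motion transition
gen = [0.2, -0.1, 0.8, -0.4]  # feature vector of a policy-generated transition
for _ in range(200):
    for d in ensemble:
        d.update(ref, gen)
# after training, reference-like transitions earn a much higher reward
```

In the paper's setting, the transition features would come from the pre-processed reference clips and the simulated character respectively, and the policy's reinforcement learning objective would maximize this discriminator-derived reward in place of a hand-designed tracking reward.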



Published in

Proceedings of the ACM on Computer Graphics and Interactive Techniques (pacmcgit), Volume 4, Issue 3
September 2021, 268 pages
EISSN: 2577-6193
DOI: 10.1145/3488568
Copyright © 2021 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

