Editorial Notes
The authors have requested minor, non-substantive changes to the VoR (Version of Record) and, in accordance with ACM policies, a Corrected VoR was published on December 27, 2021. For reference purposes, the original VoR may still be accessed via the Supplemental Material section on this page.
Abstract
We present a simple and intuitive approach for interactive control of physically simulated characters. Our work builds upon generative adversarial networks (GANs) and reinforcement learning, and introduces an imitation learning framework where an ensemble of classifiers and an imitation policy are trained in tandem given pre-processed reference clips. The classifiers are trained to discriminate the reference motion from the motion generated by the imitation policy, while the policy is rewarded for fooling the discriminators. Using our GAN-like approach, multiple motor control policies can be trained separately to imitate different behaviors. At runtime, our system can respond to external control signals provided by the user and interactively switch between different policies. Compared to existing methods, our proposed approach has the following attractive properties: 1) it achieves state-of-the-art imitation performance without manually designing and fine-tuning a reward function; 2) it directly controls the character without having to track any target reference pose, explicitly or implicitly through a phase state; and 3) it supports interactive policy switching without requiring any motion generation or motion matching mechanism. We highlight the applicability of our approach in a range of imitation and interactive control tasks, while also demonstrating its ability to withstand external perturbations and to recover balance. Overall, our approach has low runtime cost and can be easily integrated into interactive applications and games.
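The core adversarial loop described in the abstract can be sketched in a few lines: an ensemble of classifiers is trained to label reference motion features as real and policy-generated features as fake, while the policy receives a reward that grows as it fools the ensemble. The sketch below is a minimal NumPy illustration of this idea under simplifying assumptions (logistic classifiers on toy feature vectors), not the paper's actual implementation; the class name `DiscriminatorEnsemble` and all hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DiscriminatorEnsemble:
    """Ensemble of logistic classifiers: label 1 = reference motion, 0 = policy motion."""
    def __init__(self, n_models, dim, lr=0.5):
        self.W = rng.normal(0.0, 0.1, size=(n_models, dim))
        self.b = np.zeros(n_models)
        self.lr = lr

    def scores(self, x):
        # probability each ensemble member assigns to "reference motion"
        return sigmoid(self.W @ x + self.b)

    def train_step(self, ref_batch, pol_batch):
        # one SGD pass of binary cross-entropy over both batches
        for batch, label in [(ref_batch, 1.0), (pol_batch, 0.0)]:
            for sample in batch:
                p = self.scores(sample)
                grad = p - label                    # dBCE/dlogit
                self.W -= self.lr * np.outer(grad, sample)
                self.b -= self.lr * grad

    def reward(self, x):
        # GAN-style imitation reward: large when the policy fools the ensemble
        p = np.clip(self.scores(x), 1e-6, 1.0 - 1e-6)
        return float(np.mean(-np.log(1.0 - p)))

# toy stand-ins for motion features: "reference" clusters at +1, "policy" at -1
dim = 4
ref = rng.normal(+1.0, 0.3, size=(64, dim))
pol = rng.normal(-1.0, 0.3, size=(64, dim))

disc = DiscriminatorEnsemble(n_models=3, dim=dim)
for _ in range(20):
    disc.train_step(ref, pol)

# a policy sample that closely imitates the reference earns a higher reward
print(disc.reward(ref[0]), disc.reward(pol[0]))
```

In the full method this reward would replace a hand-tuned tracking reward inside a reinforcement learning loop, with the ensemble and the policy updated in tandem as the abstract describes.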
Supplemental Material
Available for Download
Supplemental movie, appendix, image, and software files for "A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control"
Version of Record for "A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control" by Xu et al., Proceedings of the ACM on Computer Graphics and Interactive Techniques, Volume 4, Issue 3 (PACMCGIT 4:3).