research-article

Learning basketball dribbling skills using trajectory optimization and deep reinforcement learning

Published: 30 July 2018

Abstract

Basketball is one of the world's most popular sports because of the agility and speed demonstrated by the players. This agility and speed make designing controllers that achieve robust control of basketball skills a challenge for physics-based character animation. The highly dynamic behaviors and precise manipulation of the ball that occur in the game are difficult to reproduce for simulated players. In this paper, we present an approach for learning robust basketball dribbling controllers from motion capture data. Our system decouples a basketball controller into locomotion control and arm control components and learns each component separately. To achieve robust control of the ball, we develop an efficient pipeline based on trajectory optimization and deep reinforcement learning, and we learn non-linear arm control policies. We also present a technique for learning skills and the transitions between skills simultaneously. Our system is capable of learning robust controllers for various basketball dribbling skills, such as dribbling between the legs and crossover moves. The resulting control graphs enable a simulated player to perform transitions between these skills and respond to user interaction.
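As a rough illustration of the two structural ideas the abstract names, the sketch below shows a control graph over dribbling skills and a controller decoupled into locomotion and arm components. All names, skills, and the toy "policies" are hypothetical stand-ins for this summary; this is not the authors' code, and the real system learns non-linear neural-network policies rather than the placeholder functions used here.

```python
# Hypothetical sketch, assuming only what the abstract states: a directed
# control graph whose nodes are skills and whose edges are learned
# transitions, plus a full-body controller split into locomotion and arm
# components whose outputs are combined into one action.

class ControlGraph:
    """Directed graph of skills; edges are the permitted transitions."""

    def __init__(self):
        self.edges = {}  # skill name -> set of skills reachable from it

    def add_transition(self, src, dst):
        self.edges.setdefault(src, set()).add(dst)
        self.edges.setdefault(dst, set())

    def can_transition(self, src, dst):
        return dst in self.edges.get(src, set())


def full_body_action(locomotion_policy, arm_policy, state):
    """Decoupled control: each component maps the state to its own target
    values; here they are simply concatenated into one full-body action."""
    return locomotion_policy(state) + arm_policy(state)


if __name__ == "__main__":
    graph = ControlGraph()
    graph.add_transition("crossover", "between_the_legs")
    graph.add_transition("between_the_legs", "crossover")
    print(graph.can_transition("crossover", "between_the_legs"))  # True

    # Toy linear stand-ins for the learned locomotion and arm policies.
    legs = lambda s: [0.5 * x for x in s]
    arm = lambda s: [x + 1.0 for x in s]
    print(full_body_action(legs, arm, [1.0, 2.0]))  # [0.5, 1.0, 2.0, 3.0]
```

In the paper's setting, each graph node would hold a learned skill controller and the edges would be the skill transitions learned simultaneously with the skills themselves.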


Supplemental Material

142-250.mp4

References

  1. Mazen Al Borno, Martin de Lasa, and Aaron Hertzmann. 2013. Trajectory Optimization for Full-Body Movements with Complex Contacts. IEEE Trans. Visual. Comput. Graph. 19, 8 (Aug 2013), 1405--1414.
  2. Sheldon Andrews and Paul G. Kry. 2013. Goal directed multi-finger manipulation: Control policies and analysis. Comput. Graph. 37, 7 (2013), 830--839.
  3. Yunfei Bai, Kristin Siu, and C. Karen Liu. 2012. Synthesis of Concurrent Object Manipulation Tasks. ACM Trans. Graph. 31, 6, Article 156 (Nov. 2012), 9 pages.
  4. Georg Bätz, Kwang-Kyu Lee, Dirk Wollherr, and Martin Buss. 2009. Robot basketball: A comparison of ball dribbling with visual and force/torque feedback. In 2009 IEEE International Conference on Robotics and Automation. 514--519.
  5. Georg Bätz, Uwe Mettin, Alexander Schmidts, Michael Scheint, Dirk Wollherr, and Anton S. Shiriaev. 2010. Ball dribbling with an underactuated continuous-time control phase: Theory & experiments. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. 2890--2895.
  6. Stelian Coros, Philippe Beaudoin, and Michiel van de Panne. 2009. Robust Task-based Control Policies for Physics-based Characters. ACM Trans. Graph. 28, 5, Article 170 (Dec. 2009), 9 pages.
  7. Stelian Coros, Philippe Beaudoin, and Michiel van de Panne. 2010. Generalized Biped Walking Control. ACM Trans. Graph. 29, 4, Article 130 (July 2010), 9 pages.
  8. Kai Ding, Libin Liu, Michiel van de Panne, and KangKang Yin. 2015. Learning Reduced-order Feedback Policies for Motion Skills. In Proceedings of the 14th ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '15). ACM, 83--92.
  9. Gaël Guennebaud, Benoît Jacob, and others. 2010. Eigen v3. http://eigen.tuxfamily.org.
  10. Sehoon Ha, Yuting Ye, and C. Karen Liu. 2012. Falling and Landing Motion Control for Character Animation. ACM Trans. Graph. 31, 6, Article 155 (Nov. 2012), 9 pages.
  11. Sami Haddadin, Kai Krieger, and Alin Albu-Schäffer. 2011. Exploiting elastic energy storage for cyclic manipulation: Modeling, stability, and observations for dribbling. In 2011 50th IEEE Conference on Decision and Control and European Control Conference. 690--697.
  12. Nikolaus Hansen. 2006. The CMA Evolution Strategy: A Comparing Review. In Towards a New Evolutionary Computation. Studies in Fuzziness and Soft Computing, Vol. 192. Springer Berlin Heidelberg, 75--102.
  13. Matthew J. Hausknecht and Peter Stone. 2015. Deep Reinforcement Learning in Parameterized Action Space. CoRR abs/1511.04143 (2015). http://arxiv.org/abs/1511.04143
  14. Jessica K. Hodgins, Wayne L. Wooten, David C. Brogan, and James F. O'Brien. 1995. Animating Human Athletics. In Proceedings of SIGGRAPH. 71--78.
  15. Sumit Jain and C. Karen Liu. 2009. Interactive Synthesis of Human-object Interaction. In Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '09). 47--53.
  16. Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980
  17. Paul G. Kry and Dinesh K. Pai. 2006. Interaction Capture and Synthesis. ACM Trans. Graph. 25, 3 (July 2006), 872--880.
  18. Taesoo Kwon and Jessica K. Hodgins. 2017. Momentum-Mapped Inverted Pendulum Models for Controlling Dynamic Human Motions. ACM Trans. Graph. 36, 1, Article 10 (Jan. 2017), 14 pages.
  19. Yoonsang Lee, Sungeun Kim, and Jehee Lee. 2010a. Data-driven Biped Control. ACM Trans. Graph. 29, 4, Article 129 (July 2010), 8 pages.
  20. Yongjoon Lee, Kevin Wampler, Gilbert Bernstein, Jovan Popović, and Zoran Popović. 2010b. Motion Fields for Interactive Character Locomotion. ACM Trans. Graph. 29, 6, Article 138 (Dec. 2010), 8 pages.
  21. Sergey Levine and Vladlen Koltun. 2013. Guided Policy Search. In Proceedings of the 30th International Conference on Machine Learning, Vol. 28(3). 1--9.
  22. Sergey Levine and Vladlen Koltun. 2014. Learning Complex Neural Network Policies with Trajectory Optimization. In Proceedings of the 31st International Conference on Machine Learning, Vol. 32(2). 829--837.
  23. Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2015. Continuous control with deep reinforcement learning. CoRR abs/1509.02971 (2015). http://arxiv.org/abs/1509.02971
  24. C. Karen Liu. 2009. Dextrous Manipulation from a Grasping Pose. ACM Trans. Graph. 28, 3, Article 59 (July 2009), 6 pages.
  25. C. Karen Liu, Aaron Hertzmann, and Zoran Popović. 2006. Composition of Complex Optimal Multi-character Motions. In Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '06). 215--222.
  26. Libin Liu and Jessica Hodgins. 2017. Learning to Schedule Control Fragments for Physics-Based Characters Using Deep Q-Learning. ACM Trans. Graph. 36, 3, Article 29 (June 2017), 14 pages.
  27. Libin Liu, Michiel van de Panne, and KangKang Yin. 2016. Guided Learning of Control Graphs for Physics-Based Characters. ACM Trans. Graph. 35, 3, Article 29 (May 2016), 14 pages.
  28. Libin Liu, KangKang Yin, Bin Wang, and Baining Guo. 2013. Simulation and Control of Skeleton-driven Soft Body Characters. ACM Trans. Graph. 32, 6 (2013), Article 215.
  29. Adriano Macchietto, Victor Zordan, and Christian R. Shelton. 2009. Momentum control for balance. ACM Trans. Graph. 28, 3 (2009).
  30. James McCann and Nancy Pollard. 2007. Responsive Characters from Motion Fragments. ACM Trans. Graph. 26, 3, Article 6 (July 2007).
  31. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. 2015b. Human-level control through deep reinforcement learning. Nature 518, 7540 (26 Feb 2015), 529--533.
  32. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, et al. 2015a. Human-level control through deep reinforcement learning. Nature 518, 7540 (26 Feb 2015), 529--533.
  33. Igor Mordatch, Martin de Lasa, and Aaron Hertzmann. 2010. Robust Physics-based Locomotion Using Low-dimensional Planning. ACM Trans. Graph. 29, 4, Article 71 (July 2010), 8 pages.
  34. Igor Mordatch, Kendall Lowrey, Galen Andrew, Zoran Popovic, and Emanuel V. Todorov. 2015. Interactive Control of Diverse Complex Characters with Neural Networks. In Advances in Neural Information Processing Systems 28. 3114--3122.
  35. Igor Mordatch, Zoran Popović, and Emanuel Todorov. 2012a. Contact-invariant Optimization for Hand Manipulation. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '12). 137--144.
  36. Igor Mordatch, Emanuel Todorov, and Zoran Popović. 2012b. Discovery of Complex Behaviors Through Contact-invariant Optimization. ACM Trans. Graph. 31, 4, Article 43 (July 2012), 8 pages.
  37. Uldarico Muico, Yongjoon Lee, Jovan Popović, and Zoran Popović. 2009. Contact-aware nonlinear control of dynamic characters. ACM Trans. Graph. 28, 3 (2009).
  38. Xue Bin Peng, Glen Berseth, Kangkang Yin, and Michiel Van De Panne. 2017. DeepLoco: Dynamic Locomotion Skills Using Hierarchical Deep Reinforcement Learning. ACM Trans. Graph. 36, 4, Article 41 (July 2017), 13 pages.
  39. Xue Bin Peng and Michiel van de Panne. 2017. Learning Locomotion Skills Using DeepRL: Does the Choice of Action Space Matter? In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '17). Article 12, 13 pages.
  40. Nancy S. Pollard and Victor Brian Zordan. 2005. Physically Based Grasping Control from Example. In Proceedings of the 2005 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '05). 311--318.
  41. Philipp Reist and Raffaello D'Andrea. 2012. Design and Analysis of a Blind Juggling Robot. IEEE Trans. Robot. 28, 6 (Dec 2012), 1228--1243.
  42. Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. 2017. Evolution Strategies as a Scalable Alternative to Reinforcement Learning. ArXiv e-prints (March 2017). arXiv:stat.ML/1703.03864
  43. John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. 2015a. Trust Region Policy Optimization. In The 32nd International Conference on Machine Learning. 1889--1897.
  44. John Schulman, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel. 2015b. High-Dimensional Continuous Control Using Generalized Advantage Estimation. CoRR abs/1506.02438 (2015). http://arxiv.org/abs/1506.02438
  45. David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, and Martin Riedmiller. 2014. Deterministic Policy Gradient Algorithms. In The 31st International Conference on Machine Learning. 387--395.
  46. Jie Tan, Yuting Gu, C. Karen Liu, and Greg Turk. 2014. Learning Bicycle Stunts. ACM Trans. Graph. 33, 4, Article 50 (July 2014), 12 pages.
  47. Jie Tan, Yuting Gu, Greg Turk, and C. Karen Liu. 2011. Articulated Swimming Creatures. ACM Trans. Graph. 30, 4, Article 58 (July 2011), 12 pages.
  48. Theano Development Team. 2016. Theano: A Python framework for fast computation of mathematical expressions. arXiv e-prints abs/1605.02688 (May 2016). http://arxiv.org/abs/1605.02688
  49. Adrien Treuille, Yongjoon Lee, and Zoran Popović. 2007. Near-optimal Character Animation with Continuous Control. ACM Trans. Graph. 26, 3, Article 7 (July 2007).
  50. Hado van Hasselt. 2012. Reinforcement Learning in Continuous State and Action Spaces. Springer, Berlin, Heidelberg, 207--251.
  51. Kevin Wampler and Zoran Popović. 2009. Optimal gait and form for animal locomotion. ACM Trans. Graph. 28, 3 (2009), Article 60.
  52. Jack M. Wang, David J. Fleet, and Aaron Hertzmann. 2010. Optimizing Walking Controllers for Uncertain Inputs and Environments. ACM Trans. Graph. 29, 4, Article 73 (July 2010), 8 pages.
  53. Nkenge Wheatland, Yingying Wang, Huaguang Song, Michael Neff, Victor Zordan, and Sophie Jörg. 2015. State of the Art in Hand and Finger Modeling and Animation. Comput. Graph. Forum 34, 2 (May 2015), 735--760.
  54. Yuting Ye and C. Karen Liu. 2012. Synthesis of detailed hand manipulations using contact sampling. ACM Trans. Graph. 31, 4 (2012), Article 41.
  55. KangKang Yin, Kevin Loken, and Michiel van de Panne. 2007. SIMBICON: Simple Biped Locomotion Control. ACM Trans. Graph. 26, 3 (2007), Article 105.
  56. Wenping Zhao, Jianjie Zhang, Jianyuan Min, and Jinxiang Chai. 2013. Robust Realtime Physics-based Motion Control for Human Grasping. ACM Trans. Graph. 32, 6, Article 207 (Nov. 2013), 12 pages.
  57. Victor Zordan, David Brown, Adriano Macchietto, and KangKang Yin. 2014. Control of Rotational Dynamics for Ground and Aerial Behavior. IEEE Trans. Visual. Comput. Graph. 20, 10 (Oct 2014), 1356--1366.


• Published in

  ACM Transactions on Graphics, Volume 37, Issue 4 (August 2018), 1670 pages
  ISSN: 0730-0301
  EISSN: 1557-7368
  DOI: 10.1145/3197517

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


