MoCap-solver: a neural solver for optical motion capture data

Published: 19 July 2021

Abstract

In a conventional optical motion capture (MoCap) workflow, two processes are needed to turn captured raw marker sequences into correct skeletal animation sequences. First, various tracking errors present in the markers must be fixed (cleaning or refining). Second, an agent skeletal mesh must be prepared for the actor/actress and used to determine skeleton information from the markers (re-targeting or solving). The whole process, normally referred to as solving MoCap data, is extremely time-consuming and labor-intensive, and it is usually the most costly part of animation production. Hence, there is great demand for automated tools in industry. In this work, we present MoCap-Solver, a production-ready neural solver for optical MoCap data. It can directly produce skeleton sequences and clean marker sequences from raw MoCap markers, without any tedious manual operations. To achieve this goal, our key idea is to use neural encoders for three key intrinsic components: the template skeleton, the marker configuration, and the motion, and to learn to predict the corresponding latent vectors from imperfect marker sequences containing noise and errors. By decoding these components from the latent vectors, sequences of clean markers and skeletons can be directly recovered. Moreover, we provide a novel normalization strategy based on learning a pose-dependent marker reliability function, which greatly improves system robustness. Experimental results demonstrate that our algorithm consistently outperforms the state of the art on both synthetic and real-world datasets.
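
To make the pipeline concrete, below is a minimal PyTorch sketch of the encode/decode idea the abstract describes. Everything in it is an assumption for illustration: the tensor sizes, the plain MLP encoders/decoders, the split of decoder inputs (skeleton from template + motion; markers from all three components), and the names MoCapSolverSketch and reliability_normalize are hypothetical stand-ins, not the authors' actual network design, which the full paper specifies.

```python
import torch
import torch.nn as nn

# Hypothetical sizes (the abstract does not fix these): markers and joints
# per frame, frames per window, and the latent dimensions of the three
# intrinsic components.
N_MARKERS, N_JOINTS, WINDOW = 56, 24, 64
D_TEMPLATE, D_CONFIG, D_MOTION = 168, 256, 512

def mlp(d_in, d_hid, d_out):
    return nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU(),
                         nn.Linear(d_hid, d_out))

def reliability_normalize(raw, reliability):
    """Sketch of the pose-dependent marker-reliability idea: down-weight
    unreliable markers when centering each frame of the window. The exact
    weighting scheme here is a guess, not the paper's formulation.
    raw: (B, WINDOW, N_MARKERS, 3); reliability: (B, WINDOW, N_MARKERS)."""
    w = reliability.unsqueeze(-1)
    centroid = ((w * raw).sum(dim=2, keepdim=True)
                / w.sum(dim=2, keepdim=True).clamp(min=1e-6))
    return raw - centroid

class MoCapSolverSketch(nn.Module):
    """Encode a noisy marker window into three latent components
    (template skeleton, marker configuration, motion), then decode a
    skeleton sequence and a clean marker sequence."""
    def __init__(self):
        super().__init__()
        d_raw = WINDOW * N_MARKERS * 3
        self.enc_template = mlp(d_raw, 1024, D_TEMPLATE)  # static body proportions
        self.enc_config = mlp(d_raw, 1024, D_CONFIG)      # marker placement on the body
        self.enc_motion = mlp(d_raw, 2048, D_MOTION)      # per-frame pose
        self.dec_skeleton = mlp(D_TEMPLATE + D_MOTION, 2048,
                                WINDOW * N_JOINTS * 3)    # joint positions per frame
        self.dec_markers = mlp(D_TEMPLATE + D_CONFIG + D_MOTION, 2048,
                               WINDOW * N_MARKERS * 3)    # denoised marker positions

    def forward(self, markers):                           # (B, WINDOW, N_MARKERS, 3)
        x = markers.flatten(1)
        z_t, z_c, z_m = self.enc_template(x), self.enc_config(x), self.enc_motion(x)
        skeleton = self.dec_skeleton(torch.cat([z_t, z_m], dim=1))
        clean = self.dec_markers(torch.cat([z_t, z_c, z_m], dim=1))
        return (skeleton.view(-1, WINDOW, N_JOINTS, 3),
                clean.view(-1, WINDOW, N_MARKERS, 3))

# Usage: normalize a raw window with per-marker reliabilities, then solve.
model = MoCapSolverSketch()
raw = torch.randn(2, WINDOW, N_MARKERS, 3)
rel = torch.rand(2, WINDOW, N_MARKERS)
skeleton_seq, clean_markers = model(reliability_normalize(raw, rel))
```

The point of the split decoders is the dependency structure the abstract implies: the skeleton is determined by who is moving (template) and how (motion), while clean marker positions additionally require where the markers sit on the body (configuration).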


Supplemental Material

3450626.3459681.mp4
a84-chen.mp4



• Published in

  ACM Transactions on Graphics, Volume 40, Issue 4 (August 2021)
  2170 pages
  ISSN: 0730-0301
  EISSN: 1557-7368
  DOI: 10.1145/3450626

  Copyright © 2021 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

