Abstract
3D human motion capture is a form of multimedia data widely used in entertainment and in medical fields such as orthopedics, physical medicine, and rehabilitation, where gait analysis is needed. These applications typically build large repositories of motion capture data and need efficient, accurate content-based retrieval techniques. 3D motion capture data takes the form of multidimensional time series. To reduce the dimensionality of human motion data while preserving semantically important features, we quantize the data by extracting spatio-temporal features through singular value decomposition (SVD) and translating them into a symbolic sequential representation through our proposed sGMMEM (semantic Gaussian Mixture Modeling with EM). To handle variations in motion capture data caused by human body characteristics and speed of motion, we transform the semantically quantized values into a histogram representation, which serves as a signature for classification and similarity-based retrieval. We achieved good classification accuracies for “coarse” human motion categories (walk 92.85%, run 91.42%, and jump 94.11%) and even for subtle categories (dance 89.47%, laugh 83.33%, basketball signal 85.71%, and golf putting 80.00%). Experiments also showed that the proposed approach outperforms earlier techniques such as the wMSV (weighted Motion Singular Vector) approach and the LB_Keogh method.
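As a rough illustration of why a histogram signature can absorb differences in motion speed, the following minimal Python sketch turns a symbol sequence into a normalized histogram and ranks stored motions by distance to a query. It is a sketch under assumptions, not the paper's implementation: the symbol sequences are assumed to come from some upstream quantizer (the abstract names sGMMEM but gives no details), and the function names, the toy data, and the choice of L1 distance are all hypothetical.

```python
from collections import Counter


def histogram_signature(symbols, num_symbols):
    """Return the normalized frequency of each symbol 0..num_symbols-1.

    Normalizing by sequence length makes the signature independent of
    how many frames the motion lasts (an assumption of this sketch).
    """
    if not symbols:
        raise ValueError("empty symbol sequence")
    counts = Counter(symbols)
    total = len(symbols)
    return [counts.get(k, 0) / total for k in range(num_symbols)]


def l1_distance(h1, h2):
    """Sum of absolute differences between two histograms (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def retrieve(query, database, num_symbols):
    """Return database labels ordered from most to least similar to the query."""
    q = histogram_signature(query, num_symbols)
    scored = [
        (l1_distance(q, histogram_signature(seq, num_symbols)), label)
        for label, seq in database.items()
    ]
    return [label for _, label in sorted(scored)]


# Hypothetical toy data: the same walk cycle at two speeds, plus a jump.
walk_slow = [0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2]
walk_fast = [0, 1, 2, 0, 1, 2]
jump = [3, 3, 3, 1, 3, 3]

ranking = retrieve(walk_fast, {"walk": walk_slow, "jump": jump}, num_symbols=4)
print(ranking)
```

In this toy case the slow and fast walks produce the same signature even though their lengths differ, so the walk is ranked ahead of the jump. A histogram discards symbol order, so any distinction that depends on ordering would need to be captured by the quantization step itself.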
Index Terms
Knowledge discovery from 3D human motion streams through semantic dimensional reduction