Abstract
In mainstream approaches to 3D human action recognition, depth and skeleton features are combined to improve recognition accuracy. However, this strategy yields high-dimensional, poorly discriminative features because the fused vectors are redundant. To address this drawback, a multi-feature selection approach for 3D human action recognition is proposed in this paper. First, three novel single-modal features are proposed to describe depth appearance, depth motion, and skeleton motion. Second, the classification entropy of a random forest is used to evaluate the discriminative power of the depth-appearance-based feature. Finally, one of the three features is selected to recognize each sample according to this discrimination evaluation. Experimental results show that the proposed multi-feature selection approach significantly outperforms approaches based on single-modal features and feature fusion.
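The selection step described above can be sketched in code: a random forest trained on one feature modality emits class-probability votes, and the Shannon entropy of those votes measures how confidently (i.e., how discriminatively) that modality classifies a given sample. The sketch below is a minimal illustration, not the paper's implementation; the synthetic data, the single trained forest, and the entropy threshold are all assumptions for demonstration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def classification_entropy(probs):
    """Shannon entropy of a forest's class-probability vote per sample.

    Low entropy means the forest votes confidently, so the feature
    behind those votes is discriminative for that sample.
    """
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)

# Hypothetical stand-in for one modality (e.g., depth-appearance features).
X, y = make_classification(n_samples=200, n_features=20, n_classes=3,
                           n_informative=5, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

H = classification_entropy(rf.predict_proba(X))

# Assumed cut-off (not from the paper): keep the depth-appearance
# prediction when its vote entropy is low; otherwise fall back to
# a motion-based feature for that sample.
threshold = 0.5
use_depth_appearance = H < threshold
```

In a full system one would train one classifier per modality and route each test sample to the motion-based features only when the depth-appearance vote entropy exceeds the threshold; the threshold itself would be tuned on validation data.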
Multifeature Selection for 3D Human Action Recognition