Abstract
Recently, the deep computation model, a tensor-based deep learning model, has achieved superior performance for multimedia feature learning. However, the conventional deep computation model involves a large number of parameters. Training a deep computation model with millions of parameters typically requires high-performance servers with large-scale memory and powerful computing units, which limits the growth of the model size for multimedia feature learning on common devices such as portable devices and conventional desktops. To tackle this problem, this article proposes a Tucker deep computation model that uses the Tucker decomposition to compress the weight tensors in the fully connected layers for multimedia feature learning. Furthermore, a learning algorithm based on the back-propagation strategy is devised to train the parameters of the Tucker deep computation model. Finally, the Tucker deep computation model is evaluated against the conventional deep computation model on two representative multimedia datasets, CUAVE and SNAE2, in terms of accuracy drop, parameter reduction, and speedup. The results imply that the Tucker deep computation model can achieve a large parameter reduction and speedup with only a small accuracy drop for multimedia feature learning.
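The compression idea behind the model can be sketched in plain NumPy: a weight tensor of a fully connected tensor layer is replaced by a small core tensor plus one factor matrix per mode, computed here via truncated higher-order SVD (HOSVD), a standard way to obtain a Tucker decomposition. The tensor shape `(32, 32, 32)` and ranks `(4, 4, 4)` are illustrative assumptions, not the paper's actual layer sizes, and the paper's own training procedure (back-propagation on the factorized parameters) is not reproduced here.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_dot(T, M, mode):
    """Mode-n product T x_n M, where M has shape (new_dim, T.shape[mode])."""
    Tm = np.moveaxis(T, mode, 0)
    out = np.tensordot(M, Tm, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def tucker_hosvd(T, ranks):
    """Truncated HOSVD: factor U_n = leading singular vectors of each unfolding."""
    factors = [np.linalg.svd(unfold(T, n), full_matrices=False)[0][:, :r]
               for n, r in enumerate(ranks)]
    core = T
    for n, U in enumerate(factors):
        core = mode_dot(core, U.T, n)  # project onto each factor subspace
    return core, factors

def tucker_reconstruct(core, factors):
    """Rebuild the full tensor from the core and factor matrices."""
    T = core
    for n, U in enumerate(factors):
        T = mode_dot(T, U, n)
    return T

# Demo: a synthetic weight tensor with exact multilinear rank (4, 4, 4).
rng = np.random.default_rng(0)
factors_true = [np.linalg.qr(rng.standard_normal((32, 4)))[0] for _ in range(3)]
W = tucker_reconstruct(rng.standard_normal((4, 4, 4)), factors_true)

core, factors = tucker_hosvd(W, (4, 4, 4))
err = np.linalg.norm(W - tucker_reconstruct(core, factors)) / np.linalg.norm(W)

full_params = W.size                                    # 32**3 = 32768
tucker_params = core.size + sum(U.size for U in factors)  # 4**3 + 3*32*4 = 448
print(full_params, tucker_params, err)
```

Because the synthetic tensor is exactly low-rank, the reconstruction error is near machine precision while the stored parameters shrink from 32,768 to 448, roughly a 73x reduction; for a trained layer the ranks trade reconstruction error against compression, which is the accuracy-drop/parameter-reduction trade-off the paper measures.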
A Tucker Deep Computation Model for Mobile Multimedia Feature Learning