
Deep Learning for Mobile Multimedia: A Survey

Published: 28 June 2017

Abstract

Deep Learning (DL) has become a crucial technology for multimedia computing. It offers a powerful instrument for automatically producing high-level abstractions of complex multimedia data, which can be exploited in a number of applications, including object detection and recognition, speech-to-text, media retrieval, and multimodal data analysis. The availability of affordable large-scale parallel processing architectures, together with the sharing of effective open-source implementations of the basic learning algorithms, has driven a rapid diffusion of DL methodologies, yielding new technologies and applications that in most cases outperform traditional machine learning approaches. In recent years, the possibility of implementing DL technologies on mobile devices has attracted significant attention: thanks to this technology, portable devices may become smart objects capable of learning and acting. The path toward these exciting future scenarios, however, entails a number of important research challenges. DL architectures and algorithms are not easily adapted to the storage and computation resources of a mobile device. There is therefore a need for new generations of mobile processors and chipsets, small-footprint learning and inference algorithms, new models of collaborative and distributed processing, and a number of other fundamental building blocks. This survey reports the state of the art in this exciting research area, looking back at the evolution of neural networks and arriving at the most recent results in terms of methodologies, technologies, and applications for mobile environments.
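One of the small-footprint techniques surveyed is weight quantization, which stores each network weight as a low-bit integer code instead of a 32-bit float (cf. Han et al.'s Deep Compression line of work). The following is a minimal, illustrative sketch of uniform 8-bit quantization; all function names are hypothetical and the snippet is not taken from any of the surveyed systems.

```python
# Illustrative sketch: shrinking a model's memory footprint by uniform
# 8-bit quantization of its weights. Each float weight is mapped to an
# integer code in [0, 255] plus a shared (scale, offset) pair.

def quantize_weights(weights, num_bits=8):
    """Map float weights to integer codes plus a (scale, offset) pair."""
    lo, hi = min(weights), max(weights)
    levels = (1 << num_bits) - 1          # 255 codes for 8 bits
    scale = (hi - lo) / levels or 1.0     # guard against constant weights
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + lo for c in codes]

w = [-0.51, 0.02, 0.48, 0.07, -0.13]
codes, scale, lo = quantize_weights(w)
w_hat = dequantize(codes, scale, lo)
# Each code fits in one byte instead of four, and the reconstruction
# error of any weight is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
assert max_err <= scale / 2 + 1e-12
```

At 8 bits this alone gives roughly a 4x reduction in weight storage; the compression pipelines discussed in the survey combine such quantization with pruning and entropy coding for much larger gains.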


• Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 13, Issue 3s
Special Section on Deep Learning for Mobile Multimedia and Special Section on Best Papers from ACM MMSys/NOSSDAV 2016
August 2017, 258 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3119899
Copyright © 2017 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 March 2017
• Revised: 1 April 2017
• Accepted: 1 April 2017
• Published: 28 June 2017


            Qualifiers

            • research-article
            • Research
            • Refereed
