Abstract
Content-based image retrieval (CBIR) is one of the most important applications of computer vision. In recent years, CBIR systems have advanced considerably, driven largely by Convolutional Neural Networks (CNNs) and other deep-learning techniques. However, current CNN-based CBIR systems suffer from the high computational complexity of CNNs, a problem that becomes more severe as mobile applications grow in popularity. The current practice is to deploy the entire CBIR system on the server side, while the client side serves only as an image provider. This architecture increases the computational burden on the server side, which needs to process thousands of requests per second. Moreover, sending images to a server risks leaking personal information, and privacy concerns are growing as the demand for mobile search expands. In this article, we propose a fast image search framework, named DeepSearch, which makes complex CNN-based image search feasible on mobile phones. To cope with the heavy computation of CNN models, we present a tensor Block Term Decomposition (BTD) approach, together with a nonlinear response reconstruction method, to accelerate the CNNs involved in object detection and feature extraction. Extensive experiments on the ImageNet dataset and the Alibaba Large-scale Image Search Challenge dataset show that the proposed BTD-based acceleration approach significantly speeds up the CNN models and makes CNN-based image search practical on common smartphones.
DeepSearch: A Fast Image Search Framework for Mobile Devices