Abstract
Hardware implementations of deep neural networks (DNNs) have been adopted in many systems because of their higher classification speed. However, although larger DNNs may achieve better accuracy, they require significant energy and area, which limits their wide adoption. The energy consumption of DNNs is driven by both memory accesses and computation. Binarized neural networks (BNNs), as a tradeoff between accuracy and energy consumption, can achieve large energy reductions and, owing to their regularization effect, good accuracy for large DNNs. However, BNNs show poor accuracy when a smaller DNN configuration is adopted. In this article, we propose a new DNN architecture, LightNN, which replaces each multiplication with one shift or a constrained number of shifts and adds. Our theoretical analysis for LightNNs shows that their accuracy is maintained while storage and energy requirements are dramatically reduced. For a fixed DNN configuration, LightNNs have better accuracy than BNNs at a slight energy increase, yet are more energy efficient than conventional DNNs with only slightly lower accuracy. Therefore, LightNNs provide more options for hardware designers to trade off accuracy and energy. Moreover, for large DNN configurations, LightNNs have a regularization effect, making them more accurate than conventional DNNs. These conclusions are verified by experiments on the MNIST and CIFAR-10 datasets for different DNN configurations. Our FPGA implementation of conventional DNNs and LightNNs confirms all theoretical and simulation results and shows that LightNNs reduce latency and use fewer FPGA resources than conventional DNN architectures.
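As an illustration of how a multiplication can be replaced by one shift or a constrained number of shifts and adds, the following minimal Python sketch approximates a weight by a sum of at most k signed powers of two and then applies it to an integer (fixed-point) activation using only shifts and additions. The function names `to_shift_terms` and `shift_multiply`, the greedy rounding rule, and the use of plain Python integers are assumptions made for this example; the abstract does not specify the quantization or training procedure, so this should not be read as the authors' method.

```python
import math

def to_shift_terms(w, k=1):
    """Greedily approximate w by at most k signed powers of two.

    Returns a list of (sign, exponent) pairs such that
    w is approximately sum(sign * 2**exponent).
    Hypothetical helper for illustration only.
    """
    terms = []
    residual = w
    for _ in range(k):
        if residual == 0:
            break  # exact representation reached
        magnitude = abs(residual)
        lo = math.floor(math.log2(magnitude))  # power of two at or below
        hi = lo + 1                            # power of two above
        # Pick whichever neighbouring power of two is closer (ties go low).
        exp = lo if magnitude - 2.0 ** lo <= 2.0 ** hi - magnitude else hi
        sign = 1 if residual > 0 else -1
        terms.append((sign, exp))
        residual -= sign * 2.0 ** exp
    return terms

def shift_multiply(x, terms):
    """Multiply integer x by the quantized weight using shifts and adds."""
    acc = 0
    for sign, exp in terms:
        # Left shift for non-negative exponents, right shift otherwise.
        shifted = x << exp if exp >= 0 else x >> -exp
        acc += sign * shifted
    return acc

one_shift = to_shift_terms(0.75, k=1)   # a single power of two
two_shifts = to_shift_terms(0.75, k=2)  # two powers of two
print(one_shift, shift_multiply(64, one_shift))
print(two_shifts, shift_multiply(64, two_shifts))
```

With k = 1 the weight 0.75 is rounded to a single power of two, so the product is only approximate; with k = 2 it is represented exactly as 2^-1 + 2^-2. This mirrors the accuracy versus energy tradeoff described in the abstract: allowing more shift-and-add terms per weight costs more operations but reduces quantization error.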
Lightening the Load with Highly Accurate Storage- and Energy-Efficient LightNNs