Abstract
Bayesian Neural Networks (BNNs) have been proposed to address the problem of model uncertainty in training and inference. By associating weights with conditional probability distributions, BNNs can mitigate the overfitting commonly seen in conventional neural networks and allow for small-data training through the variational inference process. The frequent use of Gaussian random variables in this process requires a properly optimized Gaussian Random Number Generator (GRNG), and the high hardware cost of conventional GRNGs makes the hardware implementation of BNNs challenging. In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs. We explore the design space for the massive Gaussian-variable sampling tasks in BNNs. Specifically, we introduce two high-performance Gaussian (pseudo) random number generators: 1) the RAM-based Linear Feedback Gaussian Random Number Generator (RLF-GRNG), inspired by the properties of the binomial distribution and linear feedback logic; and 2) the Bayesian Neural Network-oriented Wallace Gaussian Random Number Generator. To achieve high scalability and efficient memory access, we propose a deeply pipelined accelerator architecture with fast execution and good hardware utilization. Experimental results demonstrate that the proposed VIBNN implementations on an FPGA achieve a throughput of 321,543.4 Images/s and an energy efficiency of up to 52,694.8 Images/J while maintaining accuracy similar to that of the software counterpart.
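The binomial/linear-feedback principle behind the RLF-GRNG can be illustrated with a minimal software sketch. This is not the paper's hardware design; it is an assumption-laden toy showing why summing pseudorandom bits yields approximately Gaussian samples: the sum of n fair bits follows Binomial(n, 1/2), which by the central limit theorem approaches N(n/2, n/4). The 16-bit Fibonacci LFSR below (taps for the polynomial x^16 + x^14 + x^13 + x^11 + 1, a known maximal-length configuration) stands in for the linear feedback logic; the function names are hypothetical.

```python
def lfsr_bits(seed=0xACE1):
    """16-bit Fibonacci LFSR (maximal-length taps 16, 14, 13, 11).

    Yields one pseudorandom bit per step; the sequence repeats with
    period 2**16 - 1. Illustrative stand-in for hardware feedback logic.
    """
    state = seed
    while True:
        # Feedback bit is the XOR of the tapped positions.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield bit


def approx_gaussian(bits, n=64):
    """Approximate one N(0, 1) sample from n pseudorandom bits.

    The bit sum S is Binomial(n, 1/2); standardizing it as
    (S - n/2) / sqrt(n/4) gives an approximately standard normal value.
    """
    s = sum(next(bits) for _ in range(n))
    return (s - n / 2) / (n / 4) ** 0.5
```

Note the trade-off this sketch makes visible: a larger n improves the normal approximation (and the tail coverage) at the cost of consuming more bits per sample, which is exactly the kind of cost/quality balance a hardware GRNG must optimize.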
VIBNN: Hardware Acceleration of Bayesian Neural Networks. In ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems.