Research article (Public Access)

CxDNN: Hardware-software Compensation Methods for Deep Neural Networks on Resistive Crossbar Systems

Published: 15 November 2019

Abstract

Resistive crossbars have shown strong potential as the building blocks of future neural fabrics, due to their ability to natively execute vector-matrix multiplication (the dominant computational kernel in DNNs). However, a key challenge is that non-idealities in the synaptic devices, interconnects, and peripheral circuits of resistive crossbars lead to errors in the computations performed. When large-scale DNNs are executed on resistive crossbar systems, these errors compound and result in unacceptable degradation in application-level accuracy. We propose CxDNN, a hardware-software methodology that enables the realization of large-scale DNNs on crossbar systems by compensating for errors due to non-idealities, greatly mitigating the degradation in accuracy. CxDNN is composed of (i) an optimized mapping technique to convert floating-point weights and activations to crossbar conductances and input voltages, (ii) a fast one-time re-training method to recover the accuracy loss due to this conversion, and (iii) low-overhead compensation hardware to mitigate dynamic and hardware-instance-specific errors. Unlike previous efforts that are limited to small networks and require the training and deployment of hardware-instance-specific models, CxDNN presents a scalable compensation methodology that can address large DNNs (e.g., ResNet-50 on ImageNet) and maintains the train-once, deploy-anywhere tenet of current DNN applications. We evaluated CxDNN on six top DNNs on the ImageNet dataset with 0.5--13.8 million neurons and 0.5--15.5 billion connections. CxDNN achieves a 16.9%--49% improvement in top-1 classification accuracy, effectively mitigating a key challenge to the use of resistive crossbar-based neural fabrics.
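To make the mapping step (i) concrete, the sketch below shows an idealized crossbar vector-matrix multiply in numpy. The conductance range, read-voltage limit, linear scaling, and differential-pair scheme for signed weights are all illustrative assumptions, not details taken from the paper, and the model is non-ideality-free (CxDNN's compensation targets exactly the errors this ideal model omits).

```python
import numpy as np

# Assumed device parameters (illustrative only, not from the paper).
G_MIN, G_MAX = 1e-6, 1e-4   # synaptic conductance range, in siemens
V_MAX = 0.5                 # maximum read voltage, in volts

def weights_to_conductances(W):
    """Map signed floating-point weights onto a differential pair of
    crossbars (G_pos, G_neg); each array holds only non-negative weight
    magnitudes scaled linearly into the device conductance range."""
    scale = np.abs(W).max()
    Wn = W / scale                                      # normalize to [-1, 1]
    G_pos = G_MIN + (G_MAX - G_MIN) * np.clip(Wn, 0, None)
    G_neg = G_MIN + (G_MAX - G_MIN) * np.clip(-Wn, 0, None)
    return G_pos, G_neg, scale

def crossbar_vmm(G_pos, G_neg, x, scale):
    """Ideal crossbar VMM: activations become read voltages, column
    currents sum by Kirchhoff's current law (I = G^T V), and the
    differential column currents are rescaled to the weight domain."""
    x_scale = np.abs(x).max()
    v = V_MAX * x / x_scale                             # activations -> voltages
    i = (G_pos - G_neg).T @ v                           # differential column currents
    # Undo the weight, voltage, and conductance scalings to recover W^T x.
    return i * scale * (x_scale / V_MAX) / (G_MAX - G_MIN)

W = np.array([[0.5, -0.25], [0.1, 0.8]])
x = np.array([1.0, -2.0])
G_pos, G_neg, s = weights_to_conductances(W)
y = crossbar_vmm(G_pos, G_neg, x, s)                    # approximates W.T @ x
```

On real hardware, device variations, wire resistance, and peripheral-circuit non-linearities perturb the currents computed above, which is the error source the paper's re-training and compensation hardware address.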

