
DL-RSIM: A Reliability and Deployment Strategy Simulation Framework for ReRAM-based CNN Accelerators

Published: 28 May 2022

Abstract

Memristor-based deep learning accelerators are a promising way to improve the energy efficiency of neuromorphic computing systems. However, the electrical properties and crossbar structure of memristors make these accelerators error-prone. In addition, because of hardware constraints, the way neural network models are deployed on memristor crossbar arrays affects computation parallelism and communication overhead. Enabling reliable and energy-efficient memristor-based accelerators therefore requires a simulation platform that can precisely analyze both the impact of non-ideal circuit/device properties on inference accuracy and the influence of different deployment strategies on performance and energy consumption. In this paper, we propose a flexible simulation framework, DL-RSIM, to tackle this challenge. DL-RSIM explores a rich set of reliability impact factors and deployment strategies, and it can be incorporated with any deep neural network implemented in TensorFlow. Using several representative convolutional neural networks as case studies, we show that DL-RSIM can guide chip designers in choosing reliability-friendly design options and energy-efficient deployment strategies, and in developing optimization techniques accordingly.
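To give a concrete sense of the kind of reliability analysis the abstract describes, the following is a minimal sketch of injecting two common ReRAM non-idealities, limited conductance precision and device-to-device variation, into a single crossbar matrix-vector multiply. This is not DL-RSIM's actual API; the function names, the 16-level quantization, and the 5% variation sigma are illustrative assumptions, and NumPy stands in for a TensorFlow model to keep the sketch self-contained.

```python
import numpy as np

def map_to_conductance(weights, levels=16):
    """Quantize weights to a discrete number of conductance levels,
    a simple model of the limited precision of a ReRAM cell
    (level count is an illustrative assumption)."""
    w_min, w_max = weights.min(), weights.max()
    step = (w_max - w_min) / (levels - 1)
    return np.round((weights - w_min) / step) * step + w_min

def inject_variation(conductances, sigma=0.05, rng=None):
    """Perturb each programmed conductance with multiplicative Gaussian
    noise, a common first-order model of device variation."""
    if rng is None:
        rng = np.random.default_rng(0)
    return conductances * rng.normal(1.0, sigma, conductances.shape)

# Compare an ideal matrix-vector multiply against one computed on a
# hypothetical 128x128 crossbar tile with both non-idealities applied.
rng = np.random.default_rng(0)
W = rng.standard_normal((128, 128))   # one crossbar tile of weights
x = rng.standard_normal(128)          # normalized input voltages

y_ideal = W @ x
y_noisy = inject_variation(map_to_conductance(W), sigma=0.05, rng=rng) @ x

rel_err = np.linalg.norm(y_noisy - y_ideal) / np.linalg.norm(y_ideal)
print(f"relative MVM error: {rel_err:.3f}")
```

Sweeping parameters such as the level count and sigma over a full network, rather than one tile, is what lets a framework like DL-RSIM relate device-level non-idealities to end-to-end inference accuracy.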


Published in

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 3
May 2022, 365 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3530307
Editor: Tulika Mitra

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

Publication History

• Published: 28 May 2022
• Online AM: 31 January 2022
• Accepted: 1 December 2021
• Revised: 1 October 2021
• Received: 1 February 2021

        Qualifiers

        • research-article
        • Refereed