Abstract
Spiking Neural Networks (SNNs) are an emerging computation model that uses event-driven activation and bio-inspired learning algorithms. SNN-based machine learning programs are typically executed on tile-based neuromorphic hardware platforms, where each tile contains a computation unit called a crossbar, onto which the neurons and synapses of the program are mapped. However, synthesizing such programs on off-the-shelf neuromorphic hardware is challenging because of the inherent resource and latency limitations of the hardware, which affect both model performance (e.g., accuracy) and hardware performance (e.g., throughput). We propose DFSynthesizer, an end-to-end framework for synthesizing SNN-based machine learning programs to neuromorphic hardware. The proposed framework works in four steps. First, it analyzes a machine learning program and generates an SNN workload using representative data. Second, it partitions the SNN workload and generates clusters that fit on the crossbars of the target neuromorphic hardware. Third, it exploits the rich semantics of the Synchronous Dataflow Graph (SDFG) to represent a clustered SNN program, allowing for performance analysis in terms of key hardware constraints such as the number of crossbars, the dimension of each crossbar, the buffer space on tiles, and the tile communication bandwidth. Finally, it uses a novel scheduling algorithm to execute clusters on the crossbars of the hardware, guaranteeing hardware performance. We evaluate DFSynthesizer with 10 commonly used machine learning programs. Our results demonstrate that DFSynthesizer provides a much tighter performance guarantee than current mapping approaches.
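To make the second step concrete, the following is a minimal sketch of packing neurons into clusters that fit a crossbar. It is not DFSynthesizer's partitioning algorithm: the greedy strategy, the function name `greedy_cluster`, the `fan_in` dictionary, and the assumption that a crossbar of dimension N offers at most N input lines and N output neurons are all illustrative choices made here.

```python
from typing import Dict, List, Set


def greedy_cluster(fan_in: Dict[int, Set[int]], crossbar_dim: int) -> List[List[int]]:
    """Greedily pack neurons into clusters that each fit one crossbar.

    fan_in maps a neuron id to the set of presynaptic neuron ids feeding it.
    Assumption (illustrative): a crossbar of dimension N accepts at most N
    distinct input lines and hosts at most N neurons.
    """
    clusters: List[List[int]] = []
    current: List[int] = []          # neurons in the cluster being built
    current_inputs: Set[int] = set() # distinct input lines used so far

    for neuron in sorted(fan_in):
        sources = fan_in[neuron]
        if len(sources) > crossbar_dim:
            # A single neuron with too many inputs cannot fit without
            # further decomposition, which this sketch does not attempt.
            raise ValueError(f"neuron {neuron} exceeds crossbar fan-in")

        merged = current_inputs | sources
        too_many_neurons = len(current) + 1 > crossbar_dim
        too_many_inputs = len(merged) > crossbar_dim
        if current and (too_many_neurons or too_many_inputs):
            # Close the current cluster and start a new one with this neuron.
            clusters.append(current)
            current = []
            merged = set(sources)

        current.append(neuron)
        current_inputs = merged

    if current:
        clusters.append(current)
    return clusters


# Hypothetical two-layer network: neurons 3-5 read inputs 0-2, neurons 6-7 read 3-5.
example_fan_in = {3: {0, 1}, 4: {1, 2}, 5: {0, 2}, 6: {3, 4}, 7: {4, 5}}
example_clusters = greedy_cluster(example_fan_in, crossbar_dim=3)
```

With a crossbar dimension of 3, neurons 3 to 5 share the three input lines 0 to 2 and land in one cluster, while neurons 6 and 7 need different inputs and start a second cluster. A real partitioner would also weigh the spike traffic between clusters, which this sketch ignores.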
DFSynthesizer: Dataflow-based Synthesis of Spiking Neural Networks to Neuromorphic Hardware