ABSTRACT
A recent hot research topic in deep learning is the reduction of a neural network's model size through pruning, in order to minimize its training and inference cost and thus make it capable of running on devices with memory constraints. In this paper, we employ a pruning technique that sparsifies a Multi-Layer Perceptron (MLP) during training, in which the number of topology connections pruned and restored is not fixed but follows one of the following rules: the Linear Decreasing Variation (LDV) rule, the Oscillating Variation (OSV) rule, or the Exponential Decay (EXD) rule. We conducted experiments on three MLP network topologies, implemented with Keras, using the Fashion-MNIST dataset. The results show that the EXD method is the clear winner: with it, our proposed sparse network converges faster than its dense counterpart while achieving approximately the same high accuracy (around 90%). Furthermore, we show that the memory footprint of the aforementioned sparse techniques is at least 95% smaller than that of the dense version of the network, due to the removed weights. Finally, we present an improved version of the SET implementation in Keras, based on the Callbacks API, which makes the SET implementation more efficient.
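To make the pruning scheme concrete, the sketch below illustrates how a varying prune/restore fraction could be driven by the three named rules through a Keras Callback. It is a minimal sketch, not the paper's actual implementation: the schedule formulas, their hyperparameters (start, end, base, amplitude, period, decay), and the magnitude-based prune-and-regrow step are assumptions, since the abstract only names the LDV, OSV, and EXD rules and the use of the Callbacks API.

```python
# Minimal sketch, assuming magnitude-based pruning on Dense kernels; the exact
# LDV/OSV/EXD formulas and hyperparameters are not given in the abstract, so
# the values below (start, end, base, amplitude, period, decay) are placeholders.
import numpy as np
import tensorflow as tf


def ldv_fraction(epoch, total_epochs=30, start=0.30, end=0.05):
    """Linear Decreasing Variation (LDV): fraction decreases linearly per epoch."""
    return start + (end - start) * epoch / max(total_epochs - 1, 1)


def osv_fraction(epoch, base=0.20, amplitude=0.10, period=10):
    """Oscillating Variation (OSV): fraction oscillates around a base value."""
    return base + amplitude * np.sin(2.0 * np.pi * epoch / period)


def exd_fraction(epoch, start=0.30, decay=0.90):
    """Exponential Decay (EXD): fraction decays exponentially with the epoch."""
    return start * decay ** epoch


class DynamicPruningCallback(tf.keras.callbacks.Callback):
    """At the end of every epoch, prune the smallest-magnitude weights of each
    Dense layer and regrow the same number of connections at random positions
    (a SET-style prune/restore step driven by the chosen schedule)."""

    def __init__(self, schedule):
        super().__init__()
        self.schedule = schedule  # callable: epoch -> fraction in [0, 1)

    def on_epoch_end(self, epoch, logs=None):
        fraction = float(np.clip(self.schedule(epoch), 0.0, 0.99))
        for layer in self.model.layers:
            if not isinstance(layer, tf.keras.layers.Dense):
                continue
            kernel, bias = layer.get_weights()
            k = int(fraction * kernel.size)
            if k == 0:
                continue
            # prune: zero out the k smallest-magnitude weights
            threshold = np.partition(np.abs(kernel).ravel(), k)[k]
            kernel[np.abs(kernel) < threshold] = 0.0
            # restore: re-activate k currently-zero connections at random
            # positions with small random initial values
            zero_idx = np.flatnonzero(kernel == 0.0)
            regrow = np.random.choice(zero_idx, size=min(k, zero_idx.size),
                                      replace=False)
            kernel.flat[regrow] = np.random.normal(0.0, 0.01, size=regrow.size)
            layer.set_weights([kernel, bias])


# Example usage (hypothetical model and data): train an MLP on Fashion-MNIST
# with the EXD schedule.
# model.fit(x_train, y_train, epochs=30,
#           callbacks=[DynamicPruningCallback(exd_fraction)])
```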