CAP’NN: A Class-aware Framework for Personalized Neural Network Inference

Abstract
We propose CAP’NN, a framework for Class-aware Personalized Neural Network Inference, which prunes an already-trained neural network model according to the preferences of individual users. Specifically, by adapting to the subset of output classes that each user is expected to encounter, CAP’NN can prune not only ineffectual neurons but also miseffectual neurons that confuse classification, without retraining the network. CAP’NN also exploits the similarity among pruning requests from different users to minimize the timing overhead of pruning. To this end, we propose a clustering algorithm that groups similar classes based on the per-class firing rates of the network’s neurons, and we implement a lightweight cache architecture that stores and reuses information from previously pruned networks. In our experiments with the VGG-16, AlexNet, and ResNet-152 networks, CAP’NN reduces model size by up to 47% on average while actually improving top-1 (top-5) classification accuracy by up to 3.9% (3.4%) when the user encounters only a subset of the trained classes.
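To illustrate the core idea, the sketch below shows one plausible way to score and prune neurons from per-class firing rates, assuming nothing beyond the abstract: neurons that rarely fire for the user's class subset (ineffectual) or that fire mainly for out-of-subset classes (miseffectual) score low and are pruned. All function names, the scoring formula, and the `keep_ratio` parameter are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def class_firing_rates(activations, labels, num_classes):
    """Per-class mean firing rate of each neuron.

    activations: (num_samples, num_neurons) post-activation values
    labels:      (num_samples,) class index per sample
    Returns a (num_classes, num_neurons) matrix of mean activations,
    which could also drive class clustering (e.g., k-means on the rows).
    """
    rates = np.zeros((num_classes, activations.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            rates[c] = activations[mask].mean(axis=0)
    return rates

def prune_for_user(rates, user_classes, keep_ratio=0.5):
    """Keep the neurons most useful for the user's class subset.

    Score = (mean firing rate over the user's classes) minus
    (mean firing rate over all other classes), so both ineffectual
    and miseffectual neurons score low. Returns a boolean keep-mask.
    """
    subset = rates[user_classes].mean(axis=0)        # relevance to user's classes
    others = np.delete(rates, user_classes, axis=0)  # remaining classes
    confusion = others.mean(axis=0) if len(others) else np.zeros_like(subset)
    score = subset - confusion                       # high = keep, low = prune
    k = max(1, int(keep_ratio * len(score)))
    keep = np.zeros(len(score), dtype=bool)
    keep[np.argsort(score)[-k:]] = True
    return keep
```

In this toy setting, a neuron that fires only for classes outside the user's subset receives a negative score and is pruned first, matching the intuition that removing it can improve accuracy on the remaining classes.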