Abstract
ACE-GCN is a fast, resource- and energy-efficient FPGA accelerator for graph convolutional embedding under data-driven and in-place processing conditions. Our accelerator exploits the inherent power-law distribution and high sparsity commonly exhibited by real-world graph datasets. Unlike other hardware implementations of GCNs, which employ traditional optimization techniques to bypass the problem of dataset sparsity, our architecture is designed to take advantage of it. We propose and implement an innovative acceleration approach built on our “implicit-processing-by-association” concept, in conjunction with a dataset-customized convolutional operator. The computational relief, and the resulting acceleration, comes from replacing relatively complex convolutional operations with faster embedding estimations. Based on a computationally inexpensive and extremely fast similarity calculation, our accelerator decides between automatic embedding estimation and the unavoidable direct convolution operation. Evaluations demonstrate that our approach offers excellent applicability and competitive acceleration. Depending on the dataset and target efficiency level, it achieves speedups between 23× and 4,930× over the PyG baseline, reaches 46% to 81% of AWB-GCN’s performance on smaller datasets, and noticeably surpasses AWB-GCN on larger datasets, all with controllable accuracy loss. We further demonstrate the unique hardware-optimization characteristics of our approach and discuss its multi-processing potential.
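The estimate-or-convolve decision described above can be illustrated in software. The following is a minimal, hypothetical Python/NumPy sketch, not the paper’s exact operator: the similarity measure (Jaccard over neighborhoods), the threshold `tau`, and the sequential processing order are our illustrative assumptions. For each node, a cheap structural-similarity check against already-embedded nodes decides whether to reuse (estimate) an existing embedding or to perform the full convolution.

```python
import numpy as np

def ace_gcn_embed(A, X, W, tau=0.9):
    """Similarity-gated one-layer GCN embedding (illustrative sketch).

    A: (n, n) binary adjacency matrix, X: (n, f) node features,
    W: (f, d) layer weights, tau: similarity threshold (assumed).
    """
    n = A.shape[0]
    H = np.zeros((n, W.shape[1]))
    done = []  # nodes whose embeddings were computed by direct convolution

    # Standard GCN propagation matrix: D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(n)
    d = A_hat.sum(axis=1)
    A_norm = A_hat / np.sqrt(np.outer(d, d))

    for v in range(n):
        # Cheap structural similarity: Jaccard index of neighbor sets
        # against nodes already embedded directly (assumed measure).
        nv = set(np.nonzero(A[v])[0])
        best, best_sim = None, 0.0
        for u in done:
            nu = set(np.nonzero(A[u])[0])
            sim = len(nv & nu) / max(len(nv | nu), 1)
            if sim > best_sim:
                best, best_sim = u, sim
        if best is not None and best_sim >= tau:
            # "Implicit processing by association": reuse the embedding
            # of a structurally similar node instead of convolving.
            H[v] = H[best]
        else:
            # Unavoidable direct convolution for this node.
            H[v] = A_norm[v] @ X @ W
            done.append(v)
    return H
```

Nodes with near-identical neighborhoods share an embedding at the cost of a small, controllable approximation error, which mirrors the accuracy/speed trade-off the abstract describes.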
References
- Yunsheng Bai, Hao Ding, Song Bian, Ting Chen, Yizhou Sun, and W. Wang. 2019. SimGNN: A neural network approach to fast graph similarity computation. In Proceedings of the 12th ACM International Conference on Web Search and Data Mining.
- Hadi Banaee and Amy Loutfi. 2015. Data-driven rule mining and representation of temporal patterns in physiological sensor data. IEEE Journal of Biomedical and Health Informatics 19 (2015), 1557–1566.
- Diana Cai, Trevor Campbell, and T. Broderick. 2016. Edge-exchangeable graphs and sparsity. In NIPS 2016.
- L. Chen, J. Hoey, C. Nugent, D. Cook, and Z. Yu. 2012. Sensor-based activity recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 42 (2012), 790–808.
- Gianmarco Dinelli, Gabriele Meoni, Emilio Rapuano, Gionata Benelli, and L. Fanucci. 2019. An FPGA-based hardware accelerator for CNNs using on-chip memories only: Design and benchmarking with Intel Movidius neural compute stick. International Journal of Reconfigurable Computing 2019 (2019), 7218758:1–7218758:13.
- H. Gao, Zhengyang Wang, and S. Ji. 2018. Large-scale learnable graph convolutional networks. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
- Tong Geng, Ang Li, Runbin Shi, Chunshu Wu, T. Wang, Yanfei Li, Pouya Haghi, Antonino Tumeo, Shuai Che, Steve Reinhardt, and M. Herbordt. 2020. AWB-GCN: A graph convolutional network accelerator with runtime workload rebalancing. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’20), 922–936.
- Tong Geng, A. Li, T. Wang, Chunshu Wu, Yanfei Li, Antonino Tumeo, and M. Herbordt. 2019. UWB-GCN: Hardware acceleration of graph-convolution-network through runtime workload rebalancing. ArXiv abs/1908.10834 (2019).
- Xu Geng, Yaguang Li, Leye Wang, Lingyu Zhang, Qiang Yang, Jieping Ye, and Yan Liu. 2019. Spatiotemporal multi-graph convolution network for ride-hailing demand forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 3656–3663.
- R. Gera, Lázaro Alonso, B. Crawford, J. House, J. A. Méndez-Bermúdez, T. Knuth, and R. Miller. 2018. Identifying network structure similarity using spectral graph theory. Applied Network Science 3 (2018).
- Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In AISTATS.
- Lei He. 2019. EnGN: A high-throughput and energy-efficient accelerator for large graph neural networks. ArXiv abs/1909.00155 (2019).
- D. Hill and B. Minsker. 2010. Anomaly detection in streaming environmental sensor data: A data-driven modeling approach. Environmental Modelling and Software 25 (2010), 1014–1022.
- Nachiket Kapre. 2015. Custom FPGA-based soft-processors for sparse graph acceleration. In 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP’15), 9–16.
- Anees Kazi, Shayan Shekarforoush, S. Krishna, Hendrik Burwinkel, G. Vivar, K. Kortuem, Seyed-Ahmad Ahmadi, Shadi Albarqouni, and N. Navab. 2019. InceptionGCN: Receptive field aware graph convolutional network for disease prediction. In IPMI.
- Thomas Kipf and M. Welling. 2017. Semi-supervised classification with graph convolutional networks. ArXiv abs/1609.02907 (2017).
- Danai Koutra, U. Kang, Jilles Vreeken, and C. Faloutsos. 2014. VOG: Summarizing and understanding large graphs. ArXiv abs/1406.3411 (2014).
- G. Li, M. Müller, Ali K. Thabet, and Bernard Ghanem. 2019. DeepGCNs: Can GCNs go as deep as CNNs? In 2019 IEEE/CVF International Conference on Computer Vision (ICCV’19), 9266–9275.
- Qimai Li, Zhichao Han, and Xiao-Ming Wu. 2018. Deeper insights into graph convolutional networks for semi-supervised learning. ArXiv abs/1801.07606 (2018).
- S. Li, Junwei Huang, Z. Zhang, Jianhang Liu, Tingpei Huang, and Haihua Chen. 2018. Similarity-based future common neighbors model for link prediction in complex networks. Scientific Reports 8 (2018).
- L. Lu, Y. Liang, Qingcheng Xiao, and Shengen Yan. 2017. Evaluating fast algorithms for convolutional neural networks on FPGAs. In 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM’17), 101–108.
- Milena Mihail and Christos H. Papadimitriou. 2002. On the eigenvalue power law. In Proceedings of the 6th International Workshop on Randomization and Approximation Techniques (RANDOM’02). Springer-Verlag, Berlin, 254–262.
- Anurag Mukkara, Nathan Beckmann, Maleen Abeydeera, Xiaosong Ma, and D. Sánchez. 2018. Exploiting locality in graph analytics through hardware-accelerated traversal scheduling. In 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’18), 1–14.
- J. Nesetril and P. D. Mendez. 2008. Structural properties of sparse graphs. Electronic Notes in Discrete Mathematics 31 (2008), 247–251.
- M. Newman. 2005. Power laws, Pareto distributions and Zipf’s law. Contemporary Physics 46 (2005), 323–351.
- Eriko Nurvitadhi, Ganesh Venkatesh, Jaewoong Sim, Debbie Marr, Randy Huang, Jason Hock, Yeong Tat Liew, Krishnan Srivatsan, Duncan J. M. Moss, Suchit Subhaschandra, and Guy Boudoukh. 2017. Can FPGAs beat GPUs in accelerating next-generation deep neural networks? In FPGA’17.
- H. Reittu, Lasse Leskelä, T. Räty, and M. Fiorucci. 2018. Analysis of large sparse graphs using regular decomposition of graph distance matrices. In 2018 IEEE International Conference on Big Data (Big Data’18), 3784–3792.
- Athanasios I. Salamanis, Dionisis D. Kehagias, C. K. Filelis-Papadopoulos, D. Tzovaras, and G. Gravvanis. 2016. Managing spatial graph dependencies in large volumes of traffic data for travel-time prediction. IEEE Transactions on Intelligent Transportation Systems 17 (2016), 1678–1687.
- N. Shervashidze, P. Schweitzer, E. V. Leeuwen, K. Mehlhorn, and K. Borgwardt. 2011. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research 12 (2011), 2539–2561.
- L. Shi, Yifan Zhang, Jian Cheng, and H. Lu. 2019. Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’19), 12018–12027.
- Adam Silberstein, Gregory Filpus, K. Munagala, and Jun Yang. 2007. Data-driven processing in sensor networks. In CIDR.
- Petar Velickovic, Guillem Cucurull, A. Casanova, A. Romero, P. Liò, and Yoshua Bengio. 2018. Graph attention networks. ArXiv abs/1710.10903 (2018).
- T. Wang, Tong Geng, Ang Li, Xi Jin, and M. Herbordt. 2020. FPDeep: Scalable acceleration of CNN training on deeply-pipelined FPGA clusters. IEEE Transactions on Computers 69 (2020), 1143–1158.
- Long Wen, X. Li, Liang Gao, and Y. Zhang. 2018. A new convolutional neural network-based data-driven fault diagnosis method. IEEE Transactions on Industrial Electronics 65 (2018), 5990–5998.
- Yuxuan Xie, B. Liu, Lei Feng, Xi-Peng Li, and Danyin Zou. 2020. An FPGA-oriented quantization scheme for MobileNet-SSD. (2020).
- Keyulu Xu, Weihua Hu, J. Leskovec, and S. Jegelka. 2019. How powerful are graph neural networks? ArXiv abs/1810.00826 (2019).
- Mingyu Yan, L. Deng, X. Hu, Ling Liang, Yujing Feng, Xiaochun Ye, Z. Zhang, D. Fan, and Yuan Xie. 2020. HyGCN: A GCN accelerator with hybrid architecture. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA’20), 15–29.
- Rex Ying, Zhaoyu Lou, Jiaxuan You, Chengtao Wen, A. Canedo, and J. Leskovec. 2020. Neural subgraph matching. ArXiv abs/2007.03092 (2020).
- Ting Yu, Haoteng Yin, and Zhanxing Zhu. 2018. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In IJCAI.
- Jialiang Zhang, Soroosh Khoram, and J. Li. 2017. Boosting the performance of FPGA-based graph processor using hybrid memory cube: A case for breadth first search. In FPGA’17.
- Muhan Zhang and Yixin Chen. 2018. Link prediction based on graph neural networks. ArXiv abs/1802.09691 (2018).
- Qikui Zhu, B. Du, and P. Yan. 2019. Multi-hop convolutions on weighted graphs. ArXiv abs/1911.04978 (2019).
ACE-GCN: A Fast Data-driven FPGA Accelerator for GCN Embedding