Abstract
The Finite Element Method (FEM) is a common numerical technique for solving Partial Differential Equations on large, unstructured domain geometries. Numerical methods for FEM typically use algorithms and data structures that exhibit an unstructured memory access pattern. This makes it particularly challenging to accelerate FEM on Field-Programmable Gate Arrays (FPGAs) using an efficient, deeply pipelined architecture. In this work, we focus on implementing and optimising the vector assembly operation, which is the source of the unstructured memory access in FEM. We propose a dataflow architecture, a graph-based theoretical model, and a design flow for optimising the assembly operation for the spectral/hp finite element method on reconfigurable accelerators. We evaluate the proposed approach on two benchmark meshes and show that the graph-theoretic method of generating a static data access schedule results in a significant improvement in resource utilisation compared to prior work. This makes it possible to support larger FEM meshes on FPGAs than previously possible.
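To make the assembly operation concrete, the following minimal Python sketch shows a scatter-add of element-local values into a global vector, followed by a greedy colouring of elements. The colouring is only one common graph-based way to obtain a conflict-free static schedule and is an assumption here; the abstract does not say which graph-theoretic method the authors use. The names `assemble`, `colour_elements`, `local_to_global` and the toy mesh are hypothetical.

```python
from collections import defaultdict


def assemble(local_vectors, local_to_global, n_global):
    """Scatter-add element-local values into a global vector."""
    global_vector = [0.0] * n_global
    for values, dof_map in zip(local_vectors, local_to_global):
        for value, g in zip(values, dof_map):
            # Indirect, data-dependent write: the unstructured access.
            global_vector[g] += value
    return global_vector


def colour_elements(local_to_global):
    """Greedy colouring (illustrative assumption, not the paper's method).

    Elements that share a global index receive different colours, so all
    elements of one colour can be accumulated without write conflicts.
    """
    # Map each global index to the elements that touch it.
    elements_at = defaultdict(list)
    for e, dof_map in enumerate(local_to_global):
        for g in set(dof_map):
            elements_at[g].append(e)

    colours = []
    for e, dof_map in enumerate(local_to_global):
        # Colours already used by earlier elements sharing an index with e.
        taken = {colours[n] for g in dof_map for n in elements_at[g] if n < e}
        c = 0
        while c in taken:
            c += 1
        colours.append(c)
    return colours


# Toy 1D-like mesh: three elements, neighbours share global indices.
local_to_global = [[0, 1, 2], [1, 2, 3], [3, 4, 5]]
local_vectors = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]

print(assemble(local_vectors, local_to_global, n_global=6))
print(colour_elements(local_to_global))
```

In this toy mesh, elements 0 and 2 share no global index and so receive the same colour, while element 1 overlaps both and needs a second colour. Because the mesh connectivity is known ahead of time, such a schedule can be computed once offline, which is what makes a static data access schedule feasible.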
Index Terms
Efficient Assembly for High-Order Unstructured FEM Meshes (FPL 2015)