Efficient Assembly for High-Order Unstructured FEM Meshes (FPL 2015)

Published: 06 April 2017

Abstract

The Finite Element Method (FEM) is a common numerical technique used for solving Partial Differential Equations on large and unstructured domain geometries. Numerical methods for FEM typically use algorithms and data structures which exhibit an unstructured memory access pattern. This makes acceleration of FEM on Field-Programmable Gate Arrays using an efficient, deeply pipelined architecture particularly challenging. In this work, we focus on implementing and optimising a vector assembly operation which, in the context of FEM, induces the unstructured memory access. We propose a dataflow architecture, graph-based theoretical model, and design flow for optimising the assembly operation for spectral/hp finite element method on reconfigurable accelerators. We evaluate the proposed approach on two benchmark meshes and show that the graph-theoretic method of generating a static data access schedule results in a significant improvement in resource utilisation compared to prior work. This enables supporting larger FEM meshes on FPGA than previously possible.
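
To illustrate why vector assembly induces unstructured memory access, here is a minimal software sketch of the operation: element-local values are scatter-added into a global vector through a local-to-global index map. The function name `assemble`, the variable names, and the tiny two-element mesh are hypothetical and chosen only for illustration; this is not the dataflow architecture or the scheduling method proposed in the paper.

```python
# Illustrative sketch only: all names and the tiny mesh below are
# hypothetical and are not taken from the paper.

def assemble(local_values, local_to_global, n_global):
    """Scatter-add element-local values into a global vector."""
    global_vec = [0.0] * n_global
    for elem_vals, elem_map in zip(local_values, local_to_global):
        for value, g in zip(elem_vals, elem_map):
            # The target index g comes from the mesh connectivity, so
            # consecutive writes can land at irregular addresses, and
            # degrees of freedom shared between elements are written
            # more than once.
            global_vec[g] += value
    return global_vec


# Two 1D linear elements that share global node 1.
local_to_global = [[0, 1], [1, 2]]
local_values = [[1.0, 2.0], [3.0, 4.0]]

result = assemble(local_values, local_to_global, 3)
print(result)
```

In this sketch the shared node accumulates contributions from both elements. On a deeply pipelined accelerator, such repeated read-modify-write accesses to the same address are what a static data access schedule would need to order safely.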

Published in

ACM Transactions on Reconfigurable Technology and Systems, Volume 10, Issue 2
Special Section on Field Programmable Logic and Applications 2015 and Regular Papers
June 2017
133 pages
ISSN: 1936-7406
EISSN: 1936-7414
DOI: 10.1145/3068424
Editor: Steve Wilton

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 6 April 2017
• Accepted: 1 November 2016
• Revised: 1 August 2016
• Received: 1 April 2016

Qualifiers

• research-article
• Research
• Refereed