DOI: 10.1145/2063384.2063396

Liszt: a domain specific language for building portable mesh-based PDE solvers

Published: 12 November 2011

ABSTRACT

Heterogeneous computers with processors and accelerators are becoming widespread in scientific computing. However, it is difficult to program hybrid architectures and there is no commonly accepted programming model. Ideally, applications should be written in a way that is portable to many platforms, but providing this portability for general programs is a hard problem.

By restricting the class of programs considered, we can make this portability feasible. We present Liszt, a domain-specific language for constructing mesh-based PDE solvers. We introduce language statements for interacting with an unstructured mesh and storing data at its elements. Program analysis of these statements enables our compiler to expose the parallelism, locality, and synchronization of Liszt programs. Using this analysis, we generate applications for multiple platforms: a cluster, an SMP, and a GPU. This approach allows Liszt applications to perform within 12% of hand-written C++, scale to large clusters, and experience order-of-magnitude speedups on GPUs.
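To make the programming model concrete, below is a minimal sketch in plain Scala (the host language of the Liszt DSL) of the style of computation the abstract describes: data stored in fields indexed by mesh elements, and kernels written as loops over topological sets such as the faces of a cell. The types and names here (Mesh, Cell, Face, Field, accumulateFlux) are illustrative stand-ins, not Liszt's actual API; the real DSL expresses the same structure with built-in mesh statements rather than explicit data structures.

// Illustrative stand-in for the programming model described in the abstract:
// fields indexed by mesh elements, and kernels written as loops over
// topological sets. All names and types are hypothetical, not Liszt's API.
object MeshSketch {
  final case class Cell(id: Int)
  final case class Face(id: Int, inside: Cell, outside: Cell, area: Double)

  final case class Mesh(cells: IndexedSeq[Cell], faces: IndexedSeq[Face]) {
    // Faces incident to each cell, precomputed here; Liszt exposes such
    // topological queries as language statements the compiler can analyze.
    private val facesOfCell: Map[Cell, Seq[Face]] =
      faces.flatMap(f => Seq(f.inside -> f, f.outside -> f))
           .groupBy(_._1)
           .map { case (c, fs) => c -> fs.map(_._2) }
    def facesOf(c: Cell): Seq[Face] = facesOfCell.getOrElse(c, Nil)
  }

  // A field associates one value with every element of a given kind.
  final class Field[E, T](default: T) {
    private val data = scala.collection.mutable.Map.empty[E, T]
    def apply(e: E): T = data.getOrElse(e, default)
    def update(e: E, v: T): Unit = data(e) = v
  }

  // A flux-accumulation kernel in the style the abstract describes:
  // loop over cells, gather from neighboring faces, write per-cell data.
  def accumulateFlux(mesh: Mesh,
                     phi: Field[Cell, Double],
                     residual: Field[Cell, Double]): Unit = {
    for (c <- mesh.cells) {
      var acc = 0.0
      for (f <- mesh.facesOf(c)) {
        val nbr = if (f.inside == c) f.outside else f.inside
        acc += f.area * (phi(nbr) - phi(c)) // contribution of one face
      }
      residual(c) = acc
    }
  }

  def main(args: Array[String]): Unit = {
    // Two cells sharing a single face of unit area (boundary faces omitted).
    val (c0, c1) = (Cell(0), Cell(1))
    val mesh = Mesh(Vector(c0, c1), Vector(Face(0, c0, c1, area = 1.0)))
    val phi = new Field[Cell, Double](0.0)
    phi(c0) = 1.0
    phi(c1) = 3.0
    val res = new Field[Cell, Double](0.0)
    accumulateFlux(mesh, phi, res)
    // Prints residual(c0) = 2.0, residual(c1) = -2.0
    println(s"residual(c0) = ${res(c0)}, residual(c1) = ${res(c1)}")
  }
}

In Liszt proper, it is these mesh and field statements, not explicit data structures, that the compiler analyzes to derive each kernel's stencil and to partition and schedule the work for a cluster, an SMP, or a GPU, as the abstract states.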



• Published in
            SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
            November 2011
            866 pages
ISBN: 9781450307710
DOI: 10.1145/2063384

            Copyright © 2011 ACM


            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Qualifiers

            • research-article

            Acceptance Rates

SC '11 Paper Acceptance Rate: 74 of 352 submissions, 21%. Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%.
