Abstract
Designing programming environments for physical simulation is challenging because simulations rely on diverse algorithms and geometric domains. These challenges are compounded when we try to run efficiently on heterogeneous parallel architectures. We present Ebb, a domain-specific language (DSL) for simulation that runs efficiently on both CPUs and GPUs. Unlike previous DSLs, Ebb uses a three-layer architecture to separate (1) simulation code, (2) definitions of data structures for geometric domains, and (3) runtimes supporting parallel architectures. Different geometric domains are implemented as libraries that share a common, unified, relational data model. With the framework structured in this way, programmers implementing simulations can focus on the physics and algorithms of each simulation without worrying about how they are implemented on parallel computers. Because the geometric domain libraries are all implemented on a common runtime based on relations, new geometric domains can be added as needed without specifying the details of memory management or of mapping to different parallel architectures, and without expanding the runtime’s interface.
We evaluate Ebb by comparing it to several widely used simulations. Ebb demonstrates performance comparable to handwritten GPU code where such code is available, and it outperforms existing optimized CPU implementations by up to 9× when no GPU code exists.
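The three-layer separation described above can be illustrated with a minimal Python sketch. It is not Ebb's actual API: the `Relation` class, `make_chain`, `diffuse_step`, and every field name are hypothetical, and the serial loop in `foreach` merely stands in for whatever CPU or GPU backend a real runtime would provide. The sketch assumes only what the abstract states, namely that geometric domains are libraries built on relations and that simulation code is written against those libraries.

```python
"""Illustrative sketch only: all names are hypothetical, not Ebb's API."""
from dataclasses import dataclass, field
from typing import Dict


# Layer 3 (runtime): a relation is a table of rows with named field columns.
@dataclass
class Relation:
    name: str
    size: int
    fields: Dict[str, list] = field(default_factory=dict)

    def new_field(self, field_name, values):
        values = list(values)
        if len(values) != self.size:
            raise ValueError("field length must match relation size")
        self.fields[field_name] = values

    def foreach(self, kernel):
        # Assumption: a real runtime would decide here how to run the kernel
        # in parallel; this sketch simply loops over the rows serially.
        for row in range(self.size):
            kernel(self, row)


# Layer 2 (domain library): a 1D chain of vertices joined by edges,
# expressed purely as relations whose fields hold keys into other relations.
def make_chain(n_vertices):
    vertices = Relation("vertices", n_vertices)
    edges = Relation("edges", n_vertices - 1)
    edges.new_field("tail", range(n_vertices - 1))
    edges.new_field("head", range(1, n_vertices))
    return vertices, edges


# Layer 1 (simulation code): one explicit heat-diffusion step, written
# against the domain library without any reference to the hardware.
def diffuse_step(vertices, edges, k=0.25):
    temp = vertices.fields["temperature"]
    flux = [0.0] * vertices.size

    def accumulate(rel, e):
        t, h = rel.fields["tail"][e], rel.fields["head"][e]
        d = k * (temp[h] - temp[t])
        flux[t] += d
        flux[h] -= d

    def apply(rel, v):
        temp[v] += flux[v]

    edges.foreach(accumulate)
    vertices.foreach(apply)


vertices, edges = make_chain(3)
vertices.new_field("temperature", [0.0, 4.0, 0.0])
diffuse_step(vertices, edges)
print(vertices.fields["temperature"])
```

In this toy version, adding a new geometric domain means writing another function like `make_chain` that returns relations; the `Relation` class itself does not change, which mirrors the abstract's claim that new domains need not expand the runtime's interface.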
Ebb: A DSL for Physical Simulation on CPUs and GPUs