research-article
Open Access

QuanTaichi: a compiler for quantized simulations

Published: 19 July 2021

Abstract

High-resolution simulations can deliver great visual quality, but they are often limited by available memory, especially on GPUs. We present a compiler for physical simulation that can achieve both high performance and significantly reduced memory costs, by enabling flexible and aggressive quantization. Low-precision ("quantized") numerical data types are used and packed to represent simulation states, leading to reduced memory space and bandwidth consumption. Quantized simulation allows higher resolution simulation with less memory, which is especially attractive on GPUs. Implementing a quantized simulator that has high performance and packs the data tightly for aggressive storage reduction would be extremely labor-intensive and error-prone using a traditional programming language. To make the creation of quantized simulation practical, we have developed a new set of language abstractions and a compilation system. A suite of tailored domain-specific optimizations ensures that quantized simulators often run as fast as the full-precision simulators, despite the overhead of encoding and decoding the packed quantized data types. Our programming language and compiler, based on Taichi, allow developers to effortlessly switch between different full-precision and quantized simulators, to explore the full design space of quantization schemes, and ultimately to achieve a good balance between space and precision. The creation of quantized simulation with our system has large benefits in terms of memory consumption and performance, on a variety of hardware, from mobile devices to workstations with high-end GPUs. We can simulate with levels of resolution that were previously only achievable on systems with much more memory, such as multiple GPUs. For example, on a single GPU, we can simulate a Game of Life with 20 billion cells (8× compression per pixel), an Eulerian fluid system with 421 million active voxels (1.6× compression per voxel), and a hybrid Eulerian-Lagrangian elastic object simulation with 235 million particles (1.7× compression per particle). At the same time, quantized simulations create physically plausible results. Our quantization techniques are complementary to existing acceleration approaches of physical simulation: they can be used in combination with these existing approaches, such as sparse data structures, for even higher scalability and performance.
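
To make the idea of packed quantized state concrete, the following is a minimal sketch in plain Python, not the QuanTaichi API or the authors' implementation. It assumes a hypothetical layout in which three components of a vector, each known to lie in [-1, 1], are stored as 11-, 11-, and 10-bit fixed-point numbers inside one 32-bit word instead of three 32-bit floats. The names `FIELDS`, `encode_fixed`, `decode_fixed`, `pack`, and `unpack`, as well as the bit widths and value range, are illustrative assumptions.

```python
# Hypothetical bit widths for three components sharing one 32-bit word.
FIELDS = (11, 11, 10)


def encode_fixed(x, bits, lo, hi):
    """Map x in [lo, hi] to an unsigned integer that fits in `bits` bits."""
    levels = (1 << bits) - 1
    x = min(max(x, lo), hi)  # clamp values outside the representable range
    return round((x - lo) / (hi - lo) * levels)


def decode_fixed(q, bits, lo, hi):
    """Map an unsigned `bits`-bit integer back to a float in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + q / levels * (hi - lo)


def pack(values, lo=-1.0, hi=1.0):
    """Encode each value and place it at its bit offset in a single word."""
    word = 0
    shift = 0
    for value, bits in zip(values, FIELDS):
        word |= encode_fixed(value, bits, lo, hi) << shift
        shift += bits
    return word


def unpack(word, lo=-1.0, hi=1.0):
    """Extract each bit field from the word and decode it to a float."""
    out = []
    shift = 0
    for bits in FIELDS:
        mask = (1 << bits) - 1
        out.append(decode_fixed((word >> shift) & mask, bits, lo, hi))
        shift += bits
    return out


velocity = [0.25, -0.5, 0.75]
word = pack(velocity)      # one 32-bit word instead of three 32-bit floats
restored = unpack(word)    # approximate values, within half a quantization step
```

Every read and write goes through the shift-and-mask steps in `pack` and `unpack`; this is the encoding and decoding overhead the abstract refers to, and hand-writing it for every field of a simulator is the labor-intensive part that a compiler can automate. In this sketch the storage shrinks by 3×, at the cost of a rounding error of at most half a quantization step per component.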

Supplemental Material

a182-hu.mp4
3450626.3459671.mp4

Published in

ACM Transactions on Graphics, Volume 40, Issue 4
August 2021
2170 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3450626

Copyright © 2021 Owner/Author

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States
