Abstract
In this paper, we wish to push the limit of real-time cloth and deformable body simulation to a higher level with 50K to 500K vertices, based on the development of a novel GPU-based multilevel additive Schwarz (MAS) pre-conditioner. Similar to other preconditioners under the MAS framework, our preconditioner naturally adopts multilevel and domain decomposition concepts. But contrary to previous works, we advocate the use of small, non-overlapping domains that can well explore the parallel computing power on a GPU. Based on this idea, we investigate and invent a series of algorithms for our preconditioner, including multilevel domain construction using Morton codes, low-cost matrix precomputation by one-way Gauss-Jordan elimination, and conflict-free symmetric-matrix-vector multiplication in runtime preconditioning. The experiment shows that our preconditioner is effective, fast, cheap to precompute and scalable with respect to stiffness and problem size. It is compatible with many linear and nonlinear solvers used in cloth and deformable body simulation with dynamic contacts, such as PCG, accelerated gradient descent and L-BFGS. On a GPU, our preconditioner speeds up a PCG solver by approximately a factor of four, and its CPU version outperforms a number of competitors, including ILU0 and ILUT.
Supplemental Material
- Christie Alappat, Achim Basermann, Alan R. Bishop, Holger Fehske, Georg Hager, Olaf Schenk, Jonas Thies, and Gerhard Wellein. 2020. A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication. ACM Trans. Parallel Comput. 7, 3, Article 19 (jun 2020), 37 pages.Google Scholar
Digital Library
- David Baraff and Andrew Witkin. 1998. Large Steps in Cloth Simulation. In Proceedings of SIGGRAPH 98 (Computer Graphics Proceedings, Annual Conference Series), Eugene Fiume (Ed.). Association for Computing Machinery, New York, NY, USA, 43--54.Google Scholar
Digital Library
- Jernej Barbič and Yili Zhao. 2011. Real-Time Large-Deformation Substructuring. ACM Trans. Graph. (SIGGRAPH) 30, 4, Article 91 (jul 2011), 8 pages.Google Scholar
Digital Library
- Nathan Bell and Michael Garland. 2008. Efficient Sparse Matrix-Vector Multiplication on CUDA. NVIDIA Technical Report NVR-2008-004. NVIDIA Corporation.Google Scholar
- Miklos Bergou, Max Wardetzky, David Harmon, Denis Zorin, and Eitan Grinspun. 2006. A Quadratic Bending Model for Inextensible Surfaces. In Proceedings of SGP (Cagliari, Sardinia, Italy). 227--230.Google Scholar
- Sofien Bouaziz, Sebastian Martin, Tiantian Liu, Ladislav Kavan, and Mark Pauly. 2014. Projective Dynamics: Fusing Constraint Projections for Fast Simulation. ACM Trans. Graph. (SIGGRAPH) 33, 4, Article 154 (July 2014), 11 pages.Google Scholar
Digital Library
- Robert Bridson, Ronald Fedkiw, and John Anderson. 2002. Robust Treatment of Collisions, Contact and Friction for Cloth Animation. ACM Trans. Graph. (SIGGRAPH) 21, 3 (July 2002), 594--603.Google Scholar
Digital Library
- Jed Brown and Peter Brune. 2013. Low-Rank Quasi-Newton Updates for Robust Jacobian Lagging in Newton Methods. International Conference on Mathematics and Computational Methods Applied to Nuclear Science and Engineering (2013).Google Scholar
- Xiao-Chuan Cai and Yousef Saad. 1996. Overlapping Domain Decomposition Algorithms for General Sparse Matrices. Numerical Linear Algebra with Applications 3, 3 (1996), 221--237.Google Scholar
Cross Ref
- Xiao-Chuan Cai and Marcus Sarkis. 1999. A Restricted Additive Schwarz Preconditioner for General Sparse Linear Systems. SIAM Journal on Scientific Computing 21, 2 (1999), 792--797.Google Scholar
Digital Library
- Ali Charara, David E. Keyes, and Hatem Ltaief. 2017. A Framework for Dense Triangular Matrix Kernels on Various Manycore Architectures. Concurrency and Computation: Practice and Experience 29, 15 (2017).Google Scholar
- Jiong Chen, Florian Schäfer, Jin Huang, and Mathieu Desbrun. 2021. Multiscale Cholesky Preconditioning for Ill-Conditioned Problems. ACM Trans. Graph. (SIGGRAPH) 40, 4, Article 81 (July 2021), 13 pages.Google Scholar
Digital Library
- Kwang-Jin Choi and Hyeong-Seok Ko. 2002. Stable but Responsive Cloth. ACM Trans. Graph. (SIGGRAPH) 21, 3 (July 2002), 604--611.Google Scholar
Digital Library
- Victorita Dolean, Pierre Jolivet, and Frédéric Nataf. 2015. An Introduction to Domain Decomposition Methods. Society for Industrial and Applied Mathematics, Philadelphia, PA.Google Scholar
- Maksymilian Dryja, Marcus V. Sarkis, and Olof B. Widlund. 1996. Multilevel Schwarz Methods for Elliptic Problems with Discontinuous Coefficients in Three Dimensions. Numer. Math. 72, 3 (01 Jan 1996), 313--348.Google Scholar
- Maksymilian Dryja and Olof B Widlund. 1989. Some Domain Decomposition Algorithms for Elliptic Problems. In Iterative methods for large linear systems. Elsevier, San Diego, CA, 273--291.Google Scholar
- Maksymilian Dryja and Olof B. Widlund. 1991. Multilevel Additive Methods for Elliptic Finite Element Problems. In Parallel Algorithms for Partial Differential Equations, Proceedings of the Sixth GAMM-Seminar. Friedr. Vieweg and Sohn, Braunschweig, Germany, 58--69.Google Scholar
- Essex Edwards and Robert Bridson. 2015. The Discretely-Discontinuous Galerkin Coarse Grid for Domain Decomposition. CoRR abs/1504.00907 (2015). arXiv:1504.00907Google Scholar
- Charbel Farhat, Michel Lesoinne, Patrick LeTallec, Kendall Pierson, and Daniel Rixen. 2001. FETI-DP: A Dual-Primal Unified FETI Method---Part I: A Faster Alternative to the Two-Level FETI Method. Internat. J. Numer. Methods Engrg. 50, 7 (2001), 1523--1544.Google Scholar
Cross Ref
- Charbel Farhat and Francois-Xavier Roux. 1991. A Method of Finite Element Tearing and Interconnecting and Its Parallel Solution Algorithm. Internat. J. Numer. Methods Engrg. 32, 6 (1991), 1205--1227.Google Scholar
Cross Ref
- Marco Fratarcangeli, Valentina Tibaldo, and Fabio Pellacini. 2016. Vivace: A Practical Gauss-Seidel Method for Stable Soft Body Dynamics. ACM Trans. Graph. (SIGGRAPH Asia) 35, 6, Article 214 (Nov. 2016), 9 pages.Google Scholar
- Andreas Frommer and Daniel B. Szyld. 1999. Weighted Max Norms, Splittings, and Overlapping Additive Schwarz Iterations. Numer. Math. 83, 2 (01 Aug 1999), 259--278.Google Scholar
- Martin J. Gander. 2008. Schwarz Methods over the Course of Time. Electronic Transactions on Numerical Analysis 31 (2008), 228--255.Google Scholar
- Theodoros Gkountouvas, Vasileios Karakasis, Kornilios Kourtis, Georgios Goumas, and Nectarios Koziris. 2013. Improving the Performance of the Symmetric Sparse Matrix-Vector Multiplication in Multicore. In Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing. 273--283.Google Scholar
Digital Library
- Gundolf Haase, Ulrich Langer, and Arnd Meyer. 1991. The Approximate Dirichlet Domain Decomposition Method. Part I: An Algebraic Approach. Computing 47, 2 (01 Jun 1991), 137--151.Google Scholar
- Liliya Kharevych, Weiwei Yang, Yiying Tong, Eva Kanso, Jerrold E. Marsden, Peter Schröder, and Mathieu Desbrun. 2006. Geometric, Variational Integrators for Computer Animation. In Proceedings of SCA (Vienna, Austria). 43--51.Google Scholar
- Theodore Kim and Doug L. James. 2011. Physics-Based Character Skinning Using Multi-Domain Subspace Deformations. In Proceedings of SCA (Vancouver, British Columbia, Canada). 63--72.Google Scholar
- Dilip Krishnan, Raanan Fattal, and Richard Szeliski. 2013. Efficient Preconditioning of Laplacian Matrices for Computer Graphics. ACM Trans. Graph. 32, 4, Article 142 (July 2013), 15 pages.Google Scholar
Digital Library
- Christian Lauterbach, Michael Garland, Shubhabrata Sengupta, David Luebke, and Dinesh Manocha. 2009. Fast BVH Construction on GPUs. Computer Graphics Forum 28, 2 (2009), 375--384.Google Scholar
Cross Ref
- Yongjoon Lee, Sung-Eui Yoon, Seungwoo Oh, Duksu Kim, and Sunghee Choi. 2010. Multi-Resolution Cloth Simulation. Comput. Graph. Forum (Pacific Graphics) 29, 7 (2010), 2225--2232.Google Scholar
Cross Ref
- Minchen Li, Zachary Ferguson, Teseo Schneider, Timothy Langlois, Denis Zorin, Daniele Panozzo, Chenfanfu Jiang, and Danny M. Kaufman. 2020. Incremental Potential Contact: Intersection-and Inversion-Free, Large-Deformation Dynamics. ACM Trans. Graph. (SIGGRAPH) 39, 4, Article 49 (jul 2020), 20 pages.Google Scholar
- Minchen Li, Ming Gao, Timothy Langlois, Chenfanfu Jiang, and Danny M. Kaufman. 2019. Decomposed Optimization Time Integrator for Large-Step Elastodynamics. ACM Trans. Graph. (SIGGRAPH) 38, 4, Article 70 (jul 2019), 10 pages.Google Scholar
Digital Library
- Ruipeng Li and Yousef Saad. 2017. Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners. SIAM J. Matrix Anal. Appl. 38, 3 (2017), 807--828.Google Scholar
Digital Library
- Tiantian Liu, Adam W. Bargteil, James F. O'Brien, and Ladislav Kavan. 2013. Fast Simulation of Mass-Spring Systems. ACM Trans. Graph. (SIGGRAPH Asia) 32, 6, Article 214 (Nov. 2013), 7 pages.Google Scholar
- Tiantian Liu, Sofien Bouaziz, and Ladislav Kavan. 2017. Quasi-Newton Methods for Real-Time Simulation of Hyperelastic Materials. ACM Trans. Graph. 36, 3, Article 23 (may 2017), 16 pages.Google Scholar
Digital Library
- Jan Mandel. 1993. Balancing Domain Decomposition. Communications in Numerical Methods in Engineering 9, 3 (1993), 233--241.Google Scholar
Cross Ref
- Sebastian Martin, Bernhard Thomaszewski, Eitan Grinspun, and Markus Gross. 2011. Example-Based Elastic Materials. ACM Trans. Graph. (SIGGRAPH) 30, 4, Article 72 (July 2011), 8 pages.Google Scholar
Digital Library
- Matthias Müller, Bruno Heidelberger, Matthias Teschner, and Markus Gross. 2005. Meshless Deformations Based on Shape Matching. ACM Trans. Graph. (SIGGRAPH) 24, 3 (July 2005), 471--478.Google Scholar
Digital Library
- Rajib Nath, Stanimire Tomov, Tingxing "Tim" Dong, and Jack Dongarra. 2011. Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (Seattle, Washington). Article 6, 10 pages.Google Scholar
Digital Library
- Maxim Naumov, Marat Arsaev, Patrice Castonguay, Jonathan Cohen, Julien Demouth, Joe Eaton, Simon Layton, Nikolay Markovskiy, Istvan Reguly, Nikolai Sakharnykh, Vijay Sellappan, and Robert Strzodka. 2015. AmgX: A Library for GPU Accelerated Algebraic Multigrid and Preconditioned Iterative Methods. SIAM Journal on Scientific Computing 37, 5 (2015), 602--626.Google Scholar
Digital Library
- Roy A. Nicolaides. 1987. Deflation of Conjugate Gradients with Applications to Boundary Value Problems. SIAM J. Numer. Anal. 24, 2 (1987), 355--365.Google Scholar
Digital Library
- NVIDIA. 2021. AmgX. https://developer.nvidia.com/amgx.Google Scholar
- Michael Ortiz and Laurent Stainier. 1999. The Variational Formulation of Viscoplastic Constitutive Updates. Computer Methods in Applied Mechanics and Engineering 171, 3 (1999), 419--444.Google Scholar
Cross Ref
- Garry Rodrigue, Kang LiShan, and Liu Yu-Hui. 1989. Convergence and Comparison Analysis of Some Numerical Schwarz Methods. Numer. Math. 56, 2 (01 Feb 1989), 123--138.Google Scholar
- Yousef Saad. 1994. ILUT: A Dual Threshold Incomplete LU Factorization. Numerical Linear Algebra with Applications 1, 4 (1994), 387--402.Google Scholar
Cross Ref
- Hermann A. Schwarz. 1870. Ueber Einen Grenz Übergang Durch Alternirendes Verfahren. Wolf J. XV. 272--286.Google Scholar
- N. Spillane and D.J. Rixen. 2013. Automatic Spectral Coarse Spaces for Robust Finite Element Tearing and Interconnecting and Balanced Domain Decomposition Algorithms. Internat. J. Numer. Methods Engrg. 95, 11 (2013), 953--990.Google Scholar
Cross Ref
- Rasmus Tamstorf, Toby Jones, and Stephen F. McCormick. 2015. Smoothed Aggregation Multigrid for Cloth Simulation. ACM Trans. Graph. (SIGGRAPH Asia) 34, 6, Article 245 (Oct. 2015), 13 pages.Google Scholar
- Min Tang, Zhongyuan Liu, Ruofeng Tong, and Dinesh Manocha. 2018. PSCC: Parallel Self-Collision Culling with Spatial Hashing on GPUs. Proc. ACM Comput. Graph. Interact. Tech. 1, 1, Article 18 (July 2018), 18 pages.Google Scholar
Digital Library
- Joseph Teran, Silvia Blemker, Victor Ng-Thow-Hing, and Ronald Fedkiw. 2003. Finite Volume Methods for the Simulation of Skeletal Muscle. In Proceedings of SCA 2003 (San Diego, California). 68--74.Google Scholar
- Joseph Teran, Eftychios Sifakis, Geoffrey Irving, and Ronald Fedkiw. 2005. Robust Quasistatic Finite Elements and Flesh Simulation. In Proceedings of SCA (Los Angeles, California). 181--190.Google Scholar
Digital Library
- Demetri Terzopoulos, John Platt, Alan Barr, and Kurt Fleischer. 1987. Elastically Deformable Models. Computer Graphics (SIGGRAPH 87) 21, 4 (Aug. 1987), 205--214.Google Scholar
- Pascal Volino, Nadia Magnenat-Thalmann, and Francois Faure. 2009. A Simple Approach to Nonlinear Tensile Stiffness for Accurate Cloth Simulation. ACM Trans. Graph. 28, 4, Article 105 (Sept. 2009), 16 pages.Google Scholar
Digital Library
- Huamin Wang. 2015. A Chebyshev Semi-Iterative Approach for Accelerating Projective and Position-Based Dynamics. ACM Trans. Graph. (SIGGRAPH Asia) 34, 6, Article 246 (Oct. 2015), 9 pages.Google Scholar
- Huamin Wang. 2021. GPU-Based Simulation of Cloth Wrinkles at Submillimeter Levels. ACM Trans. Graph. 40, 4, Article 169 (July 2021), 14 pages.Google Scholar
Digital Library
- Huamin Wang and Yin Yang. 2016. Descent Methods for Elastic Body Simulation on the GPU. ACM Trans. Graph. (SIGGRAPH Asia) 35, 6, Article 212 (Nov. 2016), 10 pages.Google Scholar
- Zhendong Wang, Longhua Wu, Marco Fratarcangeli, Min Tang, and Huamin Wang. 2018. Parallel Multigrid for Nonlinear Cloth Simulation. Comput. Graph. Forum (Pacific Graphics) 37, 7 (2018), 131--141.Google Scholar
Cross Ref
- Joerg Willems. 2013. Spectral Coarse Spaces in Robust Two-Level Schwarz Methods. In Numerical Solution of Partial Differential Equations: Theory, Algorithms, and Their Applications. Springer Proceedings in Mathematics & Statistics, Vol. 45. Springer, 303--326.Google Scholar
- Longhua Wu, Botao Wu, Yin Yang, and Huamin Wang. 2020. A Safe and Fast Repulsion Method for GPU-Based Cloth Self Collisions. ACM Trans. Graph. 40, 1, Article 5 (Dec. 2020), 18 pages.Google Scholar
Digital Library
- Xiaofeng Wu, Rajaditya Mukherjee, and Huamin Wang. 2015. A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains. ACM Trans. Graph. (SIGGRAPH Asia) 34, 6, Article 241 (Oct. 2015), 9 pages.Google Scholar
- Zangyueyang Xian, Xin Tong, and Tiantian Liu. 2019. A Scalable Galerkin Multigrid Method for Real-Time Simulation of Deformable Objects. ACM Trans. Graph. 38, 6, Article 162 (Nov. 2019), 13 pages.Google Scholar
Digital Library
- Yin Yang, Weiwei Xu, Xiaohu Guo, Kun Zhou, and Baining Guo. 2013. Boundary-Aware Multi-Domain Subspace Deformation. IEEE Transactions on Visualization and Computer Graphics 19, 10 (2013), 1633--1645.Google Scholar
Cross Ref
- Xuejun Zhang. 1992. Multilevel Schwarz Methods. Numer. Math. 63, 1 (01 Dec 1992), 521--539.Google Scholar
- Yongning Zhu, Eftychios Sifakis, Joseph Teran, and Achi Brandt. 2010. An Efficient Multigrid Method for the Simulation of High-resolution Elastic Solids. ACM Trans. Graph. 29, 2, Article 16 (April 2010), 18 pages.Google Scholar
Digital Library
Index Terms
A GPU-based multilevel additive schwarz preconditioner for cloth and deformable body simulation
Recommendations
A multilevel block incomplete Cholesky preconditioner for solving normal equations in linear least squares problems
An incomplete factorization method for preconditioning symmetric positive definite matrices is introduced to solve normal equations. The normal equations are form to solve linear least squares problems. The procedure is based on a block incomplete ...
A relaxed positive semi-definite and skew-Hermitian splitting preconditioner for non-Hermitian generalized saddle point problems
For non-Hermitian saddle point linear systems, Pan, Ng and Bai presented a positive semi-definite and skew-Hermitian splitting (PSS) preconditioner (Pan et al. Appl. Math. Comput. 172, 762---771 2006), to accelerate the convergence rate of the Krylov ...
A Block FSAI-ILU Parallel Preconditioner for Symmetric Positive Definite Linear Systems
A novel parallel preconditioner for symmetric positive definite matrices is developed coupling a generalized factored sparse approximate inverse (FSAI) with an incomplete LU (ILU) factorization. The generalized FSAI, called block FSAI, is derived by ...





Comments