Abstract
This poster is a case study on the application of a novel programming model, called Concurrent Collections (CnC), to the implementation of an asynchronous-parallel algorithm for computing the Cholesky factorization of dense matrices. In CnC, the programmer expresses her computation in terms of application-specific operations, partially ordered by semantic scheduling constraints. We demonstrate the performance potential of CnC by showing that the performance of our Cholesky implementation nearly matches or exceeds that of competing vendor-tuned codes and of implementations in alternative programming models. We conclude that the CnC model is well suited for expressing asynchronous-parallel algorithms on emerging multicore systems.
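The idea of operations that are ordered only by their data dependences can be illustrated with a small tiled Cholesky sketch. This is not the poster's implementation and does not use the Intel CnC API. It is a minimal pure-Python model that assumes the standard tiled formulation (factor a diagonal tile, solve the tiles below it, update the trailing tiles). All names here (`items`, `inputs`, `run`, `tiled_cholesky`) are hypothetical. Each step reads and writes single-assignment items keyed by `(row, col, version)`, and a step may run as soon as its inputs exist.

```python
import math
import random

def chol(a):
    """Unblocked Cholesky of one tile: lower-triangular l with l @ l^T == a."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][p] * l[j][p] for p in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def trsm(a, l):
    """Solve x @ l^T == a for x, where l is lower triangular."""
    n = len(l)
    x = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            s = a[r][c] - sum(x[r][p] * l[c][p] for p in range(c))
            x[r][c] = s / l[c][c]
    return x

def update(c, a, b):
    """Return c - a @ b^T for square tiles."""
    n = len(c)
    return [[c[i][j] - sum(a[i][p] * b[j][p] for p in range(n))
             for j in range(n)] for i in range(n)]

def inputs(step):
    """Item keys (row, col, version) that must exist before a step may run."""
    kind, k, i, j = step
    if kind == "potrf":
        return [(k, k, k)]
    if kind == "trsm":
        return [(i, k, k), (k, k, k + 1)]
    return [(i, j, k), (i, k, k + 1), (j, k, k + 1)]

def run(step, items):
    """Execute one step and put its single output item."""
    kind, k, i, j = step
    args = [items[key] for key in inputs(step)]
    out = {"potrf": chol, "trsm": trsm, "update": update}[kind](*args)
    key = (i, j, k + 1)
    assert key not in items, "items are single-assignment"
    items[key] = out

def tiled_cholesky(a, b, seed=0):
    """Factor an SPD matrix a (size divisible by b) using b-by-b tiles."""
    nt = len(a) // b
    # Version 0 of each lower-triangle tile is the corresponding input block.
    items = {(i, j, 0): [row[j * b:(j + 1) * b] for row in a[i * b:(i + 1) * b]]
             for i in range(nt) for j in range(i + 1)}
    pending = []
    for k in range(nt):
        pending.append(("potrf", k, k, k))
        for i in range(k + 1, nt):
            pending.append(("trsm", k, i, k))
            pending += [("update", k, i, j) for j in range(k + 1, i + 1)]
    rng = random.Random(seed)
    while pending:
        ready = [s for s in pending if all(key in items for key in inputs(s))]
        rng.shuffle(ready)  # ready steps are independent, so any order is legal
        for s in ready:
            run(s, items)
        pending = [s for s in pending if s not in ready]
    # Tile (i, j) is final once it has received its (j + 1)-th version.
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(nt):
        for j in range(i + 1):
            tile = items[(i, j, j + 1)]
            for r in range(b):
                l[i * b + r][j * b:(j + 1) * b] = tile[r]
    return l
```

The shuffle stands in for a real parallel scheduler: because the only ordering comes from item availability, every seed should yield the same factor. A production runtime would dispatch ready steps to worker threads instead of running them in a loop.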
Applying the concurrent collections programming model to asynchronous parallel dense linear algebra
PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming