Abstract

R is a domain-specific language widely used for data analysis by the statistics community as well as by researchers in finance, biology, the social sciences, and many other disciplines. Because R programs are closely tied to their input data, the exponential growth of available data makes high-performance computing with R imperative. Transforming a sequential R program into a parallel version through code transformation would make parallel programming far more convenient for R users. In this paper, we present our work on semi-automatic parallelization of R code using user-added OpenMP-style pragmas. While these pragmas are used at the front end, we take advantage of multiple parallel backends provided by different R packages. We provide flexibility for introducing parallelism through plug-in components, incorporate built-in MapReduce for data processing, and maintain code reusability. We illustrate the advantages of these on-the-fly mechanisms, which can lead to significant applications in data-centered parallel computing.
Index Terms
OpenMP-style parallelism in data-centered multicore computing with R
PPoPP '12: Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming