Abstract
Dynamic languages such as R are increasingly used to process large data sets. Here, the R interpreter induces a large memory overhead due to wasteful memory allocation policies. If an application's working set exceeds the available physical memory, the OS starts to swap, resulting in slowdowns of several orders of magnitude. Thus, memory optimizations for R will be beneficial to many applications.
Existing R optimizations are mostly based on dynamic compilation or native libraries. Both methods are futile when the OS starts to page out memory. So far, only a few data-type-specific or application-specific memory optimizations for R exist. To remedy this situation, we present a low-overhead page sharing approach for R that significantly reduces the interpreter's memory overhead. Concentrating on the most rewarding optimizations avoids the high runtime overhead of existing generic approaches for memory deduplication or compression. In addition, by applying knowledge of interpreter data structures and memory allocation patterns, our approach is not constrained to specific R applications and is transparent to the R interpreter.
Our page sharing optimization reduces memory consumption by up to 53.5%, with an average of 18.0%, for a set of real-world R benchmarks, at a runtime overhead of only 5.3% on average. In cases where page I/O can be avoided, significant speedups are achieved.
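The abstract does not spell out the sharing mechanism, so the following is only a minimal sketch of the general idea behind content-based page sharing, not the implementation described in the paper. It splits a buffer into fixed-size pages, hashes each page, and reports how many pages would remain if identical pages were stored once. The 4096-byte page size and the helper names `shareable_pages` and `savings_fraction` are assumptions chosen for illustration.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical default)


def shareable_pages(buffer: bytes, page_size: int = PAGE_SIZE):
    """Return (total_pages, unique_pages) over the full pages in buffer.

    Pages with identical content hash to the same digest, so the number
    of distinct digests approximates the pages needed after sharing.
    """
    total = len(buffer) // page_size
    seen = set()
    for i in range(total):
        page = buffer[i * page_size:(i + 1) * page_size]
        seen.add(hashlib.sha256(page).digest())
    return total, len(seen)


def savings_fraction(buffer: bytes, page_size: int = PAGE_SIZE) -> float:
    """Fraction of pages that could be dropped by sharing duplicates."""
    total, unique = shareable_pages(buffer, page_size)
    return 0.0 if total == 0 else (total - unique) / total


# Illustrative data: a zero-initialized region spanning 8 pages,
# followed by 2 pages with distinct contents.
zero_region = bytes(8 * PAGE_SIZE)
distinct = b"\x01" * PAGE_SIZE + b"\x02" * PAGE_SIZE
demo = zero_region + distinct

total_pages, unique_pages = shareable_pages(demo)
print(total_pages, unique_pages, savings_fraction(demo))
```

In this toy input the eight zero-filled pages collapse into one, which is the kind of redundancy a page sharing scheme can exploit. A real system would additionally need copy-on-write handling when a shared page is modified, which this sketch leaves out.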
Dynamic page sharing optimization for the R language
DLS '14: Proceedings of the 10th ACM Symposium on Dynamic Languages