Abstract
Automatic memory management makes programming easier. This also holds for general-purpose GPU computing, for which no garbage collectors currently exist. In this paper we present a parallel mark-and-sweep collector that collects GPU memory on the GPU itself, and we tune its performance. Performance is increased by: (1) data-parallel marking and sweeping of regions of memory, (2) marking all elements of large arrays in parallel, and (3) trading parallelism for recursion to handle deeply linked data structures.
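As a rough illustration of item (1), the following Python sketch simulates iterative data-parallel marking and sweeping on the CPU. It is not the paper's implementation: the heap model (a list in which entry `i` holds the indices that object `i` refers to, or `None` for a free slot), the three-color scheme, and the names `mark` and `sweep` are assumptions made for this example. Each iteration of an inner `for` loop stands in for one GPU thread, and each iteration of the outer `while` loop stands in for one kernel launch.

```python
WHITE, GREY, BLACK = 0, 1, 2  # unmarked / marked but not yet scanned / scanned


def mark(heap, roots):
    """Iteratively mark everything reachable from roots.

    heap[i] is the list of indices object i refers to (hypothetical model).
    """
    color = [WHITE] * len(heap)
    for r in roots:
        color[r] = GREY
    while True:
        # Snapshot of the objects that are grey at the start of this round.
        frontier = [i for i in range(len(heap)) if color[i] == GREY]
        if not frontier:
            break                      # no thread found work: marking is done
        for i in frontier:             # one iteration = one GPU thread
            for child in heap[i]:
                if color[child] == WHITE:
                    color[child] = GREY
            color[i] = BLACK
    return color


def sweep(heap, color):
    """Free every allocated slot that was never marked; return freed indices."""
    freed = []
    for i in range(len(heap)):         # one iteration = one GPU thread
        if heap[i] is not None and color[i] == WHITE:
            heap[i] = None
            freed.append(i)
    return freed
```

On a real GPU the writes to the mark state would race between threads; this is harmless for marking because every writer stores the same value, but the sketch does not model it.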
Item (1) is achieved by coarsely processing all potential objects in a region of memory in parallel. When a large array is detected during (1), it is set aside, and a parallel-for is later issued on the GPU to mark its elements. For a data structure that is a large linked list, we dynamically switch to a marking variant with less overhead that performs a few recursive steps sequentially (while processing multiple lists in parallel).
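The two refinements described above can be sketched in the same simulated style. This is a minimal sketch under stated assumptions, not the collector from the paper: the threshold `LARGE_ARRAY`, the step bound `CHAIN_STEPS`, the heap model (entry `i` lists the indices object `i` refers to), and the function names are all hypothetical. A thread that meets a large array sets it aside for a later parallel-for over its elements, and a thread that meets a singly linked node follows the chain for a bounded number of sequential steps instead of handing each node to a new round.

```python
LARGE_ARRAY = 4   # hypothetical: this many references counts as a large array
CHAIN_STEPS = 8   # hypothetical: bound on sequential steps per thread


def mark_round(heap, marked, frontier, chain_steps):
    """Run one marking round; return the frontier for the next round."""
    next_frontier, large_arrays = [], []
    for start in frontier:                   # one iteration = one GPU thread
        node, steps = start, 0
        while True:
            refs = heap[node]
            if len(refs) >= LARGE_ARRAY:
                large_arrays.append(node)    # set aside, handled below
                break
            fresh = [c for c in refs if not marked[c]]
            for c in fresh:
                marked[c] = True
            if len(refs) == 1 and fresh and steps < chain_steps:
                # List-like node: keep following the chain in this thread.
                node, steps = fresh[0], steps + 1
                continue
            next_frontier.extend(fresh)
            break
    for arr in large_arrays:
        for c in heap[arr]:                  # parallel-for over the elements
            if not marked[c]:
                marked[c] = True
                next_frontier.append(c)
    return next_frontier


def mark(heap, roots, chain_steps=CHAIN_STEPS):
    """Return (marked flags, number of rounds needed)."""
    marked = [False] * len(heap)
    for r in roots:
        marked[r] = True
    frontier, rounds = list(roots), 0
    while frontier:
        frontier = mark_round(heap, marked, frontier, chain_steps)
        rounds += 1
    return marked, rounds
```

Calling `mark(heap, roots, chain_steps=0)` disables chain following, which makes it easy to compare the number of rounds a long list needs with and without the sequential steps.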
The collector achieves a speedup of up to a factor of 11 over a sequential collector on the same GPU.
Iterative data-parallel mark&sweep on a GPU
ISMM '11: Proceedings of the international symposium on Memory management