Abstract
Each full garbage collection in a program with millions of objects can pause the program for multiple seconds. Much of this work is typically repeated, as the collector re-traces parts of the object graph that have not changed since the last collection. Clustered Collection reduces full collection pause times by eliminating much of this repeated work. Clustered Collection identifies clusters: regions of the object graph that are reachable from a single "head" object, so that reachability of the head implies reachability of the whole cluster. As long as it is not written to, a cluster need not be re-traced by successive full collections. The main design challenge is coping with program writes to clusters while ensuring safe, complete, and fast collections. In some cases program writes require clusters to be dissolved, but in most cases Clustered Collection can handle writes without having to re-trace the affected cluster. Clustered Collection chooses clusters likely to suffer few writes and to yield high savings from re-trace avoidance. Clustered Collection is implemented as modifications to the Racket collector. Measurements of the code and data from the Hacker News web site (which suffers from significant garbage collection pauses) and a Twitter-like application show that Clustered Collection decreases full collection pause times by factors of three and six, respectively. This improvement is possible because both applications have gigabytes of live data, modify only a small fraction of it, and usually write in ways that do not result in cluster dissolution. Identifying clusters takes more time than a full collection, but happens much less frequently than full collection.
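The idea of skipping a cluster during a full collection can be illustrated with a toy mark phase. This is a minimal sketch under stated assumptions, not the Racket implementation: the names Obj, Cluster, build_cluster, and mark are hypothetical, cluster membership is supplied by hand rather than found by cluster analysis, and any write simply sets a dirty flag that forces an ordinary trace (the abstract says most writes can be tolerated without re-tracing; that handling is not modelled here). The sketch also assumes that pointers leaving a cluster are recorded so tracing can continue past it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing
class Obj:
    name: str
    refs: list = field(default_factory=list)  # outgoing pointers
    cluster: "Cluster | None" = None          # set on a cluster's head only


@dataclass(eq=False)
class Cluster:
    head: Obj
    members: set         # head plus the objects grouped under it
    out_pointers: list   # objects outside the cluster that members point to
    dirty: bool = False  # assumption: set by a write barrier on any write


def build_cluster(head, members):
    """Record a hand-picked cluster and the pointers that leave it."""
    members = set(members)
    outs = []
    for m in members:
        for r in m.refs:
            if r not in members and r not in outs:
                outs.append(r)
    head.cluster = Cluster(head, members, outs)
    return head.cluster


def mark(roots):
    """Mark reachable objects; return (marked set, objects traced one by one)."""
    marked, traced, stack = set(), 0, list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        c = obj.cluster
        if c is not None and not c.dirty:
            # Clean cluster: head reachable implies all members reachable,
            # so mark them in bulk and continue from the recorded out-pointers.
            marked |= c.members
            stack.extend(c.out_pointers)
            continue
        marked.add(obj)
        traced += 1
        stack.extend(obj.refs)
    return marked, traced


# Example graph: root -> head -> {a, b}, a -> b, b -> ext; g is garbage.
ext = Obj("ext")
b = Obj("b", [ext])
a = Obj("a", [b])
head = Obj("head", [a, b])
root = Obj("root", [head])
g = Obj("g")
cluster = build_cluster(head, {head, a, b})

live_clean, traced_clean = mark([root])  # cluster interior is skipped
cluster.dirty = True                     # simulate a write into the cluster
live_dirty, traced_dirty = mark([root])  # falls back to a full trace
```

In this example both collections find the same live set, but the clean collection individually traces only root and ext, while the dirty one traces all five live objects. The saving grows with cluster size, which is consistent with the abstract's point that clusters should be chosen to suffer few writes.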
Reducing pause times with clustered collection
ISMM '15: Proceedings of the 2015 International Symposium on Memory Management