Scalable concurrent and parallel mark

Published: 15 June 2012

Abstract

Parallel marking algorithms use multiple threads to walk the object heap graph and mark each reachable object as live. Parallel marker threads mark an object "live" by atomically setting a bit in a mark-bitmap or in the object header. Most of these algorithms strive to improve marking throughput by using work-stealing for load balancing, so that all participating threads are kept busy. A purely "processor-centric" load-balancing approach, combined with the need to set the mark bit atomically, results in significant contention during parallel marking. This contention limits the scalability and throughput of parallel marking algorithms.
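As an illustration of the atomic mark step described above, here is a minimal sketch in Java. It is not the paper's implementation: the class name MarkBitmap, the one-bit-per-slot layout, and the use of AtomicLongArray are assumptions made for the example. It shows why two threads that mark different objects can still contend: their bits may live in the same 64-bit word, so one thread's compare-and-set can fail and retry.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical mark-bitmap: one mark bit per object slot, 64 slots per word.
final class MarkBitmap {
    private final AtomicLongArray words;

    MarkBitmap(int slots) {
        this.words = new AtomicLongArray((slots + 63) >>> 6);
    }

    // Atomically sets the mark bit for `slot`.
    // Returns true only for the single thread that flips the bit from 0 to 1.
    boolean tryMark(int slot) {
        int wordIndex = slot >>> 6;
        long mask = 1L << (slot & 63);
        while (true) {
            long old = words.get(wordIndex);
            if ((old & mask) != 0) {
                return false; // already marked by some thread
            }
            if (words.compareAndSet(wordIndex, old, old | mask)) {
                return true; // this thread won the race
            }
            // CAS failed: another thread changed a bit in the same word. Retry.
        }
    }

    boolean isMarked(int slot) {
        return (words.get(slot >>> 6) & (1L << (slot & 63))) != 0;
    }
}
```

In this sketch, a marker thread would call tryMark and scan the object's fields only when the call returns true, so each object is traced once even if several threads reach it.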

We describe a new non-blocking, lock-free work-sharing algorithm whose primary goal is to reduce contention during atomic updates of the mark-bitmap by parallel task-threads. Our work-sharing mechanism uses the address of a word in the mark-bitmap as the key to stripe work among parallel task-threads, with only a subset of the task-threads working on each stripe. This filters out most of the contention during parallel marking and yields performance improvements of 20%.
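The striping idea can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the paper: the class StripedMarkWork, the 8-byte granule size, the modulo mapping from bitmap word to stripe, and the use of ConcurrentLinkedQueue as the non-blocking queue are all hypothetical choices. The sketch keys on the index of the mark-bitmap word, which differs from the word's address only by a constant scale and offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical work-sharing structure: one non-blocking queue per stripe.
final class StripedMarkWork {
    private final long heapBase;
    private final List<ConcurrentLinkedQueue<Long>> stripes = new ArrayList<>();

    StripedMarkWork(long heapBase, int stripeCount) {
        this.heapBase = heapBase;
        for (int i = 0; i < stripeCount; i++) {
            stripes.add(new ConcurrentLinkedQueue<>());
        }
    }

    // Assumed layout: one mark bit per 8-byte granule, 64 bits per bitmap word.
    long bitmapWordIndex(long objectAddress) {
        long granule = (objectAddress - heapBase) >>> 3;
        return granule >>> 6;
    }

    // All objects whose mark bits share a bitmap word map to the same stripe.
    int stripeOf(long objectAddress) {
        return (int) (bitmapWordIndex(objectAddress) % stripes.size());
    }

    // Hand a reference to the stripe that owns its bitmap word.
    void share(long objectAddress) {
        stripes.get(stripeOf(objectAddress)).add(objectAddress);
    }

    // Called by a task-thread assigned to `stripe`; returns null if it is empty.
    Long take(int stripe) {
        return stripes.get(stripe).poll();
    }
}
```

Because every reference whose mark bit lives in a given bitmap word lands in one stripe, only the task-threads assigned to that stripe ever issue atomic updates against that word.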

In the case of concurrent and on-the-fly collector algorithms, mutator threads also generate marking work for the marking task-threads. In these schemes, mutator threads are provided with thread-local marking stacks where they collect references to potentially "gray" objects, i.e., objects that have not yet been "marked-through" by the collector. Because this work is generated by mutators when they reference these objects, there is a high likelihood that the objects are still present in the processor cache. We describe and evaluate a scheme to distribute mutator-generated marking work among the collector's task-threads that is cognizant of the processor and cache topology. We prototype both algorithms within the C4 [28] collector, which ships as part of an industrial-strength JVM for the Linux-X86 platform.
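The abstract does not spell out the distribution scheme, so the following is only one plausible reading, sketched in Java under stated assumptions: a mutator's marking work is handed to a collector task-thread that shares a cache domain with that mutator, with a fallback to any collector thread. The class TopologyAwareRouter, the integer cache-domain ids, and the round-robin choice within a domain are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical router from a mutator's cache domain to a collector thread id.
final class TopologyAwareRouter {
    private final Map<Integer, List<Integer>> collectorsByCacheDomain;
    private final List<Integer> allCollectors = new ArrayList<>();

    TopologyAwareRouter(Map<Integer, List<Integer>> collectorsByCacheDomain) {
        this.collectorsByCacheDomain = collectorsByCacheDomain;
        for (List<Integer> ids : collectorsByCacheDomain.values()) {
            allCollectors.addAll(ids);
        }
        if (allCollectors.isEmpty()) {
            throw new IllegalArgumentException("no collector threads");
        }
        Collections.sort(allCollectors); // deterministic fallback order
    }

    // Prefer a collector in the mutator's own cache domain so the gray
    // objects are likely still cached; otherwise fall back to any collector.
    // `sequence` spreads successive hand-offs round-robin over the candidates.
    int pickCollector(int mutatorCacheDomain, int sequence) {
        List<Integer> local = collectorsByCacheDomain.get(mutatorCacheDomain);
        List<Integer> candidates =
            (local == null || local.isEmpty()) ? allCollectors : local;
        return candidates.get(Math.floorMod(sequence, candidates.size()));
    }
}
```

A real collector would obtain the domain ids from the platform's topology enumeration; here they are plain integers supplied by the caller.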

References

  1. Intel® 64 and IA-32 architectures developer's manual: Combined volumes. URL http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf.
  2. Intel® 64 architecture processor topology enumeration. URL http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/.
  3. Standard Performance Evaluation Corporation. SPEC JVM98. URL http://www.spec.org/jvm98/.
  4. The Volano benchmark. URL http://www.volano.com/benchmarks.html.
  5. N. S. Arora, R. D. Blumofe, and C. G. Plaxton. Thread scheduling for multiprogrammed multiprocessors. In SPAA, pages 119–129, 1998.
  6. H. Azatchi, Y. Levanoni, H. Paz, and E. Petrank. An on-the-fly mark and sweep garbage collector based on sliding views. pages 269–281. DOI 10.1145/949305.949329.
  7. K. Barabash, O. Ben-Yitzhak, I. Goft, E. K. Kolodner, V. Leikehman, Y. Ossia, A. Owshanko, and E. Petrank. A parallel, incremental, mostly concurrent garbage collector for servers. ACM Trans. Program. Lang. Syst., 27 (6): 1097–1146, 2005.
  8. S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In OOPSLA '06: Proceedings of the 21st annual ACM SIGPLAN conference on Object-Oriented Programing, Systems, Languages, and Applications, pages 169–190, New York, NY, USA, Oct. 2006. ACM Press. URL http://doi.acm.org/10.1145/1167473.1167488.
  9. H.-J. Boehm. Reducing garbage collector cache misses. pages 59–64.
  10. C.-Y. Cher, A. L. Hosking, and T. Vijaykumar. Software prefetching for mark-sweep garbage collection: Hardware analysis and software redesign. pages 199–210. DOI 10.1145/1024393.1024417.
  11. C. Click, G. Tene, and M. Wolf. The pauseless GC algorithm. In Proceedings of the 1st ACM/USENIX international conference on Virtual execution environments, VEE '05, pages 46–56, New York, NY, USA, 2005. ACM. ISBN 1-59593-047-7. URL http://doi.acm.org/10.1145/1064979.1064988.
  12. D. Detlefs and T. Printezis. A Generational Mostly-concurrent Garbage Collector. Technical report, Mountain View, CA, USA, 2000.
  13. D. Detlefs, C. H. Flood, S. Heller, and T. Printezis. Garbage-first garbage collection. In ISMM, pages 37–48, 2004.
  14. E. W. Dijkstra, L. Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. On-the-fly garbage collection: An exercise in cooperation. In Language Hierarchies and Interfaces: International Summer School, volume 46, pages 43–56. Marktoberdorf, Germany, 1976.
  15. T. Domani, E. K. Kolodner, and E. Petrank. A generational on-the-fly garbage collector for Java. In PLDI, pages 274–284, 2000.
  16. U. Drepper. What every programmer should know about memory. URL http://www.akkadia.org/drepper/cpumemory.pdf.
  17. T. Endo and K. Taura. Reducing pause time of conservative collectors. In MSP/ISMM, pages 119–131, 2002.
  18. T. Endo, K. Taura, and A. Yonezawa. Predicting scalability of parallel garbage collectors on shared memory multiprocessors. In Proceedings of the 15th International Parallel & Distributed Processing Symposium, IPDPS '01, pages 43–, Washington, DC, USA, 2001. IEEE Computer Society. ISBN 0-7695-0990-8. URL http://dl.acm.org/citation.cfm?id=645609.662496.
  19. C. H. Flood, D. Detlefs, N. Shavit, and X. Zhang. Parallel garbage collection for shared memory multiprocessors. In Proceedings of the 2001 Symposium on Java™ Virtual Machine Research and Technology Symposium - Volume 1, JVM'01, page 21, Berkeley, CA, USA, 2001. USENIX Association. URL http://dl.acm.org/citation.cfm?id=1267847.1267868.
  20. R. Garner, S. M. Blackburn, and D. Frampton. Effective prefetch for mark-sweep garbage collection. In ISMM, pages 43–54, 2007.
  21. R. H. Halstead. Multilisp: A language for concurrent symbolic computation. ACM Trans. Prog. Lang. Syst., 7 (4): 501–538, Oct. 1985. DOI 10.1145/4472.4478.
  22. R. Jones and C. Ryder. A study of Java object demographics. pages 121–130. DOI 10.1145/1375634.1375652.
  23. R. Jones, A. Hosking, and E. Moss. The Garbage Collection Handbook: The Art of Automatic Memory Management. CRC Applied Algorithms and Data Structures. Chapman & Hall, Aug. 2011. ISBN 978-1420082791.
  24. T. Ogasawara. NUMA-aware memory manager with dominant-thread-based copying GC. In Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications, OOPSLA '09, pages 377–390, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-766-0. URL http://doi.acm.org/10.1145/1640089.1640117.
  25. Y. Ossia, O. Ben-Yitzhak, I. Goft, E. K. Kolodner, V. Leikehman, and A. Owshanko. A parallel, incremental and concurrent GC for servers. pages 129–140. DOI 10.1145/512529.512546.
  26. F. Siebert. Limits of parallel marking garbage collection. In Proceedings of the 7th international symposium on Memory management, ISMM '08, pages 21–29, New York, NY, USA, 2008. ACM. ISBN 978-1-60558-134-7. URL http://doi.acm.org/10.1145/1375634.1375638.
  27. F. Siebert. Concurrent, parallel, real-time garbage-collection. In Proceedings of the 2010 international symposium on Memory management, ISMM '10, pages 11–20, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0054-4. URL http://doi.acm.org/10.1145/1806651.1806654.
  28. G. Tene, B. Iyengar, and M. Wolf. C4: the continuously concurrent compacting collector. In Proceedings of the international symposium on Memory management, ISMM '11, pages 79–88, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0263-0. URL http://doi.acm.org/10.1145/1993478.1993491.
  29. M. M. Tikir and J. K. Hollingsworth. NUMA-aware Java heaps for server applications. IPDPS '05, pages 108.2–. IEEE Computer Society. ISBN 0-7695-2312-9. URL http://dx.doi.org/10.1109/IPDPS.2005.299.
  30. M. Wu and X.-F. Li. Task-pushing: a scalable parallel GC marking algorithm without synchronization operations. In IPDPS, pages 1–10, 2007.
