Abstract
Garbage collection (GC), especially full GC, can significantly affect overall application performance, particularly for memory-hungry applications that handle large data sets. This paper presents an in-depth analysis of the full GC performance of Parallel Scavenge (PS), a state-of-the-art collector and the default garbage collector in the HotSpot JVM, using traditional and big-data applications running atop the JVM on CPUs (e.g., Intel Xeon) and many-integrated-core architectures (e.g., Intel Xeon Phi). The analysis shows that unnecessary memory accesses and calculations during reference updating in the compaction phase are the main causes of lengthy full GC. To address this, the paper describes an incremental query model for reference calculation, which is realized in three schemes (optimistic, sort-based, and region-based) for different query patterns. Performance evaluation shows that, compared with the vanilla PS collector on CPU, the incremental query model improves full GC by 1.9X on average (up to 2.9X) and application throughput by 19.3% on average (up to 57.2%), and reduces pause time by 31.2%. The corresponding figures on Xeon Phi are 2.1X (up to 3.4X), 11.1% (up to 41.2%), and 34.9%.
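The abstract does not spell out how the incremental query model works, so the following Java sketch is only an illustration under stated assumptions, not the paper's implementation. It assumes that reference updating needs, for each referenced object, the number of live words that precede it in its heap region, and that a query can reuse the result of the previous query instead of rescanning from the region start. The class name, the word-granularity bitmap, the region size, and the `wordsScanned` counter are all hypothetical.

```java
import java.util.BitSet;

/** Hypothetical sketch: counts live words that precede an address within its region. */
final class IncrementalLiveWordQuery {
    private final BitSet liveWords; // bit i set => heap word i belongs to a live object (assumed layout)
    private final int regionSize;   // words per region (assumed parameter)

    // Cached result of the previous query.
    private int lastAddr = -1;
    private int lastCount = 0;

    long wordsScanned = 0;          // instrumentation only, to compare scanning work

    IncrementalLiveWordQuery(BitSet liveWords, int regionSize) {
        this.liveWords = liveWords;
        this.regionSize = regionSize;
    }

    /** Baseline: every query scans from the start of the region. */
    int liveWordsBeforeNaive(int addr) {
        int regionStart = addr - addr % regionSize;
        return count(regionStart, addr);
    }

    /** Incremental: resume from the previous query if it is in the same region and not past addr. */
    int liveWordsBefore(int addr) {
        int regionStart = addr - addr % regionSize;
        int from = regionStart;
        int base = 0;
        if (lastAddr >= regionStart && lastAddr <= addr) {
            from = lastAddr;
            base = lastCount;
        }
        int result = base + count(from, addr);
        lastAddr = addr;
        lastCount = result;
        return result;
    }

    // Counts set bits in [from, to) and records how many words were examined.
    private int count(int from, int to) {
        wordsScanned += to - from;
        return liveWords.get(from, to).cardinality();
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(128);
        bits.set(0, 4);   // three live objects in region 0
        bits.set(10, 16);
        bits.set(40, 50);
        IncrementalLiveWordQuery naive = new IncrementalLiveWordQuery(bits, 64);
        IncrementalLiveWordQuery incremental = new IncrementalLiveWordQuery(bits, 64);
        for (int addr : new int[] {10, 40, 50}) {
            System.out.println(addr + ": " + naive.liveWordsBeforeNaive(addr)
                    + " / " + incremental.liveWordsBefore(addr));
        }
        System.out.println("words scanned: naive=" + naive.wordsScanned
                + ", incremental=" + incremental.wordsScanned);
    }
}
```

Under these assumptions, a compacting collector would compute an object's new address as its region's destination plus `liveWordsBefore(addr)`. The cache pays off only when successive queries ascend within one region; a single-entry cache like this one falls back to a full scan otherwise, which is presumably why different query patterns call for different schemes.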
Performance Analysis and Optimization of Full Garbage Collection in Memory-hungry Environments
VEE '16: Proceedings of the 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments






