Abstract
Emerging Non-Volatile Memory (NVM) technologies offer high capacity and energy efficiency compared to DRAM, but suffer from limited write endurance and longer latencies. Prior work seeks the best of both technologies by combining DRAM and NVM in hybrid memories to attain low latency, high capacity, energy efficiency, and durability. Coarsegrained hardware and OS optimizations then spread writes out (wear-leveling) and place highly mutated pages in DRAM to extend NVM lifetimes. Unfortunately even with these coarse-grained methods, popular Java applications exact impractical NVM lifetimes of 4 years or less. This paper shows how to make hybrid memories practical, without changing the programming model, by enhancing garbage collection in managed language runtimes. We find object write behaviors offer two opportunities: (1) 70% of writes occur to newly allocated objects, and (2) 2% of objects capture 81% of writes to mature objects. We introduce writerationing garbage collectors that exploit these fine-grained behaviors. They extend NVM lifetimes by placing highly mutated objects in DRAM and read-mostly objects in NVM. We implement two such systems. (1) Kingsguard-nursery places new allocation in DRAM and survivors in NVM, reducing NVM writes by 5× versus NVM only with wear-leveling. (2) Kingsguard-writers (KG-W) places nursery objects in DRAM and survivors in a DRAM observer space. It monitors all mature object writes and moves unwritten mature objects from DRAM to NVM. Because most mature objects are unwritten, KG-W exploits NVM capacity while increasing NVM lifetimes by 11×. It reduces the energy-delay product by 32% over DRAM-only and 29% over NVM-only. This work opens up new avenues for making hybrid memories practical.
Supplemental Material
- Shoaib Akram, Jennifer B. Sartor, and Lieven Eeckhout. DEP+BURST: Online DVFS Performance Prediction for Energy-Efficient Managed Language Execution. IEEE Trans. Comput. 66, 4 (April 2017). Google Scholar
Digital Library
- Bowen Alpern, C. Richard Attanasio, John J. Barton, Michael G. Burke, Perry Cheng, Jong-Deok Choi, Anthony Cocchi, Stephen J. Fink, David Grove, Michael Hind, Susan Flynn Hummel, Derek Lieber, Vassily Litvinov, Mark F. Mergen, Ton Ngo, James R. Russell, Vivek Sarkar, Mauricio J. Serrano, Janice C. Shepherd, Stephen E. Smith, Vugranam C. Sreedhar, Harini Srinivasan, and John Whaley. The Jalapeño virtual machine. IBM Systems Journal 39, 1 (2000). Google Scholar
Digital Library
- Bowen Alpern, Steve Augart, Stephen M. Blackburn, Maria A. Butrico, Anthony Cocchi, Perry Cheng, Julian Dolby, Stephen J. Fink, David Grove, Michael Hind, Kathryn S. McKinley, Mark Mergen, J. Eliot B. Moss, Ton Anh Ngo, Vivek Sarkar, and Martin Trapp. The Jikes RVM Project: Building an Open Source Research Community. IBM System Journal 44, 2 (2005). Google Scholar
Digital Library
- Andrew W. Appel. Simple Generational Garbage Collection and Fast Allocation. Softw. Pract. Exper. 19, 2 (Feb. 1989). Google Scholar
Digital Library
- Aravinthan Athmanathan, Milos Stanisavljevic, Nikolaos Papandreou, Haralampos Pozidis, and Evangelos Eleftheriou. Multilevel-Cell Phase-Change Memory: A Viable Technology. IEEE J. Emerg. Sel. Topics Circuits Syst. 6, 1 (2016).Google Scholar
Cross Ref
- Stephen M. Blackburn, Perry Cheng, and Kathryn S. McKinley. 2004. Myths and Realities: The Performance Impact of Garbage Collection. In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS). Google Scholar
Digital Library
- Stephen M. Blackburn, Perry Cheng, and Kathryn S. McKinley. 2004. Oil and Water? High Performance Garbage Collection in Java with MMTk. In International Conference on Software Engineering (ICSE). Google Scholar
Digital Library
- Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khang, Kathryn S. McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z. Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J. Eliot B. Moss, Aashish Phansalkar, Darko Stefanovic, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. 2006. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications (OOPSLA). Google Scholar
Digital Library
- Stephen M Blackburn, Martin Hirzel, Robin Garner, and Darko Stefanovic. 2010. pjbb2005: The pseudojbb benchmark, 2005. http://users.cecs.anu.edu.au/~steveb/research/research-infrastructure/pjbb2005Google Scholar
- Stephen M. Blackburn and Kathryn S. McKinley. 2008. Immix: A Markregion Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Google Scholar
Digital Library
- Santiago Bock, Bruce R. Childers, Rami Melhem, and Daniel Mossé. 2014. Concurrent Page Migration for Mobile Systems with OS-managed Hybrid Memory. In Proceedings of the 11th ACM Conference on Computing Frontiers (CF). Google Scholar
Digital Library
- Geoffrey W. Burr, Matthew J. Breitwisch, Michele Franceschini, Davide Garetto, Kailash Gopalakrishnan, Bryan Jackson, BÃ?lent Kurdi, Chung Lam, Luis A. Lastras, Alvaro Padilla, Bipin Rajendran, Simone Raoux, and Rohit S. Shenoy. Phase change memory technology. Journal of Vacuum Science & Technology B, Nanotechnology and Microelectronics: Materials, Processing, Measurement, and Phenomena 28, 2 (2010).Google Scholar
- Trevor E. Carlson, Wim Heirman, Stijn Eyerman, Ibrahim Hur, and Lieven Eeckhout. An Evaluation of High-Level Mechanistic Core Models. ACM Transactions on Architecture and Code Optimization (TACO) 11, 3 (2014). Google Scholar
Digital Library
- David Detlefs, Christine Flood, Steve Heller, and Tony Printezis. 2004. Garbage-first Garbage Collection. In Proceedings of the 4th International Symposium on Memory Management (ISMM). Google Scholar
Digital Library
- Kristof Du Bois, Jennifer B. Sartor, Stijn Eyerman, and Lieven Eeckhout. 2013. Bottle Graphs: Visualizing Scalability Bottlenecks in Multithreaded Applications. In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA). Google Scholar
Digital Library
- Daniel Frampton, Stephen M. Blackburn, Perry Cheng, Robin J. Garner, David Grove, J. Eliot B. Moss, and Sergey I. Salishev. 2009. Demystifying Magic: High-level Low-level Programming. In ACM SIGPLAN/ SIGOPS International Conference on Virtual Execution Environments (VEE). Google Scholar
Digital Library
- Tiejun Gao, Karin Strauss, Stephen M. Blackburn, Kathryn S. McKinley, Doug Burger, and James Larus. 2013. Using Managed Runtime Systems to Tolerate Holes in Wearable Memories. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). Google Scholar
Digital Library
- Jungwoo Ha, Magnus Gustafsson, Stephen M. Blackburn, and Kathryn S. McKinley. 2008. Microarchitectural Characterization of Production JVMs and Java Workloads. In IBM CAS Workshop.Google Scholar
- Michael Hicks, Luke Hornof, Jonathan T. Moore, and Scott M. Nettles. A Study of Large Object Spaces. SIGPLAN Not. 34, 3 (Oct. 1998). Google Scholar
Digital Library
- Xianglong Huang, Stephen M. Blackburn, Kathryn S. McKinley, J Eliot B. Moss, Zhenlin Wang, and Perry Cheng. 2004. The Garbage Collection Advantage: Improving Mutator Locality. In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA). Google Scholar
Digital Library
- ITRS. 2015. Internatial Technology Roadmap for Semiconductors 2.0: Executive Report.Google Scholar
- Michael R. Jantz, Forrest J. Robinson, Prasad A. Kulkarni, and Kshitij A. Doshi. 2015. Cross-layer Memory Management for Managed Language Applications. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Google Scholar
Digital Library
- Richard Jones and Rafael Lins. 1996. Garbage Collection: Algorithms for Automatic Dynamic Memory Management. Google Scholar
Digital Library
- Mark H. Kryder and Chang Soo Kim. After Hard Drives -- What Comes Next? IEEE Transactions on Magnetics 45, 10 (2009).Google Scholar
Cross Ref
- Aapo Kyrola, Guy Blelloch, and Carlos Guestrin. 2012. GraphChi: Large-scale Graph Computation on Just a PC. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI). Google Scholar
Digital Library
- Benjamin C. Lee, Engin Ipek, Onur Mutlu, and Doug Burger. 2009. Architecting Phase Change Memory As a Scalable Dram Alternative. In Proceedings of the 36th Annual International Symposium on Computer Architecture (ISCA). Google Scholar
Digital Library
- Benjamin C. Lee, Ping Zhou, Jun Yang, Youtao Zhang, Bo Zhao, Engin Ipek, Onur Mutlu, and Doug Burger. Phase-Change Technology and the Future of Main Memory. IEEE Micro 30, 1 (Jan. 2010). Google Scholar
Digital Library
- Myoung-Jae Lee, Chang Bum Lee, Dongsoo Lee, Seung Ryul Lee, Man Chang, Ji Hyun Hur, Young-Bae Kim, Chang-Jung Kim, David H. Seo, Sunae Seo, U-In Chung, In-Kyeong Yoo, and Kinam Kim. A fast, high-endurance and scalable non-volatile memory device made from asymmetric Ta2O5-x/TaO2-x bilayer structures. Nature Materials 10, 3 (2011).Google Scholar
Cross Ref
- Soyoon Lee, Hyokyung Bahn, and Sam H. Noh. 2011. Characterizing Memory Write References for Efficient Management of Hybrid PCM and DRAM Memory. In IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS). Google Scholar
Digital Library
- Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. http://snap.stanford.edu/data.Google Scholar
- Sheng Li, Jung Ho Ahn, R.D. Strong, J.B. Brockman, D.M. Tullsen, and N.P. Jouppi. 2009. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Google Scholar
Digital Library
- Kevin Lim, Jichuan Chang, Trevor Mudge, Parthasarathy Ranganathan, Steven K. Reinhardt, and Thomas F. Wenisch. 2009. Disaggregated Memory for Expansion and Sharing in Blade Servers. In Proceedings of the 36th Annual International Symposium on Computer Architecture (ISCA). Google Scholar
Digital Library
- Micron. 2007. TN-41-01: Calculating memory system power for DDR3.Google Scholar
- Onur Mutlu and Lavanya Subramanian. Research Problems and Opportunities in Memory Systems. Supercomputing Frontiers and Innovations 1, 3 (Oct 2014). Google Scholar
Digital Library
- Khanh Nguyen, Lu Fang, Guoqing Xu, Brian Demsky, Shan Lu, Sanazsadat Alamian, and Onur Mutlu. 2016. Yak: A High-performance Bigdata-friendly Garbage Collector. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI). Google Scholar
Digital Library
- Numonym. 2008. Phase Change Memory. http://www.pdl.cmu.edu/SDI/2009/slides/Numonyx.pdfGoogle Scholar
- Michael Paleczny, Christopher Vick, and Cliff Click. 2001. The Java Hotspot server compiler. In Usenix Java Virtual Machine Research and Technology Symposium (JVM). Google Scholar
Digital Library
- Moinuddin K. Qureshi. 2011. Pay-As-You-Go: Low-overhead Hard-error Correction for Phase Change Memories. In Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Google Scholar
Digital Library
- Moinuddin K. Qureshi, John Karidis, Michele Franceschini, Vijayalakshmi Srinivasan, Luis Lastras, and Bulent Abali. 2009. Enhancing Lifetime and Security of PCM-based Main Memory with Start-gap Wear Leveling. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Google Scholar
Digital Library
- Moinuddin K. Qureshi, John Karidis, Michele Franceschini, Vijayalakshmi Srinivasan, Luis Lastras, and Bulent Abali. 2009. Enhancing lifetime and security of PCM-based Main Memory with Start-Gap Wear Leveling. In 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Google Scholar
Digital Library
- Moinuddin K. Qureshi, Andre Seznec, Luis A. Lastras, and Michele M. Franceschini. 2011. Practical and secure PCM systems by online detection of malicious write streams. In IEEE 17th International Symposium on High Performance Computer Architecture (HPCA). Google Scholar
Digital Library
- Moinuddin K. Qureshi, Vijayalakshmi Srinivasan, and Jude A. Rivers. 2009. Scalable High Performance Main Memory System Using Phase-change Memory Technology. In Proceedings of the 36th Annual International Symposium on Computer Architecture (ISCA). Google Scholar
Digital Library
- Luiz E. Ramos, Eugene Gorbatov, and Ricardo Bianchini. 2011. Page Placement in Hybrid Memory Systems. In Proceedings of the International Conference on Supercomputing (ICS). Google Scholar
Digital Library
- Jennifer B. Sartor, Wim Heirman, Stephen M. Blackburn, Lieven Eeckhout, and Kathryn S. McKinley. 2014. Cooperative Cache Scrubbing. In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation (PACT). Google Scholar
Digital Library
- Andre Seznec. A Phase Change Memory as a Secure Main Memory. IEEE Computer Architecture Letters 9, 1 (Jan 2010). Google Scholar
Digital Library
- Rifat Shahriyar, Stephen M. Blackburn, Xi Yang, and Kathryn S. McKinley. 2013. Taking Off the Gloves with Reference Counting Immix. In ACM International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA). Google Scholar
Digital Library
- Darko Stefanovic, Kathryn S. McKinley, and J. Eliot B. Moss. 1999. Age-based Garbage Collection. In Proceedings of the 14th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA). Google Scholar
Digital Library
- David Ungar and Frank Jackson. An Adaptive Tenuring Policy for Generation Scavengers. ACM Trans. Program. Lang. Syst. 14, 1 (Jan. 1992). Google Scholar
Digital Library
- Chenxi Wang, Ting Coa, John Zigman, Fang Lv, Yunquan Zhang, and Xiaobing Feng. 2016. Efficient Management for Hybrid Memory in Managed Language Runtime. In IFIP International Conference on Network and Parallel Computing (NPC).Google Scholar
- Wei Wei, Dejun Jiang, Sally A. McKee, Jin Xiong, and Mingyu Chen. 2015. Exploiting Program Semantics to Place Data in Hybrid Memory. In Proceedings of the International Conference on Parallel Architecture and Compilation (PACT). Google Scholar
Digital Library
- Paul R. Wilson. A Simple Bucket-brigade Advancement Mechanism for Generation-bases Garbage Collection. SIGPLAN Not. 24, 5 (May 1989). Google Scholar
Digital Library
- Xi Yang, Stephen M. Blackburn, Daniel Frampton, and Antony L. Hosking. 2012. Barriers Reconsidered, Friendlier Still!. In Proceedings of the Eleventh ACM SIGPLAN International Symposium on Memory Management (ISMM). Google Scholar
Digital Library
- Xi Yang, Stephen M Blackburn, Daniel Frampton, Jennifer B. Sartor, and Kathryn S McKinley. 2011. Why Nothing Matters: The Impact of Zeroing. In Proceedings of the ACM Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Google Scholar
Digital Library
- Lunkai Zhang, Brian Neely, Diana Franklin, Dmitri Strukov, Yuan Xie, and Frederic T. Chong. 2016. MellowWrites: Extending Lifetime in Resistive Memories Through Selective Slow Write Backs. In Proceedings of the 43rd International Symposium on Computer Architecture (ISCA). Google Scholar
Digital Library
- Wangyuan Zhang and Tao Li. 2009. Exploring Phase Change Memory and 3D Die-Stacking for Power/Thermal Friendly, Fast and Durable Memory Architectures. In Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT). Google Scholar
Digital Library
- Yi Zhao, Jin Shi, Kai Zheng, Haichuan Wang, Haibo Lin, and Ling Shao. 2009. AllocationWall: A Limiting Factor of Java Applications on Emerging Multi-core Platforms. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA). Google Scholar
Digital Library
- Yuanyuan Zhou, James Philbin, and Kai Li. 2001. The Multi-Queue Replacement Algorithm for Second Level Buffer Caches. In Proceedings of the USENIX Annual Technical Conference (ATC). Google Scholar
Digital Library
Index Terms
Write-rationing garbage collection for hybrid memories
Recommendations
Write-rationing garbage collection for hybrid memories
PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and ImplementationEmerging Non-Volatile Memory (NVM) technologies offer high capacity and energy efficiency compared to DRAM, but suffer from limited write endurance and longer latencies. Prior work seeks the best of both technologies by combining DRAM and NVM in hybrid ...
Managing hybrid memories by predicting object write intensity
Programming '18: Companion Proceedings of the 2nd International Conference on the Art, Science, and Engineering of ProgrammingEmerging Non-Volatile Memory (NVM) technologies offer more capacity and energy efficiency than DRAM, but their write endurance is lower and latency is higher. Hybrid memories seek the best of both worlds — scalability, efficiency, and performance — by ...
Write activity reduction on non-volatile main memories for embedded chip multiprocessors
Recent advances in circuit and semiconductor technologies have pushed Non-Volatile Memory (NVM) technologies into a new era. These technologies exhibit appealing properties such as low power consumption, non-volatility, shock-resistivity, and high ...







Comments