
What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study

Published: 21 December 2018

Abstract

Main memory (DRAM) consumes as much as half of the total system power in a computer today, due to the increasing demand for memory capacity and bandwidth. There is a growing need to understand and analyze DRAM power consumption, which can be used to research new DRAM architectures and systems that consume less power. A major obstacle to such research is the lack of detailed and accurate information on the power consumption behavior of modern DRAM devices. Researchers have long relied on DRAM power models that are predominantly based on a set of standardized current measurements provided by DRAM vendors, called IDD values. Unfortunately, we find that state-of-the-art DRAM power models are often highly inaccurate, as these models do not reflect the actual power consumed by real DRAM devices. To build an accurate model and provide insights into DRAM power consumption, we perform the first comprehensive experimental characterization of the power consumed by modern real-world DRAM modules. Our extensive characterization of 50 DDR3L DRAM modules from three major vendors yields four key new observations about DRAM power consumption that prior models cannot capture: (1) across all IDD values that we measure, the current consumed by real DRAM modules varies significantly from the current specified by the vendors; (2) DRAM power consumption strongly depends on the data value that is read or written; (3) there is significant structural variation, where the same banks and rows across multiple DRAM modules from the same model consume more power than other banks or rows; and (4) over successive process technology generations, DRAM power consumption has not decreased by as much as vendor specifications have indicated. Because state-of-the-art DRAM power models do not account for any of these four key characteristics, they are highly inaccurate compared to the actual, measured power consumption of 50 real DDR3L modules.
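As a rough illustration of how an IDD-style current figure relates to power, and of how a measured current can be compared against a specified one, the following minimal Python sketch may help. It is not taken from the paper: the function names and the example currents are hypothetical, and the only fixed quantity is the nominal 1.35 V DDR3L supply voltage.

```python
# Minimal sketch: converting a supply current into power and comparing a
# measured current against a vendor-specified one. All current values below
# are hypothetical placeholders, not data from the study.

DDR3L_VDD_VOLTS = 1.35  # nominal DDR3L supply voltage


def power_mw(current_ma, vdd_volts=DDR3L_VDD_VOLTS):
    """Power in milliwatts drawn at a given supply current (mA) and voltage (V)."""
    return current_ma * vdd_volts


def relative_deviation_percent(specified_ma, measured_ma):
    """Signed deviation of the measured current from the specified current, in percent."""
    if specified_ma == 0:
        raise ValueError("specified current must be non-zero")
    return 100.0 * (measured_ma - specified_ma) / specified_ma


# Hypothetical example: a datasheet lists 100 mA, a measurement shows 80 mA.
specified_ma = 100.0
measured_ma = 80.0

specified_power = power_mw(specified_ma)
measured_power = power_mw(measured_ma)
deviation = relative_deviation_percent(specified_ma, measured_ma)

print(f"specified: {specified_power:.1f} mW, measured: {measured_power:.1f} mW")
print(f"deviation of measured from specified current: {deviation:.1f}%")
```

A model built only on the specified value would, in this made-up case, overestimate power by the same fraction as the current deviation, which is the kind of gap the first observation above describes.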
Based on our detailed analysis and characterization data, we develop the Variation-Aware model of Memory Power Informed by Real Experiments (VAMPIRE). VAMPIRE is a new, accurate power consumption model for DRAM that takes into account (1) module-to-module and intra-module variations, and (2) power consumption variation due to data value dependency. We show that VAMPIRE has a mean absolute percentage error of only 6.8% compared to actual measured DRAM power. VAMPIRE enables a wide range of studies that were not possible using prior DRAM power models. As an example, we use VAMPIRE to evaluate the energy efficiency of three different encodings that can be used to store data in DRAM. We find that a new power-aware data encoding mechanism can reduce total DRAM energy consumption by an average of 12.2%, across a wide range of applications. We have open-sourced both VAMPIRE and our extensive raw data collected during our experimental characterization.
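For readers unfamiliar with the accuracy metric quoted above, mean absolute percentage error (MAPE) averages the absolute model error relative to each measured value. The sketch below shows the standard definition in Python; the function name and the sample power values are hypothetical and are not drawn from the VAMPIRE data set.

```python
# Minimal sketch of mean absolute percentage error (MAPE), the metric used
# to compare a power model's predictions against measured power.
# The sample values are hypothetical and only illustrate the calculation.


def mean_absolute_percentage_error(measured, predicted):
    """Return MAPE in percent for paired measured and predicted values."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for m, p in zip(measured, predicted):
        if m == 0:
            raise ValueError("measured values must be non-zero")
        total += abs(m - p) / abs(m)
    return 100.0 * total / len(measured)


# Hypothetical measured and model-predicted power values, in milliwatts.
measured_mw = [200.0, 400.0, 500.0]
predicted_mw = [210.0, 380.0, 500.0]

mape = mean_absolute_percentage_error(measured_mw, predicted_mw)
print(f"MAPE: {mape:.2f}%")
```

Because each error is normalized by the measured value, a low MAPE indicates that the model tracks measurements well across both low-power and high-power operating points.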

References

  1. N. Aggarwal, J. F. Cantin, M. H. Lipasti, and J. E. Smith, “Power-Efficient DRAM Speculation,” in HPCA, 2008.Google ScholarGoogle ScholarCross RefCross Ref
  2. A. Agrawal, A. Ansari, and J. Torrellas, “Mosaic: Exploiting the Spatial Locality of Process Variation to Reduce Refresh Energy in On-Chip eDRAM Modules,” in HPCA, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  3. A. R. Alameldeen and D. A. Wood, “Adaptive Cache Compression for High-Performance Processors,” in ISCA, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A. M. Amin and Z. A. Chishti, “Rank-Aware Cache Replacement and Write Buffering to Improve DRAM Energy Efficiency,” in ISLPED, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. V. Anagnostopoulou, S. Biswas, H. Saadeldeen, A. Savage, R. Bianchini, T. Yang, D. Franklin, and F. T. Chong, “Barely Alive Memory Servers: Keeping Data Active in a Low-Power State,” ACM JETC, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. A. Bakhoda, G. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, “Analyzing CUDA Workloads Using a Detailed GPU Simulator,” in ISPASS, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  7. B. M. Beckmann and D. A. Wood, “TLC: Transmission Line Caches,” in MICRO, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. I. Bhati, Z. Chishti, and B. Jacob, “Coordinated Refresh: Energy Efficient Techniques for DRAM Refresh Scheduling,” in ISLPED, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. I. Bhati, Z. Chishti, S. Lu, and B. Jacob, “Flexible Auto-Refresh: Enabling Scalable and Energy-Efficient DRAM Refresh Reductions,” in ISCA , 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. Bi, R. Duan, and C. Gniady, “Delay-Hiding Energy Management Mechanisms for DRAM,” in HPCA, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  11. N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi, A. Basu, J. Hestness, D. R. Hower, T. Krishna, S. Sardashti, R. Sen, K. Sewell, M. Shoaib, N. Vaish, M. D. Hill, and D. A. Wood, “gem5: A Multiple-ISA Full System Simulator with Detailed Memory Model,” CAN, vol. 39, June 2011.Google ScholarGoogle Scholar
  12. M. N. Bojnordi and E. .Ipek, “DESC: Energy-Efficient Data Exchange Using Synchronized Counters,” in MICRO, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. D. Brooks, V. Tiwari, and M. Martonosi, “Wattch: A Framework for Architectural-Level Power Analysis and Optimizations,” in ISCA , 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Y. Cai, S. Ghose, Y. Luo, K. Mai, O. Mutlu, and E. F. Haratsch, “Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques,” in HPCA, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  15. Y. Cai, Y. Luo, S. Ghose, E. F. Haratsch, K. Mai, and O. Mutlu, “Read Disturb Errors in MLC NAND Flash Memory: Characterization and Mitigation,” in DSN, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Y. Cai, Y. Luo, E. F. Haratsch, K. Mai, and O. Mutlu, “Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery,” in HPCA, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  17. Y. Cai, G. Yalcin, O. Mutlu, E. F. Haratsch, A. Cristal, O. Unsal, and K. Mai, “Flash Correct and Refresh: Retention Aware Management for Increased Lifetime,” in ICCD, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Y. Cai, G. Yalcin, O. Mutlu, E. F. Haratsch, A. Cristal, O. Unsal, and K. Mai, “Error Analysis and Retention-Aware Error Management for NAND Flash Memory,” Intel Technol. J., May 2013.Google ScholarGoogle Scholar
  19. Y. Cai, S. Ghose, E. F. Haratsch, Y. Luo, and O. Mutlu, “Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-State Drives,” Proceedings of the IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  20. Y. Cai, S. Ghose, E. F. Haratsch, Y. Luo, and O. Mutlu, “Reliability Issues in Flash-Memory-Based Solid-State Drives: Experimental Analysis, Mitigation, Recovery,” in Inside Solid State Drives (SSDs), 2nd ed.hskip 1em plus 0.5em minus 0.4emrelax Springer Nature, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  21. Y. Cai, E. F. Haratsch, O. Mutlu, and K. Mai, “Error Patterns in MLC NAND Flash Memory: Measurement, Characterization, and Analysis,” in DATE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Y. Cai, E. F. Haratsch, O. Mutlu, and K. Mai, “Threshold Voltage Distribution in MLC NAND Flash Memory: Characterization, Analysis, and Modeling,” in DATE, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Y. Cai, O. Mutlu, E. F. Haratsch, and K. Mai, “Program Interference in MLC NAND Flash Memory: Characterization, Modeling, and Mitigation,” in ICCD, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  24. Y. Cai, G. Yalcin, O. Mutlu, E. F. Haratsch, O. Unsal, A. Cristal, and K. Mai, “Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories,” in SIGMETRICS, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. K. Chandrasekar, S. Goossens, C. Weis, M. Koedam, B. Akesson, N. Wehn, and K. Goossens, “Exploiting Expendable Process-Margins in DRAMs for Run-Time Performance Optimization,” in DATE, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. K. Chandrasekar, B. Akesson, and K. Goossens, “Improved Power Modelling of DDR SDRAMs,” in DSD, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. K. Chandrasekar, C. Weis, Y. Li, S. Goossens, M. Jung, O. Naji, B. Akesson, N. Wehn, and K. Goossens, “DRAMPower: Open-Source DRAM Power & Energy Estimation Tool,” http://www.drampower.info.Google ScholarGoogle Scholar
  28. K. K. Chang, “Understanding and Improving the Latency of DRAM-Based Memory Systems,” Ph.D. dissertation, Carnegie Mellon Univ., 2017.Google ScholarGoogle Scholar
  29. K. K. Chang, D. Lee, Z. Chishti, A. Alameldeen, C. Wilkerson, Y. Kim, and O. Mutlu, “Improving DRAM Performance by Parallelizing Refreshes with Accesses,” in HPCA, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  30. K. K. Chang, P. J. Nair, D. Lee, S. Ghose, M. K. Qureshi, and O. Mutlu, “Low-Cost Inter-Linked Subarrays (LISA): Enabling Fast Inter-Subarray Data Movement in DRAM,” in HPCA, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  31. K. K. Chang, A. G. A. G. Yauglikcci, S. Ghose, A. Agrawal, N. Chatterjee, A. Kashyap, D. Lee, M. O'Connor, H. Hassan, and O. Mutlu, “Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms,” in SIGMETRICS, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. K. K. Chang, A. Kashyap, H. Hassan, S. Ghose, K. Hsieh, D. Lee, T. Li, G. Pekhimenko, S. Khan, and O. Mutlu, “Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization,” in SIGMETRICS, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. N. Chatterjee, M. O'Connor, D. Lee, D. R. Johnson, M. Rhu, S. W. Kecker, and W. J. Dally, “Architecting an Energy-Efficient DRAM System for GPUs,” in HPCA, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  34. K. Chen, S. Li, N. Muralimanohar, J. H. Ahn, J. B. Brockman, and N. P. Jouppi, “CACTI-3DD: Architecture-Level Modeling for 3D Die-Stacked DRAM Main Memory,” in DATE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. X. Chen, L. Yang, R. Dick, L. Shang, and H. Lekatsas, “A High-Performance Microprocessor Cache Compression Algorithm,” TVLSI, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. J. Y. Choi, “LPDDR4: Evolution for New Mobile Worlds,” in MEMCON , 2013.Google ScholarGoogle Scholar
  37. E. Cooper-Balis and B. Jacob, “Fine-Grained Activation for Power Reduction in DRAM,” IEEE Micro, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. H. David, C. Fallin, E. Gorbatov, U. R. Hanebutte, and O. Mutlu, “Memory Power Management via Dynamic Voltage/Frequency Scaling,” in ICAC, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. V. De La Luz, M. Kandemir, and I. Kolcu, “Automatic Data Migration for Reducing Energy Consumption in Multi-Bank Memory Systems,” in DAC , 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. V. De La Luz, M. Kandemir, N. Vijaykrishnan, A. Sivasubramaniam, and M. J. Irwin, “DRAM Energy Management Using Software and Hardware Directed Power Mode Control,” in HPCA, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. V. De La Luz, A. Sivasubramaniam, M. Kandemir, N. Vijaykrishnan, and M. J. Irwin, “Scheduler Based DRAM Energy Management,” in DAC, 2002.Google ScholarGoogle Scholar
  42. Q. Deng, D. Meisner, L. Ramos, T. F. Wenisch, and R. Bianchini, “MemScale: Active Low-Power Modes for Main Memory,” in ASPLOS, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Q. Deng, D. Meisner, L. Ramos, T. F. Wenisch, and R. Bianchini, “Active Low-Power Modes for Main Memory with MemScale,” in MICRO, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. B. Diniz, D. Guedes, J. W. Meira, and R. Bianchini, “Limiting the Power Consumption of Main Memory,” in ISCA, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. X. Dong, C. Xu, Y. Xie, and N. P. Jouppi, “NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory,” TCAD, June 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. J. Dusser, T. Piquet, and A. Seznec, “Zero-Content Augmented Caches,” in ICS, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. M. Ekman and P. Stenströ m, “A Robust Main-Memory Compression Scheme,” in ISCA, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. R. Elmore, K. Gruchalla, C. Phillips, A. Purkayastha, and N. Wunder, “An Analysis of Application Power and Schedule Composition in a High Performance Computing Environment,” National Renewable Energy Laboratory, Tech Report NREL/TP-2C00--65392, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  49. X. Fan, C. S. Ellis, and A. R. Lebeck, “Memory Controller Policies for DRAM Power Management,” in ISLPED, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki, and B. Falsafi, “Clearing the Clouds: A Study of Emerging Scale-Out Workloads on Modern Hardware,” in ASPLOS, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. C. F. Gauss, Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium .hskip 1em plus 0.5em minus 0.4emrelax F. Perthes et I. H. Besser, 1809.Google ScholarGoogle Scholar
  52. H. Hassan, G. Pekhimenko, N. Vijaykumar, V. Seshadri, D. Lee, O. Ergin, and O. Mutlu, “ChargeCache: Reducing DRAM Latency by Exploiting Row Access Locality,” in HPCA, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  53. H. Hassan, N. Vijaykumar, S. Khan, S. Ghose, K. Chang, G. Pekhimenko, D. Lee, O. Ergin, and O. Mutlu, “SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies,” in HPCA , 2017.Google ScholarGoogle ScholarCross RefCross Ref
  54. Hewlett Packard Enterprise, “CACTI 7.0,” https://github.com/HewlettPackard/cacti.Google ScholarGoogle Scholar
  55. T. M. Hollis, “Data Bus Inversion in High-Speed Memory Applications,” TCAS II, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. U. Holzle and L. A. Barroso, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines .hskip 1em plus 0.5em minus 0.4emrelax Morgan & Claypool, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. S. Hong, “Memory Technology Trend and Future Challenges,” in IEDM, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  58. A. Hwang, I. Stefanovici, and B. Schroeder, “Cosmic Rays Don't Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design,” in ASPLOS, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. M. Inoue, T. Yamada, H. Kotani, H. Yamauchi, A. Fujiwara, J. Matsushima, H. Akamatsu, M. Fukumoto, M. Kubota, I. Nakao, N. Aoi, G. Fuse, S. Ogawa, S. Odanaka, A. Ueno, and H. Yamamoto, “A 16-Mbit DRAM with a Relaxed Sense-Amplifier-Pitch Open-Bit-Line Architecture,” JSSC, 1988.Google ScholarGoogle ScholarCross RefCross Ref
  60. JEDEC Solid State Technology Assn., JESD79--3F: DDR3 SDRAM Standard , 2012.Google ScholarGoogle Scholar
  61. JEDEC Solid State Technology Assn., JESD79--3--1A.01: Addendum No.1 to JESD79--3 - 1.35V DDR3L-800, DDR3L-1066, DDR3L-1333, DDR3L-1600, and DDR3L-1866, 2013.Google ScholarGoogle Scholar
  62. JEDEC Solid State Technology Assn., JESD21C, Module 4.20.18: 204-Pin DDR3 SDRAM Unbuffered SO-DIMM Design Specification, 2014.Google ScholarGoogle Scholar
  63. JEDEC Solid State Technology Assn., JESD209--3C: Low Power Double Data Rate 3 SDRAM (LPDDR3) Standard, 2015.Google ScholarGoogle Scholar
  64. JEDEC Solid State Technology Assn., JESD209--4B: Low Power Double Data Rate 4 (LPDDR4) Standard, 2017.Google ScholarGoogle Scholar
  65. M. Jung, D. M. Mathew, É. F. Zulian, C. Weis, and N. Wehn, “A New Bank Sensitive DRAMPower Model for Efficient Design Space Exploration,” in PATMOS, 2016.Google ScholarGoogle Scholar
  66. M. Jung, D. M. Mathew, C. C. Rheinl"a nder, C. Weis, and N. Wehn, “A Platform to Analyze DDR3 DRAM's Power and Retention Time,” IEEE Design and Test, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  67. M. Kandemir, O. Ozturk, and M. Karakoy, “Dynamic On-Chip Memory Management for Chip Multiprocessors,” in CASES, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. M. Kandemir, U. Sezer, and V. De La Luz, “Improving Memory Energy Using Access Pattern Classification,” in ICCAD, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  69. M. Kandemir, T. Yemliha, S. W. Son, and O. Ozturk, “Memory Bank Aware Dynamic Loop Scheduling,” in DATE, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. D. Kaseridis, J. Stuechelia, and L. K. John, “Minimalist Open-Page: A DRAM Page-Mode Scheduling Policy for the Many-Core Era,” in MICRO , 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. B. Keeth, R. J. Baker, B. Johnson, and F. Lin, DRAM Circuit Design: Fundamental and High-Speed Topics .hskip 1em plus 0.5em minus 0.4emrelax Wiley-IEEE Press, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. Keysight Technologies, Inc., 34134A AC/DC DMM Current Probe: User's Guide , https://literature.cdn.keysight.com/litweb/pdf/34134--90001.pdf, 2009.Google ScholarGoogle Scholar
  73. Keysight Technologies, Inc., Keysight Truevolt Series Digital Multimeters: Operating and Service Guide , https://literature.cdn.keysight.com/litweb/pdf/34460--90901.pdf, 2017.Google ScholarGoogle Scholar
  74. S. Khan, D. Lee, Y. Kim, A. R. Alameldeen, C. Wilkerson, and O. Mutlu, “The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study,” in SIGMETRICS, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. S. Khan, D. Lee, and O. Mutlu, “PARBOR: An Efficient System-Level Technique to Detect Data Dependent Failures in DRAM,” in DSN, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  76. S. Khan, C. Wilkerson, D. Lee, A. R. Alameldeen, and O. Mutlu, “A Case for Memory Content-Based Detection and Mitigation of Data-Dependent Failures in DRAM,” CAL, 2016.Google ScholarGoogle Scholar
  77. S. Khan, C. Wilkerson, Z. Wang, A. R. Alameldeen, D. Lee, and O. Mutlu, “Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content,” in MICRO, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. H. S. Kim, M. Kandemir, N. Vijaykrishnan, and M. J. Irwin, “Characterization of Memory Energy Behavior,” WWC, 2000.Google ScholarGoogle Scholar
  79. J. Kim, M. Patel, H. Hassan, and O. Mutlu, “The DRAM Latency PUF: Quickly Evaluating Physical Unclonable Functions by Exploiting the Latency--Reliability Tradeoff in Modern DRAM Devices,” in HPCA, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  80. Y. Kim, “Architectural Techniques to Enhance DRAM Scaling,” Ph.D. dissertation, Carnegie Mellon Univ., 2015.Google ScholarGoogle Scholar
  81. Y. Kim, M. Papamichael, O. Mutlu, and M. Harchol-Balter, “Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior,” in MICRO, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  82. Y. Kim, V. Seshadri, D. Lee, J. Liu, and O. Mutlu, “A Case for Exploiting Subarray Level Parallelism (SALP) in DRAM,” ISCA, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  83. Y. Kim, W. Yang, and O. Mutlu, “Ramulator: A Fast and Extensible DRAM Simulator,” CAL, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  84. Y. Kim, R. Daly, J. Kim, C. Fallin, J. H. Lee, D. Lee, C. Wilkerson, K. Lai, and O. Mutlu, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” in ISCA, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  85. A. R. Lebeck, X. Fan, H. Zeng, and C. Ellis, “Power Aware Page Allocation,” in ASPLOS, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  86. C. J. Lee, V. Narasiman, O. Mutlu, and Y. N. Patt, “Improving Memory Bank-Level Parallelism in the Presence of Prefetching,” in MICRO, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. D. Lee, “Reducing DRAM Energy at Low Cost by Exploiting Heterogeneity,” Ph.D. dissertation, Carnegie Mellon Univ., 2016.Google ScholarGoogle Scholar
  88. D. Lee, S. Ghose, G. Pekhimenko, S. Khan, and O. Mutlu, “Simultaneous Multi-Layer Access: Improving 3D-Stacked Memory Bandwidth at Low Cost,” ACM TACO, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  89. D. Lee, S. Khan, L. Subramanian, S. Ghose, R. Ausavarungnirun, G. Pekhimenko, V. Seshadri, and O. Mutlu, “Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms,” in SIGMETRICS, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  90. D. Lee, Y. Kim, V. Seshadri, J. Liu, L. Subramanian, and O. Mutlu, “Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture,” in HPCA, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  91. D. Lee, L. Subramanian, R. Ausavarungnirun, J. Choi, and O. Mutlu, “Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM,” in PACT, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  92. D. Lee, Y. Kim, G. Pekhimenko, S. Khan, V. Seshadri, K. Chang, and O. Mutlu, “Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case,” in HPCA, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  93. C. Lefurgy, K. Rajamani, F. Rawson, W. Felter, M. Kistler, and T. Keller, “Energy Management for Commercial Servers,” Computer, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  94. A.-M. Legendre, Nouvelles Méthodes pour la Détermination des Orbites des Comètes .hskip 1em plus 0.5em minus 0.4emrelax F. Didot, 1805.Google ScholarGoogle Scholar
  95. S. Li, J. H. Ahn, R. D. Strong, J. B. Brockman, D. M. Tullsen, and N. P. Jouppi., “McPAT: An Integrated Power, Area and Timing Modeling Framework for Multicore and Manycore Architectures.” in MICRO, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  96. C. H. Lin, D. Y. Shen, Y. J. Chen, C. L. Yang, and M. Wang, “SECRET: Selective Error Correction for Refresh Energy Reduction in DRAMs,” in ICCD, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. J. Liu, B. Jaiyen, Y. Kim, C. Wilkerson, and O. Mutlu, “An Experimental Study of Data Retention Behavior in Modern DRAM Devices: Implications for Retention Time Profiling Mechanisms,” in ISCA, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  98. J. Liu, B. Jaiyen, R. Veras, and O. Mutlu, “RAIDR: Retention-Aware Intelligent DRAM Refresh,” in ISCA, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  99. S. Liu, K. Pattabiraman, T. Moscibroda, and B. G. Zorn, “Flikker: Saving DRAM Refresh-Power Through Critical Data Partitioning,” in ASPLOS, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood, “Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation,” in PLDI, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  101. Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, and O. Mutlu, “HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness,” in HPCA, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  102. Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, and O. Mutlu, “Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation,” in SIGMETRICS, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  103. C. Lyuh and T. Kim, “Memory Access Scheduling and Binding Considering Energy Minimization in Multi-Bank Memory Systems,” in DAC, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  104. K. T. Malladi, F. A. Nothaft, K. Periyathambi, B. C. Lee, C. Kozyrakis, and M. Horowitz, “Towards Energy-Proportional Datacenter Memory with Mobile DRAM,” in ISCA, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  105. K. T. Malladi, I. Shaeffer, L. Gopalakrishnan, D. Lo, B. C. Lee, and M. Horowitz, “Rethinking DRAM Power Modes for Energy Proportionality,” in MICRO, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  106. D. M. Mathew, M. Schultheis, C. C. Rheinl"ander, C. Sudarshan, C. Weis, N. Wehn, and M. Jung, “An Analysis on Retention Error Behavior and Power Consumption of Recent DDR4 DRAMs,” in DATE, 2018.Google ScholarGoogle Scholar
  107. D. M. Mathew, Éder F. Zulian, S. Kannoth, M. Jung, C. Weis, and N. Wehn, “A Bank-Wise DRAM Power Model for System Simulations,” RAPIDO , 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  108. J. Meza, Q. Wu, S. Kumar, and O. Mutlu, “Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field,” in DSN, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  109. MFactors, “JET-5467A Product Page,” http://www.mfactors.com/jet-5467a-ddr3-sodimm-extender-with-current-sensing/.Google ScholarGoogle Scholar
  110. Micron Technology, Inc., “DDR3 Point-to-Point Design Support,” Technical Note TN-41--13, 2013.Google ScholarGoogle Scholar
  111. Micron Technology, Inc., “Calculating Memory System Power for DDR3,” Technical Note TN-41-01, 2015.Google ScholarGoogle Scholar
  112. Micron Technology, Inc., “DDR4 Point-to-Point Design Guide,” Technical Note TN-40--40, 2018.Google ScholarGoogle Scholar
  113. J. Mukundan, H. Hunter, K. H. Kim, J. Stuecheli, and J. F. Martinez, “Understanding and Mitigating Refresh Overheads in High-Density DDR4 DRAM Systems,” in ISCA, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  114. K. P. Muller, B. Flietner, C. L. Hwang, R. L. Kleinhenz, T. Nakao, R. Ranade, Y. Tsunashima, and T. Mii, “Trench Storage Node Technology for Gigabit DRAM Generations,” in IEDM, 1996.Google ScholarGoogle ScholarCross RefCross Ref
  115. O. Mutlu, “The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser,” in DATE, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. O. Mutlu and T. Moscibroda, “Parallelism-Aware Batch Scheduling: Enhancing Both Performance and Fairness of Shared DRAM Systems,” in ISCA, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  117. O. Mutlu, “Memory Scaling: A Systems Architecture Perspective,” in IMW, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  118. T. Ohsawa, K. Kai, and K. Murakami, “Optimizing the DRAM Refresh Count for Merged DRAM/Logic LSIs,” in ISLPED, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  119. O. Ozturk and M. Kandemir, “Data Replication in Banked DRAMs for Reducing Energy Consumption,” in ISQED, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  120. A. Patel, F. Afram, S. Chen, and K. Ghose, “MARSSx86: A Full System Simulator for x86 CPUs,” in DAC, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  121. M. Patel, J. Kim, and O. Mutlu, “The Reach Profiler (REAPER): Enabling the Mitigation of DRAM Retention Failures via Profiling at Aggressive Conditions,” in ISCA, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  122. I. Paul, W. Huang, M. Arora, and S. Yalamanchili, “Harmonia: Balancing Compute and Memory Power in High-Performance GPUs,” in ISCA, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  123. K. Pearson, “Notes on Regression and Inheritance in the Case of Two Parents,” Proc. Royal Soc. London, 1895.Google ScholarGoogle Scholar
  124. G. Pekhimenko, E. Bolotin, N. Vijaykumar, O. Mutlu, T. C. Mowry, and S. W. Keckler, “A Case for Toggle-Aware Compression for GPU Systems,” in HPCA, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  125. G. Pekhimenko, T. Huberty, R. Cai, O. Mutlu, P. B. Gibbons, M. A. Kozuch, and T. C. Mowry, “Exploiting Compressed Block Size as an Indicator of Future Reuse,” in HPCA, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  126. G. Pekhimenko, V. Seshadri, Y. Kim, H. Xin, O. Mutlu, M. A. Kozuch, P. B. Gibbons, and T. C. Mowry, “Linearly Compressed Pages: A Low-Complexity, Low-Latency Main Memory Compression Framework,” in MICRO, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. G. Pekhimenko, V. Seshadri, O. Mutlu, M. A. Kozuch, P. B. Gibbons, and T. C. Mowry, “Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches,” in PACT, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  128. M. K. Qureshi, D. H. Kim, S. Khan, P. J. Nair, and O. Mutlu, “AVATAR: A Variable-Retention-Time (VRT) Aware Refresh for DRAM Systems,” in DSN, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  129. Rambus, Inc., “RDRAM Memory Architecture,” https://www.rambus.com/memory-and-interfaces/rdram-memory-architecture/.Google ScholarGoogle Scholar
  130. S. Rixner, W. J. Dally, U. J. Kapasi, P. Mattson, and J. D. Owens, “Memory Access Scheduling,” in ISCA, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  131. P. Rosenfeld, E. Cooper-Balis, and B. Jacob, “DRAMSim2: A Cycle Accurate Memory System Simulator,” CAL, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  132. SAFARI Research Group, “Characterization Results of Modern DRAM Devices Under Reduced-Voltage Operation -- GitHub Repository,” https://github.com/Carnegie Mellon University-SAFARI/DRAM-Voltage-Study.Google ScholarGoogle Scholar
  133. SAFARI Research Group, “Ramulator: A DRAM Simulator -- GitHub Repository,” https://github.com/Carnegie Mellon University-SAFARI/ramulator.Google ScholarGoogle Scholar
  134. SAFARI Research Group, “SoftMC -- GitHub Repository,” https://github.com/Carnegie Mellon University-SAFARI/SoftMC.Google ScholarGoogle Scholar
  135. SAFARI Research Group, “VAMPIRE -- GitHub Repository,” https://github.com/Carnegie Mellon University-SAFARI/VAMPIRE.Google ScholarGoogle Scholar
  136. B. Schroeder, E. Pinheiro, and W. Webe, “DRAM Errors in the Wild: A Large-Scale Field Study,” in SIGMETRICS, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  137. V. Seshadri, D. Lee, T. Mullins, H. Hassan, A. Boroumand, J. Kim, M. A. Kozuch, O. Mutlu, P. B. Gibbons, and T. C. Mowry, “Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology,” in MICRO, 2017.
  138. V. Seshadri, “Simple DRAM and Virtual Memory Abstractions to Enable Highly Efficient Memory Systems,” Ph.D. dissertation, Carnegie Mellon University, 2016.
  139. V. Seshadri, Y. Kim, C. Fallin, D. Lee, R. Ausavarungnirun, G. Pekhimenko, Y. Luo, O. Mutlu, P. B. Gibbons, M. A. Kozuch, and T. C. Mowry, “RowClone: Fast and Energy-Efficient In-DRAM Bulk Data Copy and Initialization,” in MICRO, 2013.
  140. V. Seshadri and O. Mutlu, “Simple Operations in Memory to Reduce Data Movement,” in Advances in Computers, Volume 106, 2017.
  141. W. Shin, J. Yang, J. Choi, and L.-S. Kim, “NUAT: A Non-Uniform Access Time Memory Controller,” in HPCA, 2014.
  142. V. Sridharan, N. DeBardeleben, S. Blanchard, K. B. Ferreira, J. Stearley, J. Shalf, and S. Gurumurthi, “Memory Errors in Modern Systems: The Good, the Bad, and the Ugly,” in ASPLOS, 2015.
  143. V. Sridharan and D. Liberty, “A Study of DRAM Failures in the Field,” in SC, 2012.
  144. M. R. Stan and W. P. Burleson, “Bus-Invert Coding for Low-Power I/O,” TVLSI, 1995.
  145. M. R. Stan and W. P. Burleson, “Coding a Terminated Bus for Low Power,” in GLSVLSI, 1995.
  146. Standard Performance Evaluation Corp., “SPEC CPU2006 Benchmarks,” http://www.spec.org/cpu2006.
  147. J. Stuecheli, D. Kaseridis, D. Daly, H. C. Hunter, and L. K. John, “The Virtual Write Queue: Coordinating DRAM and Last-Level Cache Policies,” in ISCA, 2010.
  148. J. Stuecheli, D. Kaseridis, D. Daly, H. C. Hunter, and L. K. John, “Coordinating DRAM and Last-Level-Cache Policies with the Virtual Write Queue,” IEEE Micro, 2011.
  149. K. Sudan, N. Chatterjee, D. Nellans, M. Awasthi, R. Balasubramonian, and A. Davis, “Micro-Pages: Increasing DRAM Efficiency with Locality-Aware Data Placement,” in ASPLOS, 2010.
  150. A. N. Udipi, N. Muralimanohar, and R. Balasubramonian, “Non-Uniform Power Access in Large Caches with Low-Swing Wires,” in HiPC, 2009.
  151. A. N. Udipi, N. Muralimanohar, N. Chatterjee, R. Balasubramonian, A. Davis, and N. P. Jouppi, “Rethinking DRAM Design and Organization for Energy-Constrained Multi-Cores,” in ISCA, 2010.
  152. R. Venkatesan, S. Herr, and E. Rotenberg, “Retention-Aware Placement in DRAM (RAPID): Software Methods for Quasi-Non-Volatile DRAM,” in HPCA, 2006.
  153. N. Vijaykumar, G. Pekhimenko, A. Jog, A. Bhowmick, R. Ausavarungnirun, C. Das, M. T. Kandemir, T. C. Mowry, and O. Mutlu, “A Case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling Flexible Data Compression with Assist Warps,” in ISCA, 2015.
  154. T. Vogelsang, “Understanding the Energy Consumption of Dynamic Random Access Memories,” in MICRO, 2010.
  155. L. Wang, J. Zhan, C. Luo, Y. Zhu, Q. Yang, Y. He, W. Gao, Z. Jia, Y. Shi, S. Zhang, C. Zheng, G. Lu, K. Zhan, X. Li, and B. Qiu, “BigDataBench: A Big Data Benchmark Suite From Internet Services,” in HPCA, 2014.
  156. F. A. Ware and C. Hampel, “Improving Power and Data Efficiency with Threaded Memory Modules,” in ICCD, 2006.
  157. M. Ware, K. Rajamani, M. Floyd, B. Brock, J. C. Rubio, F. Rawson, and J. B. Carter, “Architecting for Power Management: The IBM POWER7 Approach,” in HPCA, 2010.
  158. P. R. Wilson, S. F. Kaplan, and Y. Smaragdakis, “The Case for Compressed Caching in Virtual Memory Systems,” in USENIX ATC, 1999.
  159. Xilinx, Inc., “Virtex-6 FPGA Family,” https://www.xilinx.com/products/silicon-devices/fpga/virtex-6.html.
  160. Xilinx, Inc., “ML605 Hardware User Guide,” https://www.xilinx.com/support/documentation/boards_and_kits/ug534.pdf, 2012.
  161. Xilinx, Inc., “MIG 7 Series and Virtex-6 DDR2/DDR3 Solution Center - Design Assistant - Memory Controller Efficiency and Possible Improvements,” https://www.xilinx.com/support/answers/36719.html, 2017.
  162. J. Yang, Y. Zhang, and R. Gupta, “Frequent Value Compression in Data Caches,” in MICRO, 2000.
  163. D. H. Yoon, J. Chang, N. Muralimanohar, and P. Ranganathan, “BOOM: Enabling Mobile Memory Based Low-Power Server DIMMs,” in ISCA, 2012.
  164. D. H. Yoon, M. K. Jeong, and M. Erez, “Adaptive Granularity Memory Systems: A Tradeoff Between Storage Efficiency and Throughput,” in ISCA, 2011.
  165. T. Zhang, K. Chen, C. Xu, G. Sun, T. Wang, and Y. Xie, “Half-DRAM: A High-Bandwidth and Low-Power DRAM Architecture from the Rethinking of Fine-Grained Activation,” in ISCA, 2014.
  166. H. Zheng, J. Lin, Z. Zhang, E. Gorbatov, H. David, and Z. Zhu, “Mini-Rank: Adaptive DRAM Architecture for Improving Memory Power Efficiency,” in MICRO, 2008.
  167. W. Zuravleff and T. Robinson, “Controller for a Synchronous DRAM That Maximizes Throughput by Allowing Memory Requests and Commands to Be Issued Out of Order,” U.S. Patent No. 5,630,096, 1997.
