Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms

Published: 13 June 2017
Abstract

The energy consumption of DRAM is a critical concern in modern computing systems. Improvements in manufacturing process technology have allowed DRAM vendors to lower the DRAM supply voltage conservatively, which reduces some of the DRAM energy consumption. We would like to reduce the DRAM supply voltage more aggressively, to further reduce energy. Aggressive supply voltage reduction requires a thorough understanding of the effect voltage scaling has on DRAM access latency and DRAM reliability.

In this paper, we take a comprehensive approach to understanding and exploiting the latency and reliability characteristics of modern DRAM when the supply voltage is lowered below the nominal voltage level specified by DRAM standards. Using an FPGA-based testing platform, we perform an experimental study of 124 real DDR3L (low-voltage) DRAM chips manufactured recently by three major DRAM vendors. We find that reducing the supply voltage below a certain point introduces bit errors in the data, and we comprehensively characterize the behavior of these errors. We discover that these errors can be avoided by increasing the latency of three major DRAM operations (activation, restoration, and precharge). We perform detailed DRAM circuit simulations to validate and explain our experimental findings. We also characterize the various relationships between reduced supply voltage and error locations, stored data patterns, DRAM temperature, and data retention.

Based on our observations, we propose a new DRAM energy reduction mechanism, called Voltron. The key idea of Voltron is to use a performance model to determine by how much we can reduce the supply voltage without introducing errors and without exceeding a user-specified threshold for performance loss. Our evaluations show that Voltron reduces the average DRAM and system energy consumption by 10.5% and 7.3%, respectively, while limiting the average system performance loss to only 1.8%, for a variety of memory-intensive quad-core workloads. We also show that Voltron significantly outperforms prior dynamic voltage and frequency scaling mechanisms for DRAM.
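The selection step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function `select_voltage`, the linear `toy_loss_model`, its slope, and the candidate voltage list are all hypothetical, chosen only to show the idea of picking the lowest supply voltage whose predicted performance loss stays within a user-specified threshold. The sketch also assumes each candidate voltage is already paired with access latencies long enough to avoid bit errors.

```python
from typing import Callable, Optional, Sequence


def select_voltage(
    candidate_voltages: Sequence[float],
    predict_loss_pct: Callable[[float], float],
    max_loss_pct: float,
) -> Optional[float]:
    """Return the lowest candidate voltage whose predicted performance loss
    is at most max_loss_pct, or None if no candidate qualifies."""
    feasible = [v for v in candidate_voltages if predict_loss_pct(v) <= max_loss_pct]
    return min(feasible) if feasible else None


# Assumed nominal supply voltage for DDR3L, in volts.
NOMINAL_V = 1.35


def toy_loss_model(voltage: float) -> float:
    """Hypothetical performance model: predicted loss (in percent) grows
    linearly as the voltage drops below nominal. The slope is made up."""
    return max(0.0, (NOMINAL_V - voltage) * 20.0)


# Illustrative candidate voltages; real candidates would come from
# characterization of the specific DRAM devices.
candidates = [1.35, 1.30, 1.25, 1.20, 1.15]
chosen = select_voltage(candidates, toy_loss_model, max_loss_pct=2.5)
print(chosen)
```

In a real system the toy model would be replaced by a model fitted to measured workload behavior, and the threshold would be supplied by the user.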

