Dynamically replicated memory: building reliable systems from nanoscale resistive memories

Published: 13 March 2010

Abstract

DRAM is facing severe scalability challenges in sub-45nm technology nodes due to precise charge placement and sensing hurdles in deep-submicron geometries. Resistive memories, such as phase-change memory (PCM), already scale well beyond DRAM and are a promising DRAM replacement. Unfortunately, PCM is write-limited, and current approaches to managing writes must decommission pages of PCM when the first bit fails.

This paper presents dynamically replicated memory (DRM), the first hardware and operating system interface designed for PCM that allows continued operation through graceful degradation when hard faults occur. DRM reuses memory pages that contain hard faults by dynamically forming pairs of complementary pages that act as a single page of storage. No changes are required to the processor cores, the cache hierarchy, or the operating system's page tables. By changing the memory controller, the TLBs, and the operating system to be DRM-aware, we can improve the lifetime of PCM by up to 40x over conventional error-detection techniques.
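The pairing idea can be illustrated with a minimal sketch. It is not the paper's mechanism: the abstract does not state the fault-tracking granularity or the pairing algorithm, so the sketch assumes each page's hard faults are known as a set of byte offsets and uses a simple greedy first-fit pairing. All names (`compatible`, `greedy_pair`, `readable_replica`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def compatible(faults_a: Set[int], faults_b: Set[int]) -> bool:
    """Assumed rule: two pages are complementary if no offset is faulty in both."""
    return faults_a.isdisjoint(faults_b)


def greedy_pair(
    pages: Dict[int, Set[int]]
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Greedy first-fit pairing of faulty pages (an illustrative choice).

    `pages` maps a page id to the set of faulty byte offsets in that page.
    Returns (pairs, ids of pages left unpaired).
    """
    unpaired: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for page_id in sorted(pages):
        for i, candidate in enumerate(unpaired):
            if compatible(pages[page_id], pages[candidate]):
                pairs.append((candidate, page_id))
                del unpaired[i]  # safe: we leave the loop immediately
                break
        else:
            # No complementary partner found yet; keep the page waiting.
            unpaired.append(page_id)
    return pairs, unpaired


def readable_replica(
    pair: Tuple[int, int], offset: int, pages: Dict[int, Set[int]]
) -> int:
    """Return the id of the page in `pair` whose byte at `offset` is fault-free."""
    primary, backup = pair
    return primary if offset not in pages[primary] else backup


# Hypothetical fault maps for four pages.
pages = {0: {3, 17}, 1: {3}, 2: {40}, 3: {17, 40}}
pairs, leftover = greedy_pair(pages)
```

In this example, pages 0 and 1 both have a fault at offset 3 and so cannot be paired with each other, while pages 0 and 2 have disjoint fault sets and together can serve every offset of one logical page. A greedy pass is only one option; finding the largest possible set of pairs is a maximum-matching problem on the compatibility graph.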

Published in

ACM SIGARCH Computer Architecture News, Volume 38, Issue 1 (ASPLOS '10)
March 2010, 399 pages
ISSN: 0163-5964
DOI: 10.1145/1735970

ASPLOS XV: Proceedings of the fifteenth International Conference on Architectural support for programming languages and operating systems
March 2010, 422 pages
ISBN: 9781605588391
DOI: 10.1145/1736020
General Chair: James C. Hoe
Program Chair: Vikram S. Adve

Copyright © 2010 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
