research-article

Differential RAID: Rethinking RAID for SSD reliability

Published: 30 July 2010

Abstract

SSDs exhibit very different failure characteristics compared to hard drives. In particular, the bit error rate (BER) of an SSD climbs as it receives more writes. As a result, RAID arrays composed from SSDs are subject to correlated failures. By balancing writes evenly across the array, RAID schemes can wear out devices at similar times. When a device in the array fails towards the end of its lifetime, the high BER of the remaining devices can result in data loss. We propose Diff-RAID, a parity-based redundancy solution that creates an age differential in an array of SSDs. Diff-RAID distributes parity blocks unevenly across the array, leveraging their higher update rate to age devices at different rates. To maintain this age differential when old devices are replaced by new ones, Diff-RAID reshuffles the parity distribution on each drive replacement. We evaluate Diff-RAID's reliability by using real BER data from 12 flash chips on a simulator and show that it is more reliable than RAID-5, in some cases by multiple orders of magnitude. We also evaluate Diff-RAID's performance using a software implementation on a 5-device array of 80 GB Intel X25-M SSDs and show that it offers a trade-off between throughput and reliability.
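The uneven aging described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a workload of small random writes in which each logical write updates exactly one data block and the parity block of its stripe, with data writes spread uniformly over the non-parity devices. The function names, the example parity shares, and the reshuffle policy (older devices receive larger parity shares after a replacement) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def write_rates(parity_fractions: List[float]) -> List[float]:
    """Expected physical writes per device for one logical write.

    Assumed model (not taken from the abstract): each logical write
    touches one data block and one parity block. Device i holds the
    parity for a fraction p_i of stripes, so it receives a parity write
    with probability p_i and a data write with probability
    (1 - p_i) / (n - 1).
    """
    n = len(parity_fractions)
    if n < 2:
        raise ValueError("need at least two devices")
    if abs(sum(parity_fractions) - 1.0) > 1e-9:
        raise ValueError("parity fractions must sum to 1")
    return [p + (1.0 - p) / (n - 1) for p in parity_fractions]


def reshuffle_after_replacement(
    parity_fractions: List[float], ages: List[float], replaced: int
) -> Tuple[List[float], List[float]]:
    """Hypothetical reshuffle policy, for illustration only.

    The replaced device restarts at age 0, and the existing parity
    shares are reassigned so that older devices hold larger shares,
    which preserves an age differential across the array.
    """
    new_ages = list(ages)
    new_ages[replaced] = 0.0
    shares = sorted(parity_fractions, reverse=True)
    # Device indices ordered from oldest to youngest.
    oldest_first = sorted(
        range(len(new_ages)), key=lambda i: new_ages[i], reverse=True
    )
    result = [0.0] * len(shares)
    for share, device in zip(shares, oldest_first):
        result[device] = share
    return result, new_ages


if __name__ == "__main__":
    uniform = write_rates([0.2] * 5)                      # RAID-5 style layout
    skewed = write_rates([0.8, 0.05, 0.05, 0.05, 0.05])   # uneven parity layout
    print("uniform:", uniform)
    print("skewed: ", skewed)
    print(reshuffle_after_replacement(
        [0.8, 0.05, 0.05, 0.05, 0.05], [0.9, 0.5, 0.4, 0.3, 0.2], replaced=0
    ))
```

Under this model, a uniform parity layout gives every device the same write rate, so all devices age together. With the skewed layout, the parity-heavy device absorbs about 0.85 physical writes per logical write against roughly 0.29 for each of the others, so it reaches end of life first while the remaining devices are still comparatively young.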



Published in

ACM Transactions on Storage, Volume 6, Issue 2
July 2010
89 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/1807060

Copyright © 2010 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 30 July 2010
• Received: 1 May 2010
• Accepted: 1 May 2010


Qualifiers

• research-article
• Research
• Refereed
