Sector-Disk (SD) Erasure Codes for Mixed Failure Modes in RAID Systems

Published: 01 January 2014

Abstract

Traditionally, when storage systems employ erasure codes, the codes are designed to tolerate the failures of entire disks. However, the most common types of failures are latent sector failures, which affect only individual disk sectors, and block failures, which arise through wear on SSDs. This article introduces SD codes, which are designed to tolerate combinations of disk and sector failures. As a result, they consume far fewer storage resources than traditional erasure codes. We specify the codes in enough detail for the storage practitioner to employ them, discuss their practical properties, and detail an open-source implementation.



Published in

ACM Transactions on Storage, Volume 10, Issue 1
January 2014
94 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/2578042
Editor: Darrell Long

Copyright © 2014 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 1 January 2014
• Accepted: 1 June 2013
• Received: 1 May 2013


Qualifiers

• research-article
• Research
• Refereed
