Abstract
Traditionally, when storage systems employ erasure codes, they are designed to tolerate the failures of entire disks. However, the most common types of failures are latent sector failures, which affect only individual disk sectors, and block failures, which arise through wear on SSDs. This article introduces SD codes, which are designed to tolerate combinations of disk and sector failures. As such, they consume far fewer storage resources than traditional erasure codes. We specify the codes in enough detail for the storage practitioner to employ them, discuss their practical properties, and detail an open-source implementation.
Index Terms
Sector-Disk (SD) Erasure Codes for Mixed Failure Modes in RAID Systems