Abstract
Current parallel storage systems use thousands of inexpensive disks to meet the storage requirements of applications. Data redundancy and/or coding are used to enhance data availability. For instance, Row-Diagonal Parity (RDP) and EVENODD codes, which are widely used in RAID-6 storage systems, keep data available with up to two disk failures. To reduce the probability of data unavailability, disk recovery is carried out whenever a single disk fails. We find that the conventional recovery schemes of RDP and EVENODD codes for a single failed disk use only one parity disk, even though the system has two parity disks and both can be used for single disk failure recovery. In this article, we propose a hybrid recovery approach that uses both parities for single disk failure recovery, and we design efficient recovery schemes for RDP code (RDOR-RDP) and EVENODD code (RDOR-EVENODD). Our recovery schemes have the following attractive properties: (1) "read optimality", in the sense that they issue the smallest number of disk reads needed to recover a single failed disk, which reduces disk reads by approximately 1/4 compared with conventional schemes; (2) "load balancing", in that all surviving disks are subjected to the same (or almost the same) amount of additional workload in rebuilding the failed disk.
We carry out a performance evaluation with DiskSim to quantify the merits of RDOR-RDP and RDOR-EVENODD on some widely used disks. The offline experimental results show that RDOR-RDP and RDOR-EVENODD outperform the conventional recovery schemes of RDP and EVENODD codes in terms of total recovery time and recovery workload on each surviving disk. However, the improvements are smaller than the theoretical value (approximately 25%), because RDOR-RDP and RDOR-EVENODD change the disk access pattern from the purely sequential one of the conventional schemes to a more random one.
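The read saving described above can be checked on a small RDP stripe by brute force. The sketch below is not the RDOR-RDP algorithm from the article. It assumes the standard RDP layout (a prime p, p-1 rows, disks 0 to p-2 holding data, disk p-1 holding row parity, disk p holding diagonal parity, symbol (i, j) lying on diagonal (i + j) mod p, and diagonal p-1 having no stored parity), and the function names `rdp_read_sets` and `recovery_reads` are hypothetical. For each lost symbol it tries recovery through either its row parity set or its diagonal parity set and counts the distinct symbols that must be read.

```python
from itertools import product


def rdp_read_sets(p, k):
    """For each lost symbol (i, k), list the symbols read by each recovery option.

    Assumed layout: p is prime, rows 0..p-2, disks 0..p-1 are covered by row
    parity, disk p stores diagonal parity. Returns (row_set, diag_set) per row,
    where diag_set is None if the symbol lies on the unstored diagonal p-1.
    """
    rows = p - 1
    options = []
    for i in range(rows):
        # Row option: read every other symbol of row i on disks 0..p-1.
        row_set = frozenset((i, j) for j in range(p) if j != k)
        d = (i + k) % p
        if d == p - 1:
            diag_set = None  # no parity is stored for the missing diagonal
        else:
            # Diagonal option: other symbols on diagonal d, plus its parity.
            members = {((d - j) % p, j) for j in range(p) if j != k}
            members = {(r, j) for (r, j) in members if r < rows}
            members.add((d, p))  # diagonal parity symbol sits in row d of disk p
            diag_set = frozenset(members)
        options.append((row_set, diag_set))
    return options


def recovery_reads(p, k):
    """Return (conventional_reads, minimal_hybrid_reads) for failed disk k < p."""
    options = rdp_read_sets(p, k)
    conventional = len(frozenset().union(*(row for row, _ in options)))
    best = conventional
    # Exhaustive search: 0 = use row parity, 1 = use diagonal parity.
    for choice in product((0, 1), repeat=len(options)):
        if any(c == 1 and options[i][1] is None for i, c in enumerate(choice)):
            continue
        reads = frozenset().union(*(options[i][c] for i, c in enumerate(choice)))
        best = min(best, len(reads))
    return conventional, best


if __name__ == "__main__":
    for p in (5, 7):
        print(p, recovery_reads(p, 0))
```

The exhaustive search is exponential in p-1, so it is only meant for small stripes. It counts reads only and does not model the load balancing property or the change in disk access pattern that the evaluation discusses.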
Index Terms
A Hybrid Approach to Failed Disk Recovery Using RAID-6 Codes: Algorithms and Performance Evaluation