Abstract
A recent article on the reliability of RAID-6 storage systems overlooks certain relevant prior work published in the past 20 years and concludes that the widely used mean time to data loss (MTTDL) metric does not provide accurate results. In this note, we refute this position by invoking uncited relevant prior work and demonstrating that the MTTDL remains a useful metric.
Rebuttal to “Beyond MTTDL: A Closed-Form RAID-6 Reliability Equation”