
LoneStar RAID: Massive Array of Offline Disks for Archival Systems

Published: 7 January 2016

Abstract

The need for huge storage archives rises with the ever-growing creation of data. With today’s big data and data analytics applications, some of these archives have become active, in the sense that all stored data can be accessed at any time. Running and evolving these archives is a constant tradeoff between performance, capacity, and price. We present the LoneStar RAID, a disk-based storage architecture that focuses on high reliability, low energy consumption, and cheap reads. It is designed for MAID systems with up to hundreds of disk drives per server and is optimized for “write once, read sometimes” workloads. We use dedicated data and parity disks and export the data disks as individually accessible buckets. By intertwining disk groups into a two-dimensional RAID and improving single-disk reliability with intradisk redundancy, the system achieves an elastic fault tolerance that can recover from at least all 3-disk failures. Furthermore, we integrate a cache to offload parity updates and a journal to track the RAID’s state. The LoneStar RAID scheme provides a mean time to data loss (MTTDL) that competes with today’s erasure codes and is optimized to require only a minimal set of running disk drives.
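To make the idea of dedicated parity disks arranged in two dimensions more concrete, the following is a minimal sketch of a generic row-and-column XOR parity layout with iterative recovery. It is not the LoneStar RAID layout itself: the grid shape, the function names (`xor_blocks`, `encode`, `recover`), and the use of one in-memory block per disk are all assumptions made for illustration, and intradisk redundancy, the parity cache, and the journal are left out.

```python
from functools import reduce
from typing import List, Optional

Block = Optional[bytes]  # None marks the block of a failed disk (illustrative model)


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equally sized byte blocks position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data: List[List[bytes]]):
    """Compute one parity block per row and one per column of the data grid."""
    row_parity = [xor_blocks(row) for row in data]
    col_parity = [xor_blocks(list(col)) for col in zip(*data)]
    return row_parity, col_parity


def recover(data: List[List[Block]],
            row_parity: List[Block],
            col_parity: List[Block]) -> bool:
    """Rebuild missing blocks in place; return True if nothing is left missing.

    A parity group (a row or a column plus its parity block) can be repaired
    whenever exactly one of its members is missing. Repairs are repeated until
    no group makes further progress.
    """
    rows, cols = len(data), len(data[0])
    progress = True
    while progress:
        progress = False
        for r in range(rows):
            group = data[r] + [row_parity[r]]
            if group.count(None) == 1:
                i = group.index(None)
                rebuilt = xor_blocks([b for b in group if b is not None])
                if i < cols:
                    data[r][i] = rebuilt
                else:
                    row_parity[r] = rebuilt
                progress = True
        for c in range(cols):
            group = [data[r][c] for r in range(rows)] + [col_parity[c]]
            if group.count(None) == 1:
                i = group.index(None)
                rebuilt = xor_blocks([b for b in group if b is not None])
                if i < rows:
                    data[i][c] = rebuilt
                else:
                    col_parity[c] = rebuilt
                progress = True
    everything = [b for row in data for b in row] + row_parity + col_parity
    return None not in everything
```

In this sketch, losing three data blocks such as (0,0), (0,1), and (1,0) in a 3×3 grid is repaired by alternating between row and column groups. Losing a data block together with both of its parity blocks is not repaired, so this plain layout does not by itself reproduce the 3-disk guarantee. The abstract attributes that guarantee to intertwined disk groups combined with intradisk redundancy.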

Published in

ACM Transactions on Storage, Volume 12, Issue 1
Special Issue on Massive Storage Systems and Technologies (MSST 2015)
February 2016
108 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/2875132

Copyright © 2016 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 7 January 2016
• Accepted: 1 October 2015
• Revised: 1 March 2015
• Received: 1 December 2013

Qualifiers

• research-article
• Research
• Refereed
