Abstract
The need for huge storage archives rises with the ever-growing creation of data. With today’s big data and data analytics applications, some of these archives become active, in the sense that all stored data can be accessed at any time. Running and evolving these archives is a constant tradeoff between performance, capacity, and price. We present the LoneStar RAID, a disk-based storage architecture that focuses on high reliability, low energy consumption, and cheap reads. It is designed for MAID (massive array of idle disks) systems with up to hundreds of disk drives per server and is optimized for “write once, read sometimes” workloads. We use dedicated data and parity disks and export the data disks as individually accessible buckets. By intertwining disk groups into a two-dimensional RAID and improving single-disk reliability with intradisk redundancy, the system achieves an elastic fault tolerance that can recover from at least all 3-disk failures. Furthermore, we integrate a cache to offload parity updates and a journal to track the RAID’s state. The LoneStar RAID scheme provides a mean time to data loss (MTTDL) that competes with today’s erasure codes, and it is optimized to require only a minimal set of running disk drives.
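The idea of intertwining disk groups into a two-dimensional RAID can be illustrated with a minimal sketch. This is not the LoneStar implementation: it assumes a generic XOR layout in which data disks form a grid and each row and each column has one dedicated parity disk, it assumes the parity disks themselves stay healthy, and it leaves out intradisk redundancy, the cache, and the journal. The names `xor_blocks`, `build_parity`, `recover`, and `survives` are hypothetical.

```python
from functools import reduce
from itertools import combinations


def xor_blocks(blocks):
    """XOR equally sized byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def build_parity(grid):
    """Return one parity block per row and one per column (assumed layout)."""
    row_parity = [xor_blocks(row) for row in grid]
    col_parity = [xor_blocks(col) for col in zip(*grid)]
    return row_parity, col_parity


def recover(grid, row_parity, col_parity):
    """Rebuild missing blocks (None) that are the only gap in their row or column.

    Repeats until no further block can be rebuilt. Returns True if the
    grid is complete afterwards.
    """
    rows, cols = len(grid), len(grid[0])
    progress = True
    while progress:
        progress = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] is not None:
                    continue
                row_others = [grid[r][k] for k in range(cols) if k != c]
                col_others = [grid[k][c] for k in range(rows) if k != r]
                if None not in row_others:
                    grid[r][c] = xor_blocks(row_others + [row_parity[r]])
                    progress = True
                elif None not in col_others:
                    grid[r][c] = xor_blocks(col_others + [col_parity[c]])
                    progress = True
    return all(block is not None for row in grid for block in row)


# A 3x3 grid of toy "disks", each holding one 4-byte block.
original = [[bytes([3 * r + c + 1] * 4) for c in range(3)] for r in range(3)]
row_p, col_p = build_parity(original)


def survives(failed):
    """Fail the data disks at the given (row, col) positions and try to rebuild."""
    damaged = [[None if (r, c) in failed else original[r][c] for c in range(3)]
               for r in range(3)]
    return recover(damaged, row_p, col_p) and damaged == original


cells = [(r, c) for r in range(3) for c in range(3)]
print(all(survives(set(f)) for f in combinations(cells, 3)))
```

In this simplified model, three failed data disks can never block each other in both dimensions at once, so at least one of them is always the only gap in its row or column. Four failed disks that form a rectangle cannot be rebuilt by this sketch.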
LoneStar RAID: Massive Array of Offline Disks for Archival Systems