Abstract
We present the design, implementation, and evaluation of D-GRAID, a gracefully degrading and quickly recovering RAID storage array. D-GRAID ensures that most files within the file system remain available even when an unexpectedly high number of faults occur. D-GRAID achieves high availability through aggressive replication of semantically critical data, and fault-isolated placement of logically related data. D-GRAID also recovers from failures quickly, restoring only live file system data to a hot spare. Both graceful degradation and live-block recovery are implemented in a prototype SCSI-based storage system underneath unmodified file systems, demonstrating that powerful “file-system like” functionality can be implemented within a “semantically smart” disk system behind a narrow block-based interface.
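The effect of fault-isolated placement on availability can be illustrated with a toy model. The sketch below is not the D-GRAID implementation; the function names, the round-robin striping rule, and the one-disk-per-file rule are all assumptions made for illustration. It ignores redundancy (parity or mirroring) and assumes that the metadata needed to reach a file survives, which is the role the abstract assigns to replication of semantically critical data.

```python
def striped_placement(num_files, blocks_per_file, num_disks):
    """Hypothetical striping: spread each file's blocks round-robin over the disks."""
    return {
        f: {(f + b) % num_disks for b in range(blocks_per_file)}
        for f in range(num_files)
    }


def isolated_placement(num_files, num_disks):
    """Hypothetical fault-isolated placement: all blocks of a file on one disk."""
    return {f: {f % num_disks} for f in range(num_files)}


def available_fraction(placement, failed_disks):
    """A file counts as available only if none of its disks has failed."""
    available = sum(
        1 for disks in placement.values() if not (disks & failed_disks)
    )
    return available / len(placement)


if __name__ == "__main__":
    failed = {0, 1}  # two of four disks lost, with no redundancy modeled
    striped = striped_placement(num_files=8, blocks_per_file=4, num_disks=4)
    isolated = isolated_placement(num_files=8, num_disks=4)
    print("striped: ", available_fraction(striped, failed))
    print("isolated:", available_fraction(isolated, failed))
```

In this model every striped file touches a failed disk, while half of the fault-isolated files remain readable. That is the sense in which degradation is graceful: the share of files lost tracks the share of disks lost.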
Improving storage system availability with D-GRAID
Also published as: Improving Storage System Availability with D-GRAID. In FAST '04: Proceedings of the 3rd USENIX Conference on File and Storage Technologies. Awarded Best Student Paper.