Abstract
Scaling up a RAID-0 volume with added disks can increase its storage capacity and I/O bandwidth simultaneously. To preserve a round-robin data distribution, existing scaling approaches require all the data to be migrated. Such large-scale data migration results in a long redistribution time and degrades application performance. In this article, we present a new approach to RAID-0 scaling called FastScale. First, FastScale minimizes data migration while maintaining a uniform data distribution. It moves only enough data blocks from the old disks to fill an appropriate fraction of the new disks. Second, FastScale optimizes data migration with access aggregation and lazy checkpoint. Access aggregation gives data migration a higher throughput by reducing the number of disk seeks. Lazy checkpoint minimizes the number of metadata writes without compromising data consistency. Using several real-system disk traces, we evaluate the performance of FastScale by comparing it with SLAS, one of the most efficient existing scaling approaches. The experiments show that FastScale can reduce redistribution time by up to 86.06% while keeping application I/O latencies smaller. The experiments also show that the performance of a RAID-0 volume scaled using FastScale is almost identical to, or even better than, that of a round-robin RAID-0 volume.
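The gap between full round-robin redistribution and minimal migration can be illustrated with a small calculation. The sketch below is not FastScale's addressing function, which the abstract does not specify. It assumes that block b of a round-robin volume on m disks sits at disk b mod m, offset b div m, and it assumes that a uniform layout on m + n disks needs at least the fraction n / (m + n) of the data to land on the new disks. The function names are hypothetical.

```python
from fractions import Fraction


def round_robin_moved_fraction(old_disks: int, new_disks: int, blocks: int) -> Fraction:
    """Fraction of blocks whose (disk, offset) location changes when a
    round-robin volume is re-striped from old_disks to old_disks + new_disks.

    Assumption: block b lives at disk b % d, offset b // d for d disks.
    """
    total = old_disks + new_disks
    moved = 0
    for b in range(blocks):
        before = (b % old_disks, b // old_disks)
        after = (b % total, b // total)
        if before != after:
            moved += 1
    return Fraction(moved, blocks)


def minimal_moved_fraction(old_disks: int, new_disks: int) -> Fraction:
    """Lower bound on migrated data for a uniform layout: the new disks
    must end up holding new_disks / (old_disks + new_disks) of all blocks."""
    return Fraction(new_disks, old_disks + new_disks)


if __name__ == "__main__":
    m, n, blocks = 4, 2, 1200
    print("round-robin re-striping moves:", round_robin_moved_fraction(m, n, blocks))
    print("minimal migration moves:      ", minimal_moved_fraction(m, n))
```

Under these assumptions, growing a 4-disk volume by 2 disks relocates all but the first stripe of blocks when round-robin order is preserved, whereas a uniform layout needs only one third of the blocks to move.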
Design and Evaluation of a New Approach to RAID-0 Scaling