Abstract
Elastic storage systems can be expanded or contracted to meet current demand, allowing servers to be turned off or used for other tasks. However, the usefulness of an elastic distributed storage system is limited by its agility: how quickly it can increase or decrease its number of servers. Because they must migrate large amounts of data during elastic resizing, state-of-the-art designs usually have to make painful trade-offs among performance, elasticity, and agility.
This article describes the state of the art in elastic storage and a new system, called SpringFS, that can quickly change its number of active servers while still meeting elasticity and performance goals. SpringFS uses a novel technique, termed bounded write offloading, that restricts the set of servers to which writes destined for overloaded servers are redirected. This technique, combined with the read offloading and passive migration policies used in SpringFS, minimizes the work needed before servers can be deactivated or activated. Analysis of real-world traces from Hadoop deployments at Facebook and various Cloudera customers and experiments with the SpringFS prototype confirm SpringFS’s agility, show that it reduces the amount of data migrated for elastic resizing by up to two orders of magnitude, and show that it cuts the percentage of active servers required by 67–82%, outdoing state-of-the-art designs by 6–120%.
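The core idea of bounded write offloading, as described above, can be illustrated with a small sketch. The abstract does not specify the algorithm, so everything here is an assumption made for illustration: the `Server` type, the numeric load metric, the overload threshold, the choice of "the first `offload_set_size` servers" as the bounded set, and the least-loaded selection policy. It is not SpringFS's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Server:
    ident: int
    load: float = 0.0  # hypothetical load metric, e.g. queued write requests


def choose_write_target(servers, target, offload_set_size, overload_threshold):
    """Pick the server that should absorb a write aimed at `target`.

    Assumed policy (illustrative only): if `target` is not overloaded,
    the write goes to it directly. Otherwise the write is redirected to
    the least-loaded server among the first `offload_set_size` servers,
    and never to a server outside that bounded set.
    """
    if target.load < overload_threshold:
        return target
    offload_set = servers[:offload_set_size]  # the bound on redirection
    return min(offload_set, key=lambda s: s.load)


# Six servers; server 0 is overloaded, servers 3-5 are idle but lie
# outside the bounded offload set of size 3.
servers = [Server(i) for i in range(6)]
servers[0].load = 10.0
servers[1].load = 3.0
servers[2].load = 1.0

chosen = choose_write_target(servers, servers[0], offload_set_size=3,
                             overload_threshold=5.0)
print(chosen.ident)
```

In this sketch the idle servers 3 to 5 are never chosen, even though they have the lowest load. Keeping offloaded writes within a bounded set is what would let servers outside that set be deactivated without first migrating recently written data off them.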
Agility and Performance in Elastic Distributed Storage