Abstract
With recent performance improvements in commodity hardware, low-cost server-based storage has become a practical alternative to dedicated storage appliances. Because commodity servers fail frequently, a server-based storage system must keep data redundant across multiple servers; however, the extra storage capacity required for this redundancy significantly increases system cost. Erasure coding (EC) is a promising way to reduce the amount of redundant data, but it requires distributing and encoding data among servers, and these processes incur substantial network traffic and processing overhead. The performance impact is especially significant for random-access-intensive applications. In this article, we propose a new lightweight redundancy control for server-based storage. Our method takes a local-filesystem-based approach that avoids distributing data: it adds redundancy to user data stored locally and switches the redundancy scheme between replication and EC according to the workload, improving capacity efficiency while maintaining performance. Our experiments show up to 230% better online-transaction-processing performance for our method than for CephFS, a widely used distributed file system. We also confirmed that our method prevents unexpected performance degradation while achieving better capacity efficiency.
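The core idea of switching between replication and erasure coding by workload can be illustrated with a minimal sketch. This is not the paper's implementation: the threshold, replication factor, and the use of single XOR parity (a RAID-5-style erasure code tolerating one lost chunk) are all simplifying assumptions chosen for brevity.

```python
# Sketch: write-hot data stays replicated (cheap random updates);
# cold data is converted to erasure coding (better capacity efficiency).
# Single XOR parity stands in for a general (k, m) erasure code.
from typing import List

REPLICAS = 3               # assumed replication factor
HOT_WRITE_THRESHOLD = 10   # assumed writes-per-interval cutoff

def choose_redundancy(writes_per_interval: int) -> str:
    """Pick replication for write-hot data, EC for cold data."""
    return "replication" if writes_per_interval > HOT_WRITE_THRESHOLD else "ec"

def xor_parity(chunks: List[bytes]) -> bytes:
    """Compute one XOR parity chunk over equal-sized data chunks."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

def ec_encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal chunks (zero-padded) plus one parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
    return chunks + [xor_parity(chunks)]

def ec_recover(chunks: List[bytes], lost: int) -> bytes:
    """Rebuild the chunk at index `lost` by XOR-ing all surviving chunks."""
    return xor_parity([c for i, c in enumerate(chunks) if i != lost])
```

For example, `ec_encode(data, 4)` stores data in 5 chunks (1.25x overhead) instead of the 3x overhead of `REPLICAS = 3`, while `ec_recover` restores any single lost chunk; the trade-off is that every small update to an EC stripe forces a parity recomputation, which is why the sketch keeps write-hot data replicated.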
Lightweight Dynamic Redundancy Control with Adaptive Encoding for Server-based Storage