Abstract
Host-managed shingled magnetic recording (HM-SMR) drives offer the capacity needed to keep pace with the explosive growth of data. For key-value (KV) stores based on log-structured merge trees (LSM-trees), the HM-SMR drive is an ideal solution owing to its capacity, predictable performance, and economical cost. However, building an LSM-tree-based KV store on HM-SMR drives presents severe challenges in maintaining performance and space utilization efficiency, because the application and the storage device each run their own redundant cleaning process (i.e., compaction and garbage collection, respectively). To eliminate the overhead of on-disk garbage collection (GC) and improve compaction efficiency, this article presents GearDB, a GC-free KV store tailored for HM-SMR drives. GearDB improves write performance and space efficiency through three new techniques: a new on-disk data layout, compaction windows, and a novel gear compaction algorithm. We further improve the read performance of GearDB with a new SSTable layout and a read-ahead mechanism. We implement GearDB with LevelDB and use zonefs to access a real HM-SMR drive. Our extensive experiments confirm that GearDB achieves both high performance and space efficiency, i.e., on average 1.7× and 1.5× better than LevelDB in random write and read, respectively, with up to 86.9% space efficiency.
Building GC-free Key-value Store on HM-SMR Drives with ZoneFS