Abstract
The Log-Structured Merge Tree (LSM-Tree) is widely used in key-value (KV) stores because of its excellent write performance. However, LSM-Tree-based KV stores still suffer from write-ahead log overhead and from write stalls caused by slow L0 flush and L0-L1 compaction. New byte-addressable persistent memory (PM) devices bring an opportunity to improve the write performance of the LSM-Tree. Previous studies on PM-based LSM-Trees have not fully exploited PM’s “dual role” as both main memory and external storage. In this article, we analyze two PM-based memtable strategies and the root causes of write stalls. Guided by this analysis, we propose FlatLSM, a specially designed flat LSM-Tree for non-volatile memory based KV stores. First, we propose PMTable, which separates index and data. Its PM Log uses a Buffer Log to store KVs smaller than 256B. Second, to solve the write stall problem, FlatLSM merges the volatile memtables and the persistent L0 into large PMTables, which reduces the depth of the LSM-Tree and concentrates I/O bandwidth on L0-L1 compaction. To mitigate write stalls caused by flushing large PMTables to SSD, we propose a parallel flush/compaction algorithm based on KV separation. We implemented FlatLSM based on RocksDB and evaluated its performance on Intel’s latest PM device, the Intel Optane DC PMM. Compared with the state-of-the-art PM-based LSM-Tree KV stores, FlatLSM improves throughput by 5.2× on a random write workload and by 2.55× on YCSB-A.
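To make the PMTable idea concrete, the following is a minimal sketch, not the authors' implementation. It only illustrates the two properties named in the abstract: the index is kept apart from the data, and KVs smaller than 256B are routed to a Buffer Log. The class name `PMTableSketch`, the in-memory `bytearray` logs standing in for persistent memory, the dictionary index, and the choice to measure KV size as key length plus value length are all assumptions made for illustration. How the Buffer Log relates to the rest of the PM Log is not specified in the abstract, so the two-log layout is likewise hypothetical.

```python
from dataclasses import dataclass, field

SMALL_KV_LIMIT = 256  # bytes; the threshold stated in the abstract


@dataclass
class PMTableSketch:
    """Toy model (hypothetical): an index kept separate from append-only data logs."""

    # key -> (log name, value offset, value length); assumed index layout
    index: dict = field(default_factory=dict)
    # Plain bytearrays stand in for persistent memory regions (assumption).
    pm_log: bytearray = field(default_factory=bytearray)
    buffer_log: bytearray = field(default_factory=bytearray)

    def _log(self, name: str) -> bytearray:
        return self.buffer_log if name == "buffer" else self.pm_log

    def put(self, key: bytes, value: bytes) -> str:
        """Append the record to one of the logs and index it; return the log used."""
        # Assumed size rule: a KV counts as small if key + value is under 256 bytes.
        name = "buffer" if len(key) + len(value) < SMALL_KV_LIMIT else "pm"
        log = self._log(name)
        value_offset = len(log) + len(key)
        log.extend(key + value)  # data goes to the log only
        self.index[key] = (name, value_offset, len(value))  # index holds a pointer
        return name

    def get(self, key: bytes) -> bytes:
        """Look up the pointer in the index, then read the value from its log."""
        name, offset, length = self.index[key]
        return bytes(self._log(name)[offset:offset + length])


if __name__ == "__main__":
    table = PMTableSketch()
    print(table.put(b"user:1", b"x" * 32))    # small record
    print(table.put(b"user:2", b"y" * 1024))  # large record
    print(len(table.get(b"user:2")))
```

Because the index stores only a log name, an offset, and a length, an update appends a new record and repoints the index entry without rewriting older data, which is the usual motivation for separating index and data in log-structured designs.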
FlatLSM: Write-Optimized LSM-Tree for PM-Based KV Stores