Abstract
The Log-Structured Merge tree (LSM-tree) has become one of the mainstream indexes in key-value systems, supporting a variety of write-intensive Internet applications in today’s data centers. However, the performance of the LSM-tree is seriously hampered by constantly occurring compaction procedures, which incur significant write amplification and degrade write throughput. To alleviate the performance degradation caused by compactions, we introduce the lightweight compaction tree (LWC-tree), a variant of the LSM-tree index optimized for minimizing write amplification and maximizing system throughput. Lightweight compaction drastically decreases write amplification by appending data to a table and merging only the metadata, which is much smaller. Using the proposed LWC-tree, we have implemented three key-value LWC-stores on different storage media: Shingled Magnetic Recording (SMR) drives, Solid State Drives (SSDs), and conventional Hard Disk Drives (HDDs). The LWC-store is particularly optimized for SMR drives, as it eliminates the multiplicative I/O amplification from both LSM-trees and SMR drives. Owing to the lightweight compaction procedure, LWC-store reduces write amplification by a factor of up to 5× compared to the popular LevelDB key-value store. Moreover, the random write throughput of the LWC-tree on SMR drives is improved by up to 467%, even compared with LevelDB on conventional HDDs. Furthermore, the LWC-tree has wide applicability and delivers substantial performance improvements under various conditions, including different storage media (i.e., SMR, HDD, SSD), various value sizes, and different access patterns (i.e., uniform and Zipfian).
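To make the intuition behind lightweight compaction concrete, the following is a toy cost model, not the authors' implementation. It compares the bytes written when one upper-level table is compacted into the lower-level tables it overlaps: a conventional merge rewrites every overlapped table, whereas the append-style compaction described above writes the incoming data once and rewrites only per-table metadata. The function names, table sizes, and metadata size are all hypothetical values chosen for illustration; they are not taken from the paper.

```python
def merge_compaction_bytes(upper_bytes, lower_table_bytes):
    """Conventional compaction: the upper table and every overlapped
    lower table are read, merge-sorted, and rewritten in full."""
    return upper_bytes + sum(lower_table_bytes)


def lightweight_compaction_bytes(upper_bytes, lower_table_bytes,
                                 metadata_bytes_per_table):
    """Append-style compaction (toy model): the upper table's data is
    appended to the lower tables once, and only the metadata of each
    overlapped lower table is rewritten."""
    return upper_bytes + len(lower_table_bytes) * metadata_bytes_per_table


def write_amplification(bytes_written, upper_bytes):
    """Bytes written to storage per byte of data pushed down a level."""
    return bytes_written / upper_bytes


# Hypothetical sizes in KiB: one 4 MiB upper table overlapping ten
# 4 MiB lower tables, with 64 KiB of metadata per table (assumed).
upper = 4096
lower = [4096] * 10

conventional = merge_compaction_bytes(upper, lower)
lightweight = lightweight_compaction_bytes(upper, lower, 64)

print(write_amplification(conventional, upper))  # 11.0
print(write_amplification(lightweight, upper))   # 1.15625
```

The model ignores reads, later reorganization of appended segments, and device-level amplification on SMR drives, so it only illustrates why rewriting metadata instead of data lowers the per-compaction write volume.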
Building Efficient Key-Value Stores via a Lightweight Compaction Tree