research-article

Building Efficient Key-Value Stores via a Lightweight Compaction Tree

Published: 24 November 2017

Abstract

The Log-Structured Merge tree (LSM-tree) has become one of the mainstream indexes in key-value systems that support a variety of write-intensive Internet applications in today's data centers. However, the performance of the LSM-tree is seriously hampered by constantly occurring compaction procedures, which incur significant write amplification and degrade write throughput. To alleviate the performance degradation caused by compactions, we introduce the lightweight compaction tree (LWC-tree), a variant of the LSM-tree index optimized for minimizing write amplification and maximizing system throughput. Lightweight compaction drastically decreases write amplification by appending data to a table and merging only the metadata, which is much smaller. Using the proposed LWC-tree, we have implemented three key-value LWC-stores on different storage media: Shingled Magnetic Recording (SMR) drives, Solid State Drives (SSDs), and conventional Hard Disk Drives (HDDs). The LWC-store is particularly optimized for SMR drives, as it eliminates the multiplicative I/O amplification from both LSM-trees and SMR drives. Owing to the lightweight compaction procedure, LWC-store reduces write amplification by a factor of up to 5× compared to the popular LevelDB key-value store. Moreover, the random write throughput of the LWC-tree on SMR drives is improved by up to 467%, even compared with LevelDB on conventional HDDs. Furthermore, the LWC-tree has wide applicability and delivers substantial performance improvements under various conditions, including different storage media (i.e., SMR, HDD, SSD), various value sizes, and different access patterns (i.e., uniform and Zipfian).
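
As a rough illustration of the quantities the abstract refers to, the sketch below models write amplification as device bytes written per user byte written. It contrasts a conventional merge compaction, which rewrites the overlapped tables, with an append-style compaction that rewrites only metadata. The function names, the byte counts, and the simple cost model are assumptions made for illustration; they are not taken from the paper's implementation or measurements.

```python
"""Toy accounting model for write amplification (illustrative assumptions only)."""


def write_amplification(device_bytes: int, user_bytes: int) -> float:
    """Bytes physically written to the device per byte written by the user."""
    if user_bytes <= 0:
        raise ValueError("user_bytes must be positive")
    return device_bytes / user_bytes


def merge_compaction_bytes(incoming: int, overlapped: int) -> int:
    """Assumed cost of a conventional merge: the incoming data and the
    overlapped tables in the next level are all read, sorted, and rewritten."""
    return incoming + overlapped


def append_compaction_bytes(incoming: int, metadata: int) -> int:
    """Assumed cost of an append-style compaction: the incoming data is
    appended once and only the (much smaller) metadata is rewritten."""
    return incoming + metadata


def stacked_amplification(*layers: float) -> float:
    """Amplification factors of stacked layers (for example an index on top
    of a drive with its own amplification) multiply together."""
    total = 1.0
    for factor in layers:
        total *= factor
    return total


if __name__ == "__main__":
    KIB = 1024
    MIB = 1024 * KIB

    # Hypothetical sizes chosen only to make the contrast visible.
    incoming = 4 * MIB      # data pushed down by one compaction
    overlapped = 40 * MIB   # existing data a merge would rewrite
    metadata = 64 * KIB     # metadata an append-style compaction rewrites

    merge_wa = write_amplification(
        merge_compaction_bytes(incoming, overlapped), incoming)
    append_wa = write_amplification(
        append_compaction_bytes(incoming, metadata), incoming)

    print(f"merge compaction WA:  {merge_wa:.2f}")
    print(f"append compaction WA: {append_wa:.2f}")
    # A hypothetical drive-level factor of 2.0 stacked on the merge case.
    print(f"stacked WA:           {stacked_amplification(merge_wa, 2.0):.2f}")
```

The model ignores multi-level effects, reads, and garbage collection. It is only meant to show why rewriting metadata instead of data shrinks the numerator of the ratio, and why amplification from stacked layers compounds.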

Published in

ACM Transactions on Storage, Volume 13, Issue 4
Special Issue on MSST 2017 and Regular Papers
November 2017, 329 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3160863
Editor: Sam H. Noh

Copyright © 2017 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 24 November 2017
• Accepted: 1 September 2017
• Received: 1 August 2017

Qualifiers

• research-article
• Research
• Refereed