Abstract
Persistent key-value (KV) stores mostly build on the Log-Structured Merge (LSM) tree for high write performance, yet the LSM-tree suffers from inherently high I/O amplification. KV separation mitigates I/O amplification by storing only keys in the LSM-tree and values in separate storage. However, the current KV separation design remains inefficient under update-intensive workloads due to its high garbage collection (GC) overhead in value storage. We propose HashKV, which aims for high update performance atop KV separation under update-intensive workloads. HashKV uses hash-based data grouping, which deterministically maps values to storage space to make both updates and GC efficient. We further relax the restriction of such deterministic mappings via simple but useful design extensions. We extensively evaluate various design aspects of HashKV. We show that HashKV achieves 4.6× the update throughput of the current KV separation design with 53.4% less write traffic. In addition, we demonstrate that we can integrate the design of HashKV with state-of-the-art KV stores and improve their respective performance.
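The core idea in the abstract, KV separation combined with hash-based data grouping, can be illustrated with a minimal sketch. The sketch below is not HashKV's actual implementation: the `HashKVSketch` class, the in-memory dict standing in for the LSM-tree, and the partition count are all hypothetical simplifications. It shows why a deterministic key-to-partition mapping helps GC: because every update of a key lands in the same partition, each partition can be garbage-collected in isolation, without scanning or rewriting any other partition.

```python
import hashlib

NUM_PARTITIONS = 8  # hypothetical partition count, chosen for illustration


class HashKVSketch:
    """Toy model of KV separation with hash-based value grouping.

    The `lsm` dict stands in for the LSM-tree: it holds only keys and
    value locations. Values live in per-partition append-only logs.
    """

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.lsm = {}  # key -> (partition_id, offset)
        self.partitions = [[] for _ in range(num_partitions)]

    def _partition_of(self, key):
        # Deterministic mapping: the same key always hashes to the same
        # partition, so all versions of its value are co-located.
        h = hashlib.sha1(key.encode()).digest()
        return int.from_bytes(h[:4], "big") % len(self.partitions)

    def put(self, key, value):
        pid = self._partition_of(key)
        log = self.partitions[pid]
        log.append((key, value))              # append-only value write
        self.lsm[key] = (pid, len(log) - 1)   # key + location in the "LSM-tree"

    def get(self, key):
        pid, off = self.lsm[key]
        return self.partitions[pid][off][1]

    def gc_partition(self, pid):
        """Reclaim one partition in isolation, keeping only live values.

        An entry is live iff the LSM-tree still points at it; stale
        versions of updated keys are dropped, and surviving entries
        are compacted with their locations updated.
        """
        compacted = []
        for off, (key, value) in enumerate(self.partitions[pid]):
            if self.lsm.get(key) == (pid, off):  # latest version only
                self.lsm[key] = (pid, len(compacted))
                compacted.append((key, value))
        self.partitions[pid] = compacted
```

For example, after two updates of the same key, GC of that key's partition drops the stale first version while every other partition is left untouched; this locality is what keeps GC cheap under update-intensive workloads.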
Enabling Efficient Updates in KV Storage via Hashing: Design and Performance Evaluation