Abstract
In this paper, we examine the design tradeoffs of the existing in-memory data structures of a state-of-the-art key-value store. We observe that no single data structure provides both fast point accesses and consistent ranged retrievals, and that naive amalgamations of existing structures fail to get the best of both worlds. Furthermore, our experiments reveal a performance anomaly when increasing the memory size: as more key-value pairs are maintained in memory, the shortcomings of the data structures are exacerbated. To address these problems, we present TeksDB, a fast and consistent key-value store with a novel in-memory data structure that efficiently handles both point and ranged accesses at a modest increase in memory footprint. Our evaluation demonstrates that TeksDB outperforms RocksDB by 3.6×, 9×, and 4.5× for get, scan, and range_query, respectively. The effectiveness of TeksDB extends to real-world workloads, achieving up to 3.3× speedup for YCSB.
TeksDB: Weaving Data Structures for a High-Performance Key-Value Store
SIGMETRICS '19: Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems