Abstract
We present the sticky trie, a new variant of the standard trie data structure that achieves high-performing atomic insertions and lookups for large key universes by precluding deletions. It has applications in several areas, including address tracking, logging, and garbage collection. By leveraging features of a modern operating system, we show how a runtime can exploit the absence of deletions to realize an efficient sticky-trie implementation.
We report on an evaluation of two representative uses---compelling Bloom-filter alternative and fast substitute for a garbage collector's sequential store buffer (SSB). We demonstrate that a sticky trie, when compared with what is perhaps among the simplest Bloom filters, can be over 43% faster, scale substantially better with increasing threads, and yet be free of false positives. By introducing the concept of an ideal SSB, we also demonstrate that a sticky trie could be competitive in performance with a class of SSBs.
- ADVANCED MICRO DEVICES, INC. AMD64 Architecture Programmer's Manual: System Programming (Vol. 2), Mar. 2012.Google Scholar
- AL-SUWAIYEL, M., AND HOROWITZ, E. Algorithms for Trie Compaction. ACM Transactions on Database Systems 9, 2 (June 1984), 243--263. Google Scholar
Digital Library
- ANDERSSON, A., AND NILSSON, S. Improved Behaviour of Tries by Adaptive Branching. Information Processing Letters 46, 6 (July 1993), 295--300. Google Scholar
Digital Library
- APPEL, A. W. Simple Generational Garbage Collection and Fast Allocation. Software--Practice and Experience 19, 2 (Feb.1989),171--183. Google Scholar
Digital Library
- ARCANGELI, A., EIDUS, I., AND WRIGHT, C. Increasing Memory Density by Using KSM. In Proc. Linux Symposium (July 2009), pp. 19--28.Google Scholar
- BAGWELL, P. Ideal Hash Trees. Research Report LAMP-REPORT- 2001-001, -- École Polytechnique Fédérale de Lausanne, Oct. 2001.Google Scholar
- BLACKBURN, S. M., AND HOSKING, A. L. Barriers: Friend or Foe? In Proc. International Symposium onMemoryManagement (Oct. 2004), pp. 143--151. Google Scholar
Digital Library
- BLACKBURN, S. M., AND MCKINLEY, K. S. In or Out? Putting Write Barriers in Their Place. In Proc. International Symposium on Memory Management (June 2002), pp. 175--184. Google Scholar
Digital Library
- BLOOM, B. H. Space/Time Trade-offs in Hash Coding with Allowable Errors. Communications of the ACM 13, 7 (July 1970), 422--426. Google Scholar
Digital Library
- BOVET, D. P., AND CESATI, M. Understanding the Linux Kernel, third ed. O'Reilly Media, Inc., 2005. Google Scholar
Digital Library
- BRECHT, T., AND SANDHU, H. The Region Trap Library: Handling Traps on Application-Defined Regions of Memory. In Proc. USENIX Annual Technical Conference (June 1999), pp. 85--100. Google Scholar
Digital Library
- BRONEVETSKY, G., MARQUES, D., PINGALI, K., SZWED, P., AND SCHULZ, M. Application-Level Checkpointing for Shared Memory Programs. In Proc. International Conference on Architectural Support for Programming Languages and Operating Systems (Oct. 2004),pp. 235--247. Google Scholar
Digital Library
- CLEMENT, J., FLAJOLET, P., AND VALLÉE, B. Dynamical Sources in Information Theory: A General Analysis of Trie Structures. Algorithmica 29, 1--2 (Feb. 2001), 307--369.Google Scholar
- CORMEN, T. H., LEISERSON, C. E., RIVEST, R. L., AND STEIN, C. Introduction to Algorithms, second ed. The MIT Press, 2007. Google Scholar
Digital Library
- DETLEFS, D., KNIPPEL, R., CLINGER, W. D., AND JACOB, M. Concurrent Remembered Set Refinement in Generational Garbage Collection. In Proc. Java Virtual Machine Research and Technology Symposium (Aug. 2002), pp. 13--26. Google Scholar
Digital Library
- DICE, D. False Sharing Induced by Card Table Marking. At https://blogs.oracle.com/dave/entry/false sharing induced by card, Feb. 2011.Google Scholar
- DOUGLIS, F. The Compression Cache: Using On-line Compression to Extend Physical Memory. In Proc. USENIX Winter Technical Conference (Jan. 1993), pp. 519--529.Google Scholar
- FREDKIN, E. Trie Memory. Communications of the ACM 3, 9 (Sept. 1960), 490--499. Google Scholar
Digital Library
- FREE SOFTWARE FOUNDATION. The GNU C++ Library Manual, 2011.Google Scholar
- GERMANN, U., JOANIS, E., AND LARKIN, S. Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast Too. In Proc.Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (June 2009), pp. 31--39. Google Scholar
Digital Library
- HEINZ, S., ZOBEL, J., AND WILLIAMS, H. E. Burst Tries: A Fast, Efficient Data Structure for String Keys. ACM Transactions on Information Systems 20, 2 (Apr. 2002), 192--223. Google Scholar
Digital Library
- HEWLETT-PACKARD COMPANY. Programming with Judy: C Language Judy Version 4.0, June 2001. Part Number: B6841--90001.Google Scholar
- HOSKING, A. L., MOSS, E., AND STEFANOVIC , D. A Comparative Performance Evaluation of Write Barrier Implementations. In Proc. Conference on Object-Oriented Programming, Systems, Languages and Applications (Oct. 1992), pp. 92--109. Google Scholar
Digital Library
- HUDSON, R. L., AND DIWAN, A. Adaptive Garbage Collection for Modula-3 and Smalltalk. In Addendum to OOPSLA/ECOOP'90 Proceedings (Oct. 1990), E. Jul and N.-C. Juul, Eds.Google Scholar
- KIRSCH, A., AND MITZENMACHER, M. Less Hashing, Same Performance: Building a Better Bloom Filter. Random Structures and Algorithms 33, 2 (Sept. 2008), 187--218. Google Scholar
Digital Library
- KNUTH, D. E. The Art of Computer Programming: Sorting and Searching, second ed., vol. 3. Addison-Wesley, 2011.Google Scholar
- MALY, K. Compressed Tries. Communications of the ACM 19, 7 (July 1976), 409--415. Google Scholar
Digital Library
- MAO, Y., KOHLER, E., AND MORRIS, R. Cache Craftiness for Fast Multicore Key-Value Storage. In Proc. European Conference on Computer Systems (Apr. 2012), pp. 183--196. Google Scholar
Digital Library
- MORIMOTO, K., IRIGUCHI, H., AND AOE, J.-I. A Method of Compressing Trie Structures. Software--Practice and Experience 24, 3 (Mar. 1994), 265--288. Google Scholar
Digital Library
- MORRISON, D. R. PATRICIA--Practical Algorithm to Retrieve Information Coded in Alphanumeric. Journal of the ACM 15, 4 (Oct. 1968), 514--534. Google Scholar
Digital Library
- NEBEL, M. E. The Stack-Size of Combinatorial Tries Revisited. Discrete Mathematics and Theoretical Computer Science 5, 1 (2002), 1--16.Google Scholar
- PARK, G., HWANG, H.-K., NICOD'EME, P., AND SZPANKOWSKI, W.Profiles of Tries. SIAM Journal on Computing 38, 5 (2009), 1821--1880. Google Scholar
Digital Library
- PROKOPEC, A., BRONSON, N. G., BAGWELL, P., AND ODERSKY, M. Concurrent Tries with Efficient Non-Blocking Snapshots. In Proc. Symposium on Principles and Practices of Parallel Programming(Feb. 2012), pp. 151--160. Google Scholar
Digital Library
- PURDIN, T. D. M. Compressing Tries for Storing Dictionaries. In Proc. Symposium on Applied Computing (Apr. 1990), pp. 336--340.Google Scholar
- REGNIER, M., AND JACQUET, P. New Results on the Size of Tries. IEEE Transactions on Information Theory 35, 1 (Jan. 1989), 203--205. Google Scholar
Digital Library
- SAHNI, S., AND KIM, K. S. Efficient Construction of Multibit Tries for IP Lookup. IEEE/ACM Transactions on Networking 11, 4 (Aug. 2003), 650--662. Google Scholar
Digital Library
- SOBALVARRO, P. G. A Lifetime-Based Garbage Collector for LISP Systems on General-Purpose Computers, 1988. BS thesis.Google Scholar
- SPRUGNOLI, R. Perfect Hashing Functions: A Single Probe Retrieving Method for Static Sets. Communications of the ACM 20, 11 (Nov. 1977), 841--850. Google Scholar
Digital Library
- SUSSENGUTH JR., E. H. Use of Tree Structures for Processing Files. Communications of the ACM 6, 5 (May 1963), 272--279. Google Scholar
Digital Library
- TARJAN, R. E., AND YAO, A. C.-C. Storing a Sparse Table. Communications of the ACM 22, 11 (Nov. 1979), 606--611. Google Scholar
Digital Library
- How to Use the Kernel SamepageMerging Feature. At https://ww w.kernel.org/doc/Documentation/vm/ksm.txt, 2009.Google Scholar
- TUDUCE, I. C., AND GROSS, T. Adaptive Main Memory Compression. In Proc. USENIX Annual Technical Conference (Apr. 2005), pp. 237--250. Google Scholar
Digital Library
- VOLOS, H., TACK, A. J., AND SWIFT, M. M. Mnemosyne: Lightweight Persistent Memory. In Proc. International Conference on Architectural Support for Programming Languages and Operating Systems (Mar. 2011), pp. 91--103. Google Scholar
Digital Library
- WHITE, S. J. Pointer Swizzling Techniques for Object-Oriented Database Systems. PhD thesis, University of Wisconsin-Madison, 1994. Google Scholar
Digital Library
- WILSON, P. R. Some Issues and Strategies in Heap Management and Memory Hierarchies. ACM SIGPLAN Notices 26, 3 (Mar. 1991), 45-- 52. Google Scholar
Digital Library
- YANG, L., LEKATSAS, H., AND DICK, R. P. High-Performance Operating System Controlled Memory Compression. In Proc. Design Automation Conference (July 2006), pp. 701--704. Google Scholar
Digital Library
- YANG, X., BLACKBURN, S. M., FRAMPTON, D., AND HOSKING, A. L. Barriers Reconsidered, Friendlier Still! In Proc. International Symposium on Memory Management (June 2012), pp. 37--47. Google Scholar
Digital Library
Index Terms
Sticky tries: fast insertions, fast lookups, no deletions for large key universes
Recommendations
Sticky tries: fast insertions, fast lookups, no deletions for large key universes
ISMM '14: Proceedings of the 2014 international symposium on Memory managementWe present the sticky trie, a new variant of the standard trie data structure that achieves high-performing atomic insertions and lookups for large key universes by precluding deletions. It has applications in several areas, including address tracking, ...
Smoothed heights of tries and patricia tries
Tries and patricia tries are two popular data structures for storing strings. Let Hn denote the height of the trie (the patricia trie, respectively) on a set of n strings. Under the uniform distribution model on the strings, it is well known that Hn/...
Laws of large numbers and tail inequalities for random tries and PATRICIA trees
Special issue: Probabilistic methods in combinatorics and combinatorial optimizationWe consider random tries and random PATRICIA trees constructed from n independent strings of symbols drawn from any distribution on any discrete space. If Hn, is the height of this tree, we show that Hn/E{Hn} tends to one in probability. Additional tail ...







Comments