research-article
Open Access
Artifacts Available
Artifacts Evaluated & Functional

Persistence for the masses: RRB-vectors in a systems language

Published: 29 August 2017

Abstract

Relaxed Radix Balanced Trees (RRB-Trees) are one of the latest members of a family of persistent tree-based data structures that combine wide branching factors with simple and relatively flat structures. Like the battle-tested immutable sequences of Clojure and Scala, they offer effectively constant lookup and updates and good cache utilization, but they also support logarithmic concatenation and slicing. Our goal is to bring the benefits of persistent data structures to the discipline of systems programming via generic yet efficient immutable vectors supporting transient batch updates. We describe a C++ implementation that can be integrated in the runtime of higher-level languages with a C core (Lisps like Guile or Racket, but also Python or Ruby), thus widening access to these persistent data structures.
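To make "effectively constant lookup and updates" concrete, the following is a minimal sketch of a regular (not relaxed) radix-balanced tree with path-copying updates. It is not the paper's implementation: the names `Node`, `build`, `lookup`, and `update` are hypothetical, the element type is fixed to `int`, and the branching factor is reduced to 4 (`BITS = 2`) for readability, where practical implementations typically use wider nodes such as 32.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

// Assumed parameters for illustration only.
constexpr std::size_t BITS  = 2;             // index bits consumed per level
constexpr std::size_t WIDTH = 1u << BITS;    // branching factor (4 here)
constexpr std::size_t MASK  = WIDTH - 1;

// One node type for brevity: inner nodes use `children`, leaves use `values`.
struct Node {
    std::array<std::shared_ptr<const Node>, WIDTH> children{};
    std::array<int, WIDTH> values{};
};
using NodePtr = std::shared_ptr<const Node>;

// Build a full tree whose root sits at bit offset `shift`,
// filling leaves with consecutive integers starting at `next`.
NodePtr build(std::size_t shift, int& next) {
    auto n = std::make_shared<Node>();
    if (shift == 0) {
        for (auto& v : n->values) v = next++;
    } else {
        for (auto& c : n->children) c = build(shift - BITS, next);
    }
    return n;
}

// Radix lookup: each level selects a child from BITS bits of the index,
// so the number of steps equals the (small) height of the tree.
int lookup(const NodePtr& root, std::size_t shift, std::size_t i) {
    const Node* n = root.get();
    for (std::size_t s = shift; s > 0; s -= BITS)
        n = n->children[(i >> s) & MASK].get();
    return n->values[i & MASK];
}

// Persistent update: copy only the nodes on the path to index `i`;
// all other subtrees are shared between the old and the new version.
NodePtr update(const NodePtr& n, std::size_t shift, std::size_t i, int v) {
    auto copy = std::make_shared<Node>(*n);
    if (shift == 0) {
        copy->values[i & MASK] = v;
    } else {
        const std::size_t slot = (i >> shift) & MASK;
        copy->children[slot] = update(n->children[slot], shift - BITS, i, v);
    }
    return copy;
}
```

With `shift = 4` the tree holds 64 elements in three levels; calling `update` returns a new root while the old root still sees the old value, and every subtree off the modified path is the same object in both versions. The relaxed variant described in the abstract additionally allows partially filled inner nodes so that concatenation and slicing stay logarithmic, which this sketch does not attempt.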

In this work we propose (1) an Embedding RRB-Tree (ERRB-Tree) data structure that efficiently stores arbitrary unboxed types, (2) a technique for implementing tree operations orthogonally to optimizations for a more compact representation of the tree, (3) a policy-based design to support multiple memory management and reclamation mechanisms (including automatic garbage collection and reference counting), (4) a model of transience based on move semantics and reference counting, and (5) a definition of transience for confluent meld operations. By combining these techniques, performance comparable to that of mutable arrays can be achieved in many situations while still using the data structure in a functional way.
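Contribution (4) can be illustrated with a small sketch of transience driven by move semantics and reference counts. This is an assumption-laden simplification, not the paper's design: `pvec` is a hypothetical class, a flat `std::vector<int>` stands in for the tree, `std::shared_ptr` stands in for the reference-counting policy, and `use_count()` is only a reliable uniqueness test in single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical persistent vector; the buffer is shared between copies.
class pvec {
    using buffer = std::vector<int>;
    std::shared_ptr<buffer> data_ = std::make_shared<buffer>();

    explicit pvec(std::shared_ptr<buffer> d) : data_(std::move(d)) {}

public:
    pvec() = default;

    // Called on an lvalue: the caller keeps using *this,
    // so the old version must stay intact and we copy.
    pvec push_back(int v) const& {
        auto fresh = std::make_shared<buffer>(*data_);
        fresh->push_back(v);
        return pvec(std::move(fresh));
    }

    // Called on an rvalue: the caller has given up *this.
    // If the reference count shows nobody else shares the buffer,
    // the update can safely happen in place.
    pvec push_back(int v) && {
        if (data_.use_count() != 1)
            return static_cast<const pvec&>(*this).push_back(v);
        data_->push_back(v);
        return std::move(*this);
    }

    std::size_t size() const { return data_->size(); }
    int operator[](std::size_t i) const { return (*data_)[i]; }

    // Buffer identity, exposed only so the sketch can be inspected.
    const void* id() const { return data_.get(); }
};
```

Writing `v = std::move(v).push_back(x)` in a loop then reuses one buffer as long as no other copy of `v` exists, while `w = v.push_back(x)` keeps `v` unchanged. The two ref-qualified overloads are what let the same functional-looking call site choose between copying and in-place mutation.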
