DOI: 10.1145/2784731.2784739
research-article

RRB vector: a practical general purpose immutable sequence

Published: 29 August 2015

Abstract

State-of-the-art immutable collections have wildly differing performance characteristics across their operations, often forcing programmers to choose different collection implementations for each task. Thus, changes to the program can invalidate the choice of collections, making code evolution costly. It would be desirable to have a collection that performs well for a broad range of operations. To this end, we present the RRB-Vector, an immutable sequence collection that offers good performance across a large number of sequential and parallel operations. The underlying innovations are: (1) the Relaxed-Radix-Balanced (RRB) tree structure, which allows efficient structural reorganization, and (2) an optimization that exploits spatio-temporal locality on the RRB data structure in order to offset the cost of traversing the tree. In our benchmarks, the RRB-Vector speedup for parallel operations is lower bounded by 7x when executing on 4 CPUs of 8 cores each. The performance for discrete operations, such as appending on either end, or updating and removing elements, is consistently good and compares favorably to the most important immutable sequence collections in the literature and in use today. The memory footprint of RRB-Vector is on par with arrays and an order of magnitude less than competing collections.
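To make the first innovation concrete, here is a minimal sketch of how indexing might work in a relaxed radix-balanced tree. It is an illustration under stated assumptions, not the paper's implementation: the names `Node`, `lookup`, and `concat_leaves` are hypothetical, the branching factor is 4 rather than the 32 typically used in practice, and the slot search is a plain binary search over a cumulative size table. A strictly balanced node finds its child with shifts and masks alone; a relaxed node carries a size table, which lets two partially filled subtrees be joined without copying their elements.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional

BITS = 2            # illustrative only; a 32-way tree would use BITS = 5
WIDTH = 1 << BITS   # maximum number of children per node
MASK = WIDTH - 1


@dataclass
class Node:
    # Sub-nodes for internal nodes, or the elements themselves at leaf level.
    children: list
    # Cumulative element counts per child; None marks a strictly balanced node.
    sizes: Optional[List[int]] = None


def lookup(node: Node, index: int, shift: int):
    """Return the element at `index`; `shift` is BITS times the height of `node`."""
    while shift > 0:
        if node.sizes is None:
            # Balanced node: the slot is encoded directly in the index bits.
            slot = (index >> shift) & MASK
        else:
            # Relaxed node: find the first child whose cumulative size exceeds index,
            # then make the index relative to that child.
            slot = bisect_right(node.sizes, index)
            if slot > 0:
                index -= node.sizes[slot - 1]
        node = node.children[slot]
        shift -= BITS
    return node.children[index & MASK]


def concat_leaves(left: Node, right: Node) -> Node:
    """Join two leaves under a relaxed parent without copying any elements."""
    n_left = len(left.children)
    return Node([left, right], [n_left, n_left + len(right.children)])


if __name__ == "__main__":
    left = Node(["a", "b", "c"])        # a leaf that is not full
    right = Node(["d", "e", "f", "g"])
    root = concat_leaves(left, right)   # height 1, so shift = BITS
    print([lookup(root, i, BITS) for i in range(7)])
```

The size table is what allows the left leaf to stay partially filled: without it, radix indexing would require every leaf except the last to hold exactly `WIDTH` elements, and concatenation would have to shift and copy elements to restore that invariant.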

Published In

ICFP 2015: Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming
August 2015
436 pages
ISBN: 9781450336697
DOI: 10.1145/2784731
  • ACM SIGPLAN Notices, Volume 50, Issue 9
    ICFP '15
    September 2015
    436 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/2858949
    • Editor: Andy Gill

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Arrays
  2. Data Structures
  3. Immutable
  4. Radix-Balanced
  5. Relaxed-Radix-Balanced
  6. Sequences
  7. Trees
  8. Vectors

Qualifiers

  • Research-article

Conference

ICFP'15

Acceptance Rates

Overall Acceptance Rate 333 of 1,064 submissions, 31%

Article Metrics

  • Downloads (last 12 months): 66
  • Downloads (last 6 weeks): 7
Reflects downloads up to 13 Dec 2024.

Cited By

  • (2024) Reducing Write Barrier Overheads for Orthogonal Persistence. Proceedings of the 17th ACM SIGPLAN International Conference on Software Language Engineering, pp. 210-223. DOI: 10.1145/3687997.3695646. Online publication date: 17-Oct-2024
  • (2023) Enhancing Ropes for Collaborative Text Editing. 2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE), pp. 2593-2599. DOI: 10.1109/CSCE60160.2023.00415. Online publication date: 24-Jul-2023
  • (2022) Specification and verification of a transient stack. Proceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 82-99. DOI: 10.1145/3497775.3503677. Online publication date: 17-Jan-2022
  • (2020) MOD. Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 775-788. DOI: 10.1145/3373376.3378472. Online publication date: 9-Mar-2020
  • (2019) Fluid data structures. Proceedings of the 17th ACM SIGPLAN International Symposium on Database Programming Languages, pp. 3-17. DOI: 10.1145/3315507.3330197. Online publication date: 23-Jun-2019
  • (2019) PureMEM. Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pp. 1544-1551. DOI: 10.1145/3297280.3299739. Online publication date: 8-Apr-2019
  • (2019) The Random Access Zipper. Trends in Functional Programming, pp. 155-171. DOI: 10.1007/978-3-030-14805-8_9. Online publication date: 21-Feb-2019
  • (2018) To-many or to-one? all-in-one! efficient purely functional multi-maps with type-heterogeneous hash-tries. ACM SIGPLAN Notices 53:4, pp. 283-295. DOI: 10.1145/3296979.3192420. Online publication date: 11-Jun-2018
  • (2018) To-many or to-one? all-in-one! efficient purely functional multi-maps with type-heterogeneous hash-tries. Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 283-295. DOI: 10.1145/3192366.3192420. Online publication date: 11-Jun-2018
  • (2017) Quad Ropes: immutable, declarative arrays with parallelizable operations. Proceedings of the 4th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, pp. 1-8. DOI: 10.1145/3091966.3091971. Online publication date: 18-Jun-2017
