RRB vector: a practical general purpose immutable sequence

Published: 29 August 2015

Abstract

State-of-the-art immutable collections have wildly differing performance characteristics across their operations, often forcing programmers to choose different collection implementations for each task. Thus, changes to the program can invalidate the choice of collections, making code evolution costly. It would be desirable to have a collection that performs well for a broad range of operations. To this end, we present the RRB-Vector, an immutable sequence collection that offers good performance across a large number of sequential and parallel operations. The underlying innovations are: (1) the Relaxed-Radix-Balanced (RRB) tree structure, which allows efficient structural reorganization, and (2) an optimization that exploits spatio-temporal locality on the RRB data structure in order to offset the cost of traversing the tree. In our benchmarks, the RRB-Vector speedup for parallel operations is lower bounded by 7x when executing on 4 CPUs of 8 cores each. The performance for discrete operations, such as appending on either end, or updating and removing elements, is consistently good and compares favorably to the most important immutable sequence collections in the literature and in use today. The memory footprint of RRB-Vector is on par with arrays and an order of magnitude less than competing collections.
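To make the "relaxed" part of the RRB tree concrete, here is a minimal sketch of index lookup, first in a strictly balanced radix tree and then in a relaxed node. It is an illustration under stated assumptions, not the RRB-Vector implementation: the names `radix_get`, `relaxed_get` and `RelaxedNode` are hypothetical, the branching factor is 4 to keep the example small (32-way branching is the usual choice in practice), and the relaxed lookup uses a binary search over cumulative sizes for simplicity.

```python
from bisect import bisect_right

BITS = 2               # illustrative only; 5 bits (32-way branching) is typical
WIDTH = 1 << BITS
MASK = WIDTH - 1


def radix_get(root, depth, index):
    """Look up `index` in a strictly balanced radix tree.

    `depth` is the number of internal levels above the leaves. Each level
    consumes BITS bits of the index, so the child slot is computed directly.
    """
    node = root
    for level in range(depth, 0, -1):
        node = node[(index >> (level * BITS)) & MASK]
    return node[index & MASK]


class RelaxedNode:
    """Hypothetical relaxed internal node: children may hold unequal counts.

    `sizes[i]` is the cumulative number of elements in children[0..i].
    """

    def __init__(self, children, sizes):
        self.children = children
        self.sizes = sizes


def relaxed_get(node, depth, index):
    """Look up `index` when subtrees are not all full.

    The slot can no longer be computed from index bits alone, so the
    cumulative size table is searched at each relaxed level.
    """
    while depth > 0:
        slot = bisect_right(node.sizes, index)
        if slot > 0:
            index -= node.sizes[slot - 1]   # make index relative to the child
        node = node.children[slot]
        depth -= 1
    return node[index]


# A full radix tree of 16 elements, one internal level.
full = [list(range(i, i + WIDTH)) for i in range(0, 16, WIDTH)]

# A relaxed node whose leaves hold 3, 4 and 2 elements, as might result
# from joining sequences without copying every leaf.
relaxed = RelaxedNode([[0, 1, 2], [3, 4, 5, 6], [7, 8]], [3, 7, 9])
```

The size table is what allows leaves of unequal length to sit side by side, which is why structural reorganization such as concatenation does not have to rebuild the whole tree; the price is a small search at each relaxed level instead of pure bit arithmetic.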
