Abstract
State-of-the-art immutable collections have wildly differing performance characteristics across their operations, often forcing programmers to choose different collection implementations for each task. Thus, changes to the program can invalidate the choice of collections, making code evolution costly. It would be desirable to have a collection that performs well for a broad range of operations. To this end, we present the RRB-Vector, an immutable sequence collection that offers good performance across a large number of sequential and parallel operations. The underlying innovations are: (1) the Relaxed-Radix-Balanced (RRB) tree structure, which allows efficient structural reorganization, and (2) an optimization that exploits spatio-temporal locality on the RRB data structure in order to offset the cost of traversing the tree. In our benchmarks, the RRB-Vector achieves a speedup of at least 7x for parallel operations when executing on 4 CPUs of 8 cores each. The performance of discrete operations, such as appending on either end, or updating and removing elements, is consistently good and compares favorably to the most important immutable sequence collections in the literature and in use today. The memory footprint of the RRB-Vector is on par with arrays and an order of magnitude less than competing collections.
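To make innovation (1) more concrete, the following is a minimal sketch, not the authors' implementation, of how indexing differs between a strictly radix-balanced tree and a relaxed one. It assumes a branching factor of 4 for readability (practical vectors typically use 32), represents leaves as plain Python lists, and uses a hypothetical `RelaxedNode` class whose `sizes` field stores cumulative element counts. The binary search over `sizes` is a simplification chosen for brevity.

```python
from bisect import bisect_right

BITS = 2              # assumption: 2 index bits per level, so each node has up to 4 slots
WIDTH = 1 << BITS
MASK = WIDTH - 1


def radix_get(node, index, height):
    """Lookup in a strictly balanced radix tree of nested lists.

    Every inner node is full, so the child slot at each level is read
    directly from the bits of the index.
    """
    for level in range(height, 0, -1):
        node = node[(index >> (level * BITS)) & MASK]
    return node[index & MASK]


class RelaxedNode:
    """Hypothetical inner node whose children may be only partially full.

    sizes[i] is the total number of elements stored in children[0..i].
    """

    def __init__(self, children, sizes):
        self.children = children
        self.sizes = sizes


def relaxed_get(node, index):
    """Lookup in a relaxed tree: find the child via the cumulative sizes."""
    while isinstance(node, RelaxedNode):
        slot = bisect_right(node.sizes, index)   # first child whose range contains index
        if slot > 0:
            index -= node.sizes[slot - 1]        # make the index relative to that child
        node = node.children[slot]
    return node[index]


# A balanced tree of height 1 holding 0..15.
balanced = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]

# Three leaves of unequal length joined under one relaxed node without copying them.
relaxed = RelaxedNode([[0, 1, 2], [3, 4, 5, 6], [7, 8]], [3, 7, 9])
```

In this sketch, `relaxed` joins leaves of lengths 3, 4 and 2 without shifting any elements between them, which is the kind of structural reorganization that a strict radix layout would not permit.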
RRB vector: a practical general purpose immutable sequence
ICFP 2015: Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming