DOI: 10.1145/1248377.1248393

Cache-oblivious streaming B-trees

Published: 09 June 2007

Abstract

A streaming B-tree is a dictionary that efficiently implements insertions and range queries. We present two cache-oblivious streaming B-trees, the shuttle tree, and the cache-oblivious lookahead array (COLA).
For block-transfer size $B$ and on $N$ elements, the shuttle tree implements searches in optimal $O(\log_{B+1} N)$ transfers, range queries of $L$ successive elements in optimal $O(\log_{B+1} N + L/B)$ transfers, and insertions in $O\left((\log_{B+1} N)/B^{\Theta(1/(\log\log B)^2)} + (\log^2 N)/B\right)$ transfers, which is an asymptotic speedup over traditional B-trees if $B \ge (\log N)^{1 + c \log\log\log^2 N}$ for any constant $c > 1$.
A COLA implements searches in $O(\log N)$ transfers, range queries in $O(\log N + L/B)$ transfers, and insertions in amortized $O((\log N)/B)$ transfers, matching the bounds for a (cache-aware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worst-case insertion cost to $O(\log N)$ if memory size $M = \Omega(\log N)$. We also present a cache-aware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg's (cache-aware) $B^{\varepsilon}$-tree.
We compare our COLA implementation to a traditional B-tree. Our COLA implementation runs 790 times faster for random insertions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.
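
The abstract does not describe how a COLA is laid out, so the following is only a minimal sketch of the general idea. It assumes the usual doubling-array layout, in which level k is either empty or holds exactly 2^k sorted keys and an insertion merges full levels upward like a binary counter. The class name `SimpleCOLA` and its methods are hypothetical. The sketch omits lookahead pointers and deamortization, so each search performs one binary search per level and does not achieve the O(log N) search bound stated above.

```python
from bisect import bisect_left, bisect_right
from heapq import merge


class SimpleCOLA:
    """Simplified sketch: level k is empty or holds 2**k sorted keys.

    Assumption: no lookahead pointers and no deamortization.
    """

    def __init__(self):
        self.levels = []  # levels[k] is [] or a sorted list of length 2**k

    def insert(self, key):
        carry = [key]
        k = 0
        # Merge the carried run into successive full levels,
        # like carrying in binary addition.
        while True:
            if k == len(self.levels):
                self.levels.append([])
            if not self.levels[k]:
                self.levels[k] = carry
                return
            carry = list(merge(self.levels[k], carry))
            self.levels[k] = []
            k += 1

    def contains(self, key):
        # One binary search per level (no lookahead pointers).
        for level in self.levels:
            i = bisect_left(level, key)
            if i < len(level) and level[i] == key:
                return True
        return False

    def range_query(self, lo, hi):
        # Collect the matching slice of every level, then merge the slices.
        runs = []
        for level in self.levels:
            runs.append(level[bisect_left(level, lo):bisect_right(level, hi)])
        return list(merge(*runs))


cola = SimpleCOLA()
for x in [5, 3, 8, 1, 9]:
    cola.insert(x)
print([len(level) for level in cola.levels])
print(cola.range_query(3, 8))
```

Because every merge is a sequential scan over sorted arrays, an element that moves through a level costs O(1/B) amortized block transfers there, which is where the amortized O((log N)/B) insertion bound comes from in this kind of layout.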

References

[1]
A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. Commun. ACM, 31(9):1116--1127, Sept. 1988.
[2]
L. Arge, M. A. Bender, E. D. Demaine, B. Holland-Minkley, and J. I. Munro. Cache-oblivious priority queue and graph algorithm applications. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pages 268--276, Montréal, Québec, Canada, May 2002.
[3]
L. Arge and J. S. Vitter. Optimal external memory interval management. SIAM Journal on Computing, 32(6):1488--1508, 2003.
[4]
R. Bayer and E. M. McCreight. Organization and maintenance of large ordered indexes. Acta Inf., 1(3):173--189, Feb. 1972.
[5]
M. A. Bender, R. Cole, E. D. Demaine, and M. Farach-Colton. Scanning and traversing: Maintaining data for traversals in a memory hierarchy. In Proc. 10th Annual European Symp. on Algorithms (ESA), pages 139--151, Rome, Italy, Sept. 2002.
[6]
M. A. Bender, E. D. Demaine, and M. Farach-Colton. Cache-oblivious B-trees. SIAM J. Comput., 35(2):341--358, 2005. An earlier version of this paper appeared in Proc. 41st Annual Symp. on Foundations of Computer Science (FOCS), pages 399--409, Redondo Beach, California, 2000.
[7]
M. A. Bender, Z. Duan, J. Iacono, and J. Wu. A locality-preserving cache-oblivious dynamic dictionary. J. Algorithms, 53(2):115--136, 2004.
[8]
M. A. Bender, M. Farach-Colton, and B. Kuszmaul. Cache-oblivious string B-trees. In Proc. 25th Symposium on Principles of Database Systems (PODS), pages 233--242, Chicago, Illinois, June 2006.
[9]
J. L. Bentley and J. B. Saxe. Decomposable searching problems I: Static-to-dynamic transformation. J. Algorithms, 1(4):301--358, 1980.
[10]
G. S. Brodal and R. Fagerberg. Lower bounds for external memory dictionaries. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 546--554, Baltimore, Maryland, May 2003.
[11]
G. S. Brodal, R. Fagerberg, and R. Jacob. Cache oblivious search trees via binary trees of small height. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 39--48, San Francisco, California, Jan. 2002.
[12]
A. L. Buchsbaum, M. Goldwasser, S. Venkatasubramanian, and J. R. Westbrook. On external memory graph traversal. In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 859--860, San Francisco, California, Jan. 2000.
[13]
B. Chazelle and L. J. Guibas. Fractional cascading: I. A data structuring technique. Algorithmica, 1(2):133--162, 1986.
[14]
D. Comer. The ubiquitous B-tree. ACM Comput. Surv., 11(2):121--137, June 1979.
[15]
M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In Proc. 40th Annual Symp. on Foundations of Computer Science (FOCS), pages 285--297, New York, New York, Oct. 1999.
[16]
I. Katriel. Implicit data structures based on local reorganizations. Master's thesis, Technion, Israel Inst. of Tech., Haifa, May 2002.
[17]
D. E. Knuth. Sorting and Searching, volume 3 of The Art of Computer Programming. Addison-Wesley, Reading, Massachusetts, 1973.
[18]
J. I. Munro, T. Papadakis, and R. Sedgewick. Deterministic skip lists. In Proceedings of the 3rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 367--375, Orlando, Florida, January 1992.
[19]
M. H. Overmars. The Design of Dynamic Data Structures. Springer, 1983.
[20]
H. Prokop. Cache-oblivious algorithms. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Inst. of Tech., June 1999.
[21]
Sleepycat Software. The Berkeley Database. http://www.sleepycat.com, 2005.

Published In

SPAA '07: Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
June 2007
376 pages
ISBN:9781595936677
DOI:10.1145/1248377
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. buffered repository tree
  2. cache-oblivious B-tree
  3. cascading array
  4. deamortized
  5. lookahead array
  6. shuttle tree

Qualifiers

  • Article

Conference

SPAA07

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Cited By

  • (2024) Competitive Data-Structure Dynamization. ACM Transactions on Algorithms 20:4 (1-28). DOI: 10.1145/3672614. Online publication date: 28-Jun-2024
  • (2024) History-Independent Dynamic Partitioning: Operation-Order Privacy in Ordered Data Structures. Proceedings of the ACM on Management of Data 2:2 (1-27). DOI: 10.1145/3651609. Online publication date: 14-May-2024
  • (2023) SRockDB: A Range-Query Optimized Database Based on RocksDB. 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (2220-2225). DOI: 10.1109/SMC53992.2023.10394667. Online publication date: 1-Oct-2023
  • (2022) LogStore: A Workload-Aware, Adaptable Key-Value Store on Hybrid Storage Systems. IEEE Transactions on Knowledge and Data Engineering 34:8 (3867-3882). DOI: 10.1109/TKDE.2020.3027191. Online publication date: 1-Aug-2022
  • (2022) Using advanced data structures to enable responsive security monitoring. Cluster Computing 25:4 (2893-2914). DOI: 10.1007/s10586-021-03463-5. Online publication date: 24-Jan-2022
  • (2021) Competitive data-structure dynamization. Proceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (2269-2287). DOI: 10.5555/3458064.3458199. Online publication date: 10-Jan-2021
  • (2021) Timely Reporting of Heavy Hitters Using External Memory. ACM Transactions on Database Systems 46:4 (1-35). DOI: 10.1145/3472392. Online publication date: 15-Nov-2021
  • (2021) External-memory Dictionaries in the Affine and PDAM Models. ACM Transactions on Parallel Computing 8:3 (1-20). DOI: 10.1145/3470635. Online publication date: 20-Sep-2021
  • (2021) Copy-on-Abundant-Write for Nimble File System Clones. ACM Transactions on Storage 17:1 (1-27). DOI: 10.1145/3423495. Online publication date: 29-Jan-2021
  • (2020) Neural Trees. Proceedings of the 12th USENIX Conference on Hot Topics in Storage and File Systems (17-17). DOI: 10.5555/3488733.3488750. Online publication date: 13-Jul-2020