ABSTRACT
The B-tree is a fundamental external index structure that is widely used for answering one-dimensional range reporting queries. Given a set of $N$ keys, a range query can be answered in $O(\log_B \frac{N}{M} + \frac{K}{B})$ I/Os, where $B$ is the disk block size, $K$ the output size, and $M$ the size of the main memory buffer. When keys are inserted or deleted, the B-tree is updated in $O(\log_B N)$ I/Os if we require the resulting changes to be committed to disk right away. Otherwise, the memory buffer can be used to buffer the recent updates, and changes can be written to disk in batches, which significantly lowers the amortized update cost. A systematic way of batching up updates is to use the logarithmic method, combined with fractional cascading, resulting in a dynamic B-tree that supports insertions in $O(\frac{1}{B} \log \frac{N}{M})$ I/Os and queries in $O(\log \frac{N}{M} + \frac{K}{B})$ I/Os. Such bounds have also been matched by several known dynamic B-tree variants in the database literature. Note, however, that the query cost of these dynamic B-trees is substantially worse than the $O(\log_B \frac{N}{M} + \frac{K}{B})$ bound of the static B-tree by a factor of $\Theta(\log B)$.
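The logarithmic method mentioned above can be illustrated with a minimal in-memory sketch. It is not the external-memory structure analyzed in the paper: it ignores the block size $B$ and the memory buffer $M$, omits fractional cascading, and handles insertions only. The class name `LogMethodIndex` and its methods are hypothetical names chosen for this illustration.

```python
from bisect import bisect_left, bisect_right
from heapq import merge


class LogMethodIndex:
    """In-memory sketch of the logarithmic method for 1-D range reporting.

    Assumption: keys live in sorted Python lists instead of disk blocks,
    so this shows only the batching pattern, not the I/O bounds.
    """

    def __init__(self):
        # levels[i] is either empty or a sorted run of exactly 2**i keys.
        self.levels = []

    def insert(self, key):
        carry = [key]
        i = 0
        # Like binary addition: merge equal-sized runs upward
        # until an empty level is found.
        while i < len(self.levels) and self.levels[i]:
            carry = list(merge(self.levels[i], carry))
            self.levels[i] = []
            i += 1
        if i == len(self.levels):
            self.levels.append([])
        self.levels[i] = carry

    def range_query(self, lo, hi):
        """Report all keys k with lo <= k <= hi, in sorted order."""
        out = []
        # Every non-empty level is searched separately; this per-level
        # search is what fractional cascading is meant to speed up.
        for run in self.levels:
            out.extend(run[bisect_left(run, lo):bisect_right(run, hi)])
        return sorted(out)
```

Each key takes part in at most one merge per level, which is where the logarithmic amortized insertion cost comes from, while a query has to visit every level. That asymmetry is the source of the extra logarithmic factor in the query bound relative to a single static B-tree.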
In this paper, we prove that for any dynamic one-dimensional range query index structure with query cost $O(q + \frac{K}{B})$ and amortized insertion cost $O(u/B)$, the tradeoff $q \cdot \log(u/q) = \Omega(\log B)$ must hold if $q = O(\log B)$. For most reasonable values of the parameters, we have $\frac{N}{M} = B^{O(1)}$, in which case our query-insertion tradeoff implies that the bounds mentioned above are already optimal. We also prove a lower bound of $u \cdot \log q = \Omega(\log B)$, which is relevant for larger values of $q$. Our lower bounds hold in a dynamic version of the indexability model, which is of independent interest. Dynamic indexability is a clean yet powerful model for studying dynamic indexing problems, and can potentially lead to more interesting complexity results.
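As an informal reading of the two tradeoffs, ignoring constant factors and not taken from the paper's own derivations, three parameter settings can be worked out directly from the stated bounds:

\[
\begin{aligned}
q = \Theta(1):\quad & q \cdot \log(u/q) = \Omega(\log B) \;\Rightarrow\; \log(u/q) = \Omega(\log B) \;\Rightarrow\; u = B^{\Omega(1)}, \\
q = \Theta(\log B):\quad & q \cdot \log(u/q) = \Omega(\log B) \;\Rightarrow\; \log(u/q) = \Omega(1) \;\Rightarrow\; u = \Omega(q) = \Omega(\log B), \\
u = \Theta(1):\quad & u \cdot \log q = \Omega(\log B) \;\Rightarrow\; \log q = \Omega(\log B) \;\Rightarrow\; q = B^{\Omega(1)}.
\end{aligned}
\]

The middle case corresponds to the logarithmic-method bounds when $\frac{N}{M} = B^{O(1)}$: there $q = \log \frac{N}{M} = O(\log B)$ and $u = \log \frac{N}{M} = O(\log B)$, so neither cost can be improved asymptotically without worsening the other.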
Index Terms
Dynamic indexability and lower bounds for dynamic one-dimensional range query indexes