ABSTRACT
In this tutorial we describe some of the recent advances in the development of worst-case efficient range search indexing structures, that is, structures for storing a set of data points such that the points in an axis-parallel (hyper-) query rectangle can be found efficiently (with as few disk accesses, or I/Os, as possible). We first briefly discuss the well-known and optimal structure for the one-dimensional version of the problem, the B-tree [10, 12], along with its variants: weight-balanced B-trees [9], multi-version (or persistent) B-trees [6, 11, 13, 22] and buffer trees [4]. Then we discuss the external priority search tree [8], which solves a restricted version of the two-dimensional problem where the query rectangle is unbounded on one side. This structure is then used in a range tree index structure [8, 21] that answers general two-dimensional queries in the same number of I/Os as the B-tree in the one-dimensional case, but using super-linear space. We also describe the linear-space kdB-tree [19, 20] and O-tree [17] index structures, which also solve the problem efficiently (but using more I/Os than the range tree). A detailed presentation of all the above structures can be found in lecture notes by the author [5]. Finally, we discuss lower bound techniques, most notably the theory of indexability [16], that can be used to prove that the range tree and the kdB-tree/O-tree are optimal among query-efficient and linear-space structures, respectively [2, 8, 17], as well as recent index structures for higher-dimensional range search indexing [1]. We end by mentioning various R-tree variants [7, 15, 18] that can be used to solve the extended version of range search indexing where the queries as well as the data are (hyper-) rectangles. More comprehensive surveys of efficient index structures can be found in [3, 14, 23].
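As a rough illustration of what "as few I/Os as possible" means in the one-dimensional case, the following Python sketch models the leaf level of a B-tree as a sorted list cut into blocks of B keys and counts how many blocks a range query touches. The abstract does not state a bound; the sketch assumes the standard B-tree query cost of O(log_B N + T/B) I/Os for N points and T reported points. The function names (`leaf_blocks_touched`, `search_path_length`) and the in-memory list are illustrative assumptions, not part of the tutorial.

```python
from bisect import bisect_left, bisect_right


def leaf_blocks_touched(sorted_keys, lo, hi, block_size):
    """Report the keys in [lo, hi] and the number of leaf blocks read.

    Assumption: the leaf level is a sorted list split into consecutive
    blocks of `block_size` keys, and reading one block costs one I/O.
    """
    start = bisect_left(sorted_keys, lo)    # index of first key >= lo
    end = bisect_right(sorted_keys, hi)     # index one past last key <= hi
    reported = sorted_keys[start:end]
    if not reported:
        return reported, 0
    first_block = start // block_size
    last_block = (end - 1) // block_size
    return reported, last_block - first_block + 1


def search_path_length(num_keys, block_size):
    """Number of internal levels above the leaves for fan-out `block_size`.

    This is the log_B N term: each level shrinks the number of blocks
    by a factor of `block_size` until a single root block remains.
    """
    blocks = -(-num_keys // block_size)     # ceil(N / B) leaf blocks
    levels = 0
    while blocks > 1:
        blocks = -(-blocks // block_size)
        levels += 1
    return levels


if __name__ == "__main__":
    keys = list(range(100))                 # N = 100 points on a line
    reported, leaf_ios = leaf_blocks_touched(keys, 25, 47, block_size=10)
    print(len(reported), leaf_ios, search_path_length(len(keys), 10))
```

In this toy model a query reporting T keys reads at most ceil(T/B) + 1 leaf blocks, which is where the T/B term comes from; the two-dimensional structures discussed in the tutorial aim for comparable output-sensitive behaviour.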
- P. Afshani, L. Arge, and K.D. Larsen. Orthogonal range reporting in three and higher dimensions. Submitted, 2009.
- P.K. Agarwal, M. de Berg, J. Gudmundsson, M. Hammer, and H.J. Haverkort. Box-trees and R-trees with near-optimal query time. In Proc. ACM Symposium on Computational Geometry, pages 124--133, 2001.
- L. Arge. External memory data structures. In J. Abello, P.M. Pardalos, and M.G.C. Resende, editors, Handbook of Massive Data Sets, pages 313--358. Kluwer Academic Publishers, 2002.
- L. Arge. The buffer tree: A technique for designing batched external data structures. Algorithmica, 37(1):1--24, 2003.
- L. Arge. External-memory geometric data structures. In G.S. Brodal and R. Fagerberg, editors, EEF Summer School on Massive Datasets. Springer Verlag, 2004.
- L. Arge, A. Danner, and S.-H. Teh. I/O-efficient point location using persistent B-trees. In Proc. Workshop on Algorithm Engineering and Experimentation, 2003.
- L. Arge, M. de Berg, H.J. Haverkort, and K. Yi. The priority R-tree: A practically efficient and worst-case optimal R-tree. In Proc. SIGMOD International Conference on Management of Data, pages 347--358, 2004.
- L. Arge, V. Samoladas, and J.S. Vitter. On two-dimensional indexability and optimal range search indexing. In Proc. ACM Symposium on Principles of Database Systems, pages 346--357, 1999.
- L. Arge and J.S. Vitter. Optimal external memory interval management. SIAM Journal on Computing, 32(6):1488--1508, 2003.
- R. Bayer and E. McCreight. Organization and maintenance of large ordered indexes. Acta Informatica, 1:173--189, 1972.
- B. Becker, S. Gschwind, T. Ohler, B. Seeger, and P. Widmayer. An asymptotically optimal multiversion B-tree. VLDB Journal, 5(4):264--275, 1996.
- D. Comer. The ubiquitous B-tree. ACM Computing Surveys, 11(2):121--137, 1979.
- J.R. Driscoll, N. Sarnak, D.D. Sleator, and R. Tarjan. Making data structures persistent. Journal of Computer and System Sciences, 38:86--124, 1989.
- V. Gaede and O. Günther. Multidimensional access methods. ACM Computing Surveys, 30(2):170--231, 1998.
- A. Guttman. R-trees: A dynamic index structure for spatial searching. In Proc. SIGMOD International Conference on Management of Data, pages 47--57, 1984.
- J. Hellerstein, E. Koutsoupias, D. Miranker, C. Papadimitriou, and V. Samoladas. On a model of indexability and its bounds for range queries. Journal of the ACM, 49(1), 2002.
- K.V.R. Kanth and A.K. Singh. Optimal dynamic range searching in non-replicating index structures. In Proc. International Conference on Database Theory, LNCS 1540, pages 257--276, 1999.
- Y. Manolopoulos, A. Nanopoulos, A.N. Papadopoulos, and Y. Theodoridis. R-trees have grown everywhere. Technical report, available at http://www.rtreeportal.org, 2003.
- O. Procopiuc, P.K. Agarwal, L. Arge, and J.S. Vitter. Bkd-tree: A dynamic scalable kd-tree. In Proc. International Symposium on Spatial and Temporal Databases, LNCS 2750, 2003.
- J. Robinson. The K-D-B tree: A search structure for large multidimensional dynamic indexes. In Proc. SIGMOD International Conference on Management of Data, pages 10--18, 1981.
- S. Subramanian and S. Ramaswamy. The P-range tree: A new data structure for range searching in secondary memory. In Proc. ACM-SIAM Symposium on Discrete Algorithms, pages 378--387, 1995.
- P.J. Varman and R.M. Verma. An efficient multiversion access structure. IEEE Transactions on Knowledge and Data Engineering, 9(3):391--409, 1997.
- J.S. Vitter. External memory algorithms and data structures: Dealing with MASSIVE data. ACM Computing Surveys, 33(2):209--271, 2001.
Index Terms
Worst-case efficient range search indexing: invited tutorial