ABSTRACT
We consider the skyline problem (a.k.a. the maxima problem), which has been extensively studied in the database community. The input is a set P of d-dimensional points. A point dominates another if the former has a lower coordinate than the latter on every dimension. The goal is to find the skyline, which is the set of points p ∈ P such that p is not dominated by any other data point. In the external-memory model, the 2-d version of the problem is known to be solvable in $O((N/B)\log_{M/B}(N/B))$ I/Os, where N is the cardinality of P, B the size of a disk block, and M the capacity of main memory. For fixed d ≥ 3, we present an algorithm with I/O-complexity $O((N/B)\log^{d-2}_{M/B}(N/B))$. Previously, the best solution was adapted from an in-memory algorithm, and required $O((N/B)\log^{d-2}_{2}(N/M))$ I/Os.
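To make the definitions of dominance and skyline concrete, here is a minimal in-memory sketch. It is a quadratic-time illustration of the problem statement only, not the external-memory algorithm the abstract announces; the function names `dominates` and `skyline` and the sample points are assumptions introduced for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p has a lower coordinate than q on every dimension."""
    return all(a < b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (naive O(N^2) scan)."""
    # A point never dominates itself under the strict definition,
    # so comparing each point against the whole list is safe.
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical 2-d sample data, chosen only for illustration.
    sample = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
    print(skyline(sample))
```

In this sample, (3, 3) and (6, 6) are both dominated by (2, 2), so only the remaining three points are in the skyline.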
Index Terms
On finding skylines in external memory