DOI: 10.1145/1989284.1989298
research-article

On finding skylines in external memory

Published: 13 June 2011

ABSTRACT

We consider the skyline problem (a.k.a. the maxima problem), which has been extensively studied in the database community. The input is a set $P$ of $d$-dimensional points. A point dominates another if the former has a lower coordinate than the latter on every dimension. The goal is to find the skyline, which is the set of points $p \in P$ such that $p$ is not dominated by any other data point. In the external-memory model, the 2-d version of the problem is known to be solvable in $O((N/B)\log_{M/B}(N/B))$ I/Os, where $N$ is the cardinality of $P$, $B$ the size of a disk block, and $M$ the capacity of main memory. For fixed $d \ge 3$, we present an algorithm with I/O-complexity $O((N/B)\log^{d-2}_{M/B}(N/B))$. Previously, the best solution was adapted from an in-memory algorithm, and requires $O((N/B)\log^{d-2}_{2}(N/M))$ I/Os.
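
To make the definition of dominance and of the skyline concrete, the following is a minimal in-memory sketch in Python. It is a naive quadratic-time illustration of the problem statement only, not the external-memory algorithm the abstract refers to; the names `dominates` and `naive_skyline` and the sample points are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p has a strictly lower coordinate than q on every dimension."""
    return all(a < b for a, b in zip(p, q))


def naive_skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates.

    Compares every pair of points, so it takes O(N^2) comparisons and
    assumes the whole input fits in memory (an assumption of this sketch).
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical 2-d input: (3, 3) and (4, 4) are both dominated by (2, 2).
sample = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
skyline = naive_skyline(sample)
print(skyline)
```

Because dominance here is strict on every dimension, a point never dominates itself, and two identical points would both remain in the result.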


Published in

PODS '11: Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2011
332 pages
ISBN: 9781450306607
DOI: 10.1145/1989284

Copyright © 2011 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 476 of 1,835 submissions, 26%
