DOI: 10.1145/1376916.1376924
research-article

Evaluating rank joins with optimal cost

Published: 09 June 2008

ABSTRACT

In the rank join problem, we are given a set of relations and a scoring function, and the goal is to return the join results with the top K scores. In practice, the inputs can often be accessed in ranked order and the scoring function is monotonic. These conditions allow for efficient algorithms that solve the rank join problem without reading all of the input. In this paper, we present a thorough analysis of such rank join algorithms. A strong point of our analysis is that it is based on a more general problem statement than previous work, making it more relevant to the execution model that is employed by database systems. One of our results indicates that the well-known HRJN algorithm has shortcomings, because it does not stop reading its input as soon as possible. We find that it is NP-hard to overcome this weakness in the general case, but cases of limited query complexity are tractable. We prove the latter with an algorithm that infers provably tight bounds on the potential benefit of reading more input in order to stop as soon as possible. As a result, the algorithm achieves a cost that is within a constant factor of optimal.
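To make the problem setting concrete, the following is a minimal sketch of a threshold-based rank join over two inputs, in the spirit of HRJN-style operators. It is not the algorithm proposed in the paper. The function name `rank_join_top_k`, the `(join_key, score)` tuple layout, the use of addition as the monotonic scoring function, and the alternating read order are all assumptions made for illustration.

```python
import heapq
from itertools import count


def rank_join_top_k(left, right, k):
    """Illustrative two-way rank join (hypothetical helper, not from the paper).

    left, right: lists of (join_key, score), each sorted by score descending.
    Assumed scoring function: sum of the two input scores (monotonic).
    Returns (top-k results as (join_key, score), number of input tuples read).
    """
    inputs = (left, right)
    pos = [0, 0]          # how many tuples have been read from each input
    index = ({}, {})      # per input: join_key -> scores read so far
    candidates = []       # max-heap of formed join results (negated scores)
    tie = count()         # tie-breaker so heap entries always compare
    results = []
    turn = 0
    neg_inf, inf = float("-inf"), float("inf")

    def threshold():
        # Upper bound on the score of any join result not yet formed.
        bounds = []
        for i in (0, 1):
            j = 1 - i
            if pos[i] >= len(inputs[i]):
                bounds.append(neg_inf)   # input i has no unseen tuples
            elif pos[i] == 0 or pos[j] == 0:
                bounds.append(inf)       # nothing known yet about one side
            else:
                last_i = inputs[i][pos[i] - 1][1]  # lowest score read from i
                top_j = inputs[j][0][1]            # highest score in j
                bounds.append(last_i + top_j)
        return max(bounds)

    while len(results) < k:
        t = threshold()
        # Emit formed results that no unformed result can beat.
        while candidates and len(results) < k and -candidates[0][0] >= t:
            neg_score, _, key = heapq.heappop(candidates)
            results.append((key, -neg_score))
        if len(results) >= k or t == neg_inf:
            break
        # Alternate between inputs, skipping an exhausted one.
        i = turn if pos[turn] < len(inputs[turn]) else 1 - turn
        turn = 1 - i
        key, score = inputs[i][pos[i]]
        pos[i] += 1
        index[i].setdefault(key, []).append(score)
        # Join the new tuple with matching tuples already read from the other input.
        for other in index[1 - i].get(key, []):
            heapq.heappush(candidates, (-(score + other), next(tie), key))
    return results, sum(pos)


left = [("a", 9), ("b", 8), ("c", 1)]
right = [("b", 10), ("a", 7), ("c", 2)]
top2, reads = rank_join_top_k(left, right, 2)
```

In this small example the top two results are found after reading five of the six input tuples. The sketch stops only when its score bound permits, and that bound is deliberately simple; the abstract's point is that tighter bounds can allow an operator to stop reading earlier.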


Published in

PODS '08: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2008
330 pages
ISBN: 9781605581521
DOI: 10.1145/1376916

Copyright © 2008 ACM

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 476 of 1,835 submissions, 26%
