ABSTRACT
We consider the problem of optimizing and executing multiple continuous queries, where each query is a conjunction of filters and each filter may occur in multiple queries. When filters are expensive, significant performance gains are achieved by sharing filter evaluations across queries. A shared execution strategy in our scenario can either be fixed, in which filters are evaluated in the same predetermined order for all input, or adaptive, in which the next filter to be evaluated is chosen at runtime based on the results of the filters evaluated so far. We show that as filter costs increase, the best adaptive strategy is superior to any fixed strategy, despite the overhead of adaptivity. We show that it is NP-hard to find the optimal adaptive strategy, even if we are willing to approximate within any factor smaller than m, where m is the number of queries. We then present a greedy adaptive execution strategy and show that it approximates the best adaptive strategy to within a factor O(log^2 m log n), where n is the number of distinct filters. We also give a precomputation technique that can reduce the execution overhead of adaptive strategies.
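The distinction between fixed and adaptive shared execution can be illustrated with a minimal Python sketch. It is not the greedy strategy of the paper, which the abstract does not specify. The selection rule used here (lowest cost per live query that uses the filter) is only a placeholder heuristic, and the names run_fixed, run_adaptive, QUERIES, FILTERS and COSTS, along with the example filters and costs, are hypothetical.

```python
from typing import Any, Callable, Dict, List, Set, Tuple

# Hypothetical example data: three filters shared by three conjunctive queries.
FILTERS: Dict[str, Callable[[Any], bool]] = {
    "a": lambda x: x > 0,
    "b": lambda x: x % 2 == 0,
    "c": lambda x: x < 100,
}
QUERIES: Dict[str, Set[str]] = {"q1": {"a", "b"}, "q2": {"b", "c"}, "q3": {"a", "c"}}
COSTS: Dict[str, float] = {"a": 4.0, "b": 1.0, "c": 2.0}  # assumed per-evaluation costs


def run_fixed(queries, filters, order: List[str], item) -> Tuple[Set[str], List[str]]:
    """Fixed strategy: same predetermined order for every input item.

    A filter is skipped only when no still-live query needs it.
    `order` is assumed to list every filter exactly once.
    """
    live = set(queries)          # queries not yet known to fail
    evaluated: List[str] = []
    for f in order:
        if not any(f in queries[q] for q in live):
            continue             # result cannot affect any live query
        evaluated.append(f)
        if not filters[f](item):
            live = {q for q in live if f not in queries[q]}
    return live, evaluated       # live queries had all their filters pass


def run_adaptive(queries, filters, costs, item) -> Tuple[Set[str], List[str]]:
    """Adaptive strategy: pick the next filter from the results seen so far.

    Placeholder heuristic (an assumption, not the paper's algorithm):
    choose the pending filter with the lowest cost per live query using it.
    """
    live = set(queries)
    done: Set[str] = set()
    evaluated: List[str] = []
    while True:
        pending = {f for q in live for f in queries[q] if f not in done}
        if not pending:
            break

        def score(f: str) -> float:
            users = sum(1 for q in live if f in queries[q])
            return costs[f] / users

        f = min(sorted(pending), key=score)  # sorted() makes ties deterministic
        done.add(f)
        evaluated.append(f)
        if not filters[f](item):
            live = {q for q in live if f not in queries[q]}
    return live, evaluated
```

Both functions evaluate each filter at most once per item and share the result across every query that contains it. The adaptive version recomputes its choice after each result, which is the runtime overhead that the precomputation technique mentioned in the abstract is meant to reduce.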
Index Terms
Optimization of continuous queries with shared expensive filters