ABSTRACT
Skyline computation is widely used in multi-criteria decision making. As research on uncertain databases draws increasing attention, skyline queries over uncertain data, e.g. probabilistic skylines, have also been studied. Previous work relies on "thresholding" for its efficiency: it assumes that points whose skyline probability falls below a certain threshold can be ignored. There are situations, however, where thresholding is undesirable, because low-probability events cannot be ignored when their consequences are significant. In such cases it is necessary to compute the skyline probabilities of all data items. We provide the first algorithm for this problem whose worst-case time complexity is sub-quadratic. The techniques we use are interesting in their own right, as they combine a space partitioning technique with an existing dominance counting algorithm. The effectiveness of our algorithm is experimentally verified.
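To make the problem statement concrete, the following minimal sketch computes the skyline probability of every data item by brute force. It is not the sub-quadratic algorithm described in the abstract. It is the straightforward quadratic baseline, written under a simplified model that is an assumption here: each data item is a single point that exists independently with a given probability, and smaller values are preferred in every dimension. The names `dominates` and `skyline_probabilities` are hypothetical.

```python
from math import prod
from typing import List, Sequence, Tuple

# Assumed model (not taken from the paper): an item is (coordinates, existence probability).
Item = Tuple[Sequence[float], float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere.

    Smaller values are treated as better (an assumption for this sketch).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probabilities(items: List[Item]) -> List[float]:
    """Quadratic baseline: compare every item against every other item.

    An item is in the skyline when it exists and none of its dominators exist,
    so under independence its skyline probability is
    p_i * product over dominators j of (1 - p_j).
    """
    result = []
    for i, (coords_i, p_i) in enumerate(items):
        no_dominator_exists = prod(
            1.0 - p_j
            for j, (coords_j, p_j) in enumerate(items)
            if j != i and dominates(coords_j, coords_i)
        )
        result.append(p_i * no_dominator_exists)
    return result


if __name__ == "__main__":
    data = [((1, 1), 0.5), ((2, 2), 0.8), ((0, 3), 1.0)]
    for (coords, p), sp in zip(data, skyline_probabilities(data)):
        print(coords, p, sp)
```

No threshold is applied: every item receives a probability, however small. The nested comparison over all pairs is what makes this baseline quadratic in the number of items.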
Index Terms
Computing all skyline probabilities for uncertain data