Efficient Algorithms for Uncertain Restricted Skyline Query Processing

With the rapid growth of uncertain data, query processing on uncertain data has become an important research area. Although considerable efforts have been devoted to answering certain types of queries on uncertain data, how to perform restricted skyline (rskyline) queries on uncertain data remains a...

Full description

Saved in:
Bibliographic Details
Published inThe VLDB journal Vol. 34; no. 4; p. 46
Main Authors Gao, Xiangyu, Xiao, Xingxing, Pan, Xiao, Miao, Dongjing, Li, Jianzhong
Format Journal Article
LanguageEnglish
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.07.2025
Springer Nature B.V
Subjects
Online AccessGet full text
ISSN1066-8888
0949-877X
0949-877X
DOI10.1007/s00778-025-00925-9

Cover

More Information
Summary:With the rapid growth of uncertain data, query processing on uncertain data has become an important research area. Although considerable efforts have been devoted to answering certain types of queries on uncertain data, how to perform restricted skyline (rskyline) queries on uncertain data remains an open problem. To fill the gap, this paper studies the all rskyline probabilities (ARSP) problem, which aims to compute the probability of each uncertain tuple appearing in the rskyline, and the most-likely rskyline (MLRS) problem, which aims to identify a set of uncertain tuples with the highest probability of being the rskyline. We prove that no algorithm can solve the ARSP problem in strongly subquadratic time, unless the orthogonal vectors conjecture fails and the MLRS problem is NP-hard. When F is a set of linear scoring functions subject to a set of linear constraints on weights, we propose two efficient algorithms for solving the ARSP problem. For a special linear constraint, we further develop an algorithm with sublinear query time. For the MLRS problem, we first design a series of data reduction rules to reduce the input data size. Then, we propose two exact algorithms with different search strategies, as well as a local search based approximation algorithm to further improve the time efficiency. Experimental results show that these two problems provide complementary and comprehensive perspectives on rskylines of uncertain datasets, and demonstrate the effectiveness and efficiency of the proposed algorithms.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1066-8888
0949-877X
0949-877X
DOI:10.1007/s00778-025-00925-9