Top-k Best Probability Queries on Probabilistic Data
Abstract
There has been much interest in answering top-k queries on probabilistic data in various applications such as market analysis, personalised services, and decision making. For probabilistic data, the most common problem in answering top-k queries is choosing the semantics of the results according to their scores and top-k probabilities. In this paper, we propose a novel top-k best probability query that returns results with not only the best top-k scores but also the best top-k probabilities. We also introduce an efficient algorithm for top-k best probability queries that does not require a user-defined threshold. We then analyse the top-k best probability answer, which satisfies the semantic ranking properties of queries on uncertain data [3, 18]. The experimental studies use real data to verify both the effectiveness of the top-k best probability queries and the efficiency of our algorithm.
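To make the notion of a top-k probability concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a tuple-independent model in which each tuple has a distinct score and an independent existence probability, and it takes the top-k probability of a tuple to be the probability that the tuple exists while fewer than k higher-scored tuples exist. The function name `top_k_probabilities` and the input layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical input layout: (tuple id, score, existence probability).
UncertainTuple = Tuple[str, float, float]


def top_k_probabilities(tuples: List[UncertainTuple], k: int) -> List[UncertainTuple]:
    """Return (id, score, top-k probability) for each tuple, in descending score order.

    Assumes independent tuples with distinct scores (an illustrative
    simplification, not taken from the paper).
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    ranked = sorted(tuples, key=lambda t: t[1], reverse=True)

    # dist[j] = probability that exactly j of the higher-scored tuples exist,
    # tracked only for j < k because larger counts cannot contribute.
    dist = [1.0] + [0.0] * (k - 1)
    result = []

    for tuple_id, score, prob in ranked:
        # The tuple is in the top-k if it exists and fewer than k
        # higher-scored tuples exist.
        result.append((tuple_id, score, prob * sum(dist)))

        # Fold this tuple into the distribution for the tuples ranked below it.
        updated = [0.0] * k
        for j in range(k):
            updated[j] = dist[j] * (1.0 - prob)
            if j > 0:
                updated[j] += dist[j - 1] * prob
        dist = updated

    return result


if __name__ == "__main__":
    data = [("a", 90.0, 0.5), ("b", 80.0, 0.8), ("c", 70.0, 1.0)]
    for tuple_id, score, topk_prob in top_k_probabilities(data, k=2):
        print(tuple_id, score, round(topk_prob, 3))
```

In this sketch a tuple with a high score can still have a low top-k probability, and vice versa, which is the tension between scores and top-k probabilities that the abstract describes.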
Similar resources
Top-k best probability queries and semantics ranking properties on probabilistic databases
There has been much interest in answering top-k queries on probabilistic data in various applications such as market analysis, personalised services, and decision making. In probabilistic relational databases, the most common problem in answering top-k queries (ranking queries) is selecting the top-k result based on scores and top-k probabilities. In this paper, we first propose novel answers...
Top-k Query Evaluation with Probabilistic Guarantees
Martin Theobald, Gerhard Weikum, Ralf Schenkel. Max-Planck Institute of Computer Science, D-66123 Saarbruecken, Germany. {mtb, weikum, schenkel}@mpi-sb.mpg.de. Abstract: Top-k queries based on ranking elements of multidimensional datasets are a fundamental building block for many kinds of information discovery. The best known general-purpose algorithm for evaluating top-k queries is Fagin’s thresho...
Sensitivity Analysis and Explanations for Robust Query Evaluation in Probabilistic Databases
Probabilistic database systems have successfully established themselves as a tool for managing uncertain data. However, much of the research in this area has focused on efficient query evaluation and has largely ignored two key issues that commonly arise in uncertain data management: First, how to provide explanations for query results, e.g., “Why is this tuple in my result?” or “Why does this...
Ph.D. Dissertation Proposal: Probabilities and Sets in Preference Querying
User preferences in databases are attracting increasing interest with the boom of information systems and the trend of personalization. In the literature, there are two different frameworks on this topic, namely quantitative approaches and qualitative approaches. The former assumes the availability of a scoring function, while the latter does not. Instead, in qualitative approaches, preferences...
Robust Ranking of Uncertain Data
Numerous real-life applications are continually generating huge amounts of uncertain data (e.g., sensor or RFID readings). As a result, top-k queries that return only the k most promising probabilistic tuples become an important means to monitor and analyze such data. These “top” tuples should have both high scores in terms of some ranking function, and high occurrence probability. The previous ...