Search results for: top k algorithm

Number of results: 1195242

Journal: CoRR 2017
Willi Mann Nikolaus Augsten Christian S. Jensen

We provide efficient support for applications that aim to continuously find pairs of similar sets in rapid streams of sets. A prototypical example setting is that of tweets. A tweet is a set of words, and Twitter emits about half a billion tweets per day. Our solution makes it possible to efficiently maintain the top-k most similar tweets from a pair of rapid Twitter streams, e.g., to discover ...
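As a rough illustration of the task described above (not the authors' method), the sketch below keeps the k most similar pairs drawn from two collections of sets by brute force, using a bounded min-heap. The choice of Jaccard similarity and the names `jaccard` and `top_k_similar_pairs` are assumptions made for this example; the abstract does not name them.

```python
import heapq
from itertools import product


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def top_k_similar_pairs(stream_a, stream_b, k):
    """Return the k most similar (similarity, i, j) triples, best first.

    Brute force over all pairs; a bounded min-heap keeps only the
    k best pairs seen so far, so memory stays O(k).
    """
    if k <= 0:
        return []
    heap = []  # min-heap; heap[0] is the weakest pair currently kept
    for (i, a), (j, b) in product(enumerate(stream_a), enumerate(stream_b)):
        item = (jaccard(a, b), i, j)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


# Hypothetical toy data: each "tweet" is a set of words.
tweets_a = [{"top", "k", "query"}, {"stream", "join"}]
tweets_b = [{"top", "k", "join"}, {"stream", "join", "set"}]
result = top_k_similar_pairs(tweets_a, tweets_b, 2)
```

A real streaming solution would also need to expire old sets and prune candidate pairs; this sketch only shows the top-k bookkeeping.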

Journal: CoRR 2013
Andreas Kosmatopoulos Kostas Tsichlas

Let S be a dataset of n 2-dimensional points. The top-k dominating query aims to report the k points that dominate the most points in S. A point p dominates a point q iff all coordinates of p are smaller than or equal to those of q and at least one of them is strictly smaller. The top-k dominating query combines the dominance concept of maxima queries with the ranking function of top-k queries...
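The dominance definition above can be made concrete with a naive quadratic-time sketch. It is only a baseline for illustration, not the data structure the paper proposes; the function names `dominates` and `top_k_dominating` are hypothetical.

```python
import heapq


def dominates(p, q):
    """True iff every coordinate of p is <= that of q and at least one is strictly smaller."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def top_k_dominating(points, k):
    """Return the k (score, point) pairs with the highest dominance score.

    The score of p is the number of points in the dataset that p dominates.
    Runs in O(n^2) time, which is only reasonable for small inputs.
    """
    scores = [(sum(dominates(p, q) for q in points), p) for p in points]
    return heapq.nlargest(k, scores)


# Hypothetical toy dataset of 2-dimensional points.
S = [(1, 1), (2, 2), (3, 1), (1, 3), (3, 3)]
best = top_k_dominating(S, 1)  # (1, 1) dominates the other four points
```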

Journal: CoRR 2013
Gonzalo Navarro Yakov Nekrich

Let D be a collection of D documents, which are strings over an alphabet of size σ, of total length n. We describe a data structure that uses linear space and reports the k most relevant documents that contain a query pattern P, which is a string of length p, in time O(p / log_σ n + k), which is optimal in the RAM model in the general case where lg D = Θ(log n), and involves a novel RAM-optimal suf...

Journal: PVLDB 2010
Minji Wu Laure Berti-Équille Amélie Marian Cecilia M. Procopiuc Divesh Srivastava

We consider the problem of efficiently finding the top-k answers for join queries over web-accessible databases. Classical algorithms for finding top-k answers use branch-and-bound techniques to avoid computing scores of all candidates in identifying the top-k answers. To be able to apply such techniques, it is critical to efficiently compute (lower and upper) bounds and expected scores of cand...

Journal: Inf. Syst. 2013
Ramakrishna Varadarajan Fernando Farfán Vagelis Hristidis

Systems that produce ranked lists of results are abundant. For instance, Web search engines return ranked lists of Web pages. There has been work on distance measures for list permutations, like Kendall tau and Spearman’s Footrule, as well as extensions to handle top-k lists, which are more common in practice. In addition to ranking whole objects (e.g., Web pages), there is an increasing number ...
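For readers unfamiliar with the two distance measures named above, here is a minimal sketch for the simplest case: two rankings that are permutations of the same items. The top-k extensions mentioned in the abstract are not covered, and the function names are hypothetical.

```python
from itertools import combinations


def spearman_footrule(r1, r2):
    """Sum over all items of the absolute difference between their positions in r1 and r2."""
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(abs(i - pos2[item]) for i, item in enumerate(r1))


def kendall_tau_distance(r1, r2):
    """Number of item pairs that appear in opposite relative order in r1 and r2."""
    pos2 = {item: i for i, item in enumerate(r2)}
    # combinations() yields (a, b) with a ranked before b in r1;
    # the pair is discordant when a comes after b in r2.
    return sum(1 for a, b in combinations(r1, 2) if pos2[a] > pos2[b])


# Hypothetical rankings of the same four items.
r1 = ["a", "b", "c", "d"]
r2 = ["b", "a", "d", "c"]
footrule = spearman_footrule(r1, r2)
kendall = kendall_tau_distance(r1, r2)
```

Both functions assume the two lists contain exactly the same items; a missing item raises `KeyError`.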

2010
Henrik Grosskreutz Benedikt Lemmen Stefan Rüping

Supervised descriptive rule discovery techniques like subgroup discovery are quite popular in applications like fraud detection or clinical studies. Compared with other descriptive techniques, like classical support/confidence association rules, subgroup discovery has the advantage that it comes up with only the top-k patterns, and that it makes use of a quality function that avoids patterns un...

2009
Yiping Ke James Cheng Jeffrey Xu Yu

Correlation mining has been widely studied due to its ability to discover the underlying occurrence dependency between objects. However, correlation mining in graph databases is expensive due to the complexity of graph data. In this paper, we study the problem of mining top-k correlative subgraphs in the database, which share similar occurrence distributions with a given query graph. The se...

2011
Henrik Grosskreutz Daniel Paurat

We consider a modified version of the top-k subgroup discovery task, where subgroups dominated by other subgroups are discarded. The advantage of this modified task, known as relevant subgroup discovery, is that it avoids redundancy in the outcome. Although it has been applied in many applications, so far no efficient exact algorithm for this task has been proposed. Most existing solutions do n...

Hedieh Sajedi Rasool Azimi

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K...
