Search results for: greedy clustering method

Number of results: 1716229

2006
Dragi Kocev Jan Struyf Saso Dzeroski

Much research on inductive databases (IDBs) focuses on local models, such as item sets and association rules. In this work, we investigate how IDBs can support global models, such as decision trees. Our focus is on predictive clustering trees (PCTs). PCTs generalize decision trees and can be used for prediction and clustering, two of the most common data mining tasks. Regular PCT induction buil...

2015
Erik M. Lindgren Shanshan Wu Alexandros G. Dimakis

Submodular facility location functions are widely used for summarizing large datasets and have found applications including sensor placement, image retrieval, and clustering. A significant problem is that evaluating such functions typically requires the calculation of pairwise benefits for all items, which is computationally unmanageable for large problems. In this paper we propose a sparsif...

2010
Shyama Das Sumam Mary Idicula Yizong Cheng George M. Church Anupam Chakraborty Hitashyam Maka Gary A Kochenberger Smitha Dharan

Biclustering algorithms perform simultaneous row and column clustering of a given data matrix. In a gene expression dataset, a bicluster is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering is a useful data mining technique for identifying local patterns in gene expression data. In this paper, biclusters are identified in two steps. In the fir...

2006
Dragi Kocev Jan Struyf

We investigate how inductive databases (IDBs) can support global models, such as decision trees. We focus on predictive clustering trees (PCTs). PCTs generalize decision trees and can be used for prediction and clustering, two of the most common data mining tasks. Regular PCT induction builds PCTs top-down, using a greedy algorithm, similar to that of C4.5. We propose a new induction algorithm ...

2014
Duong Vu Szániszló Szöke Christian Wiwie Jan Baumbach Gianluigi Cardinali Richard Röttger Vincent Robert

With the availability of newer and cheaper sequencing methods, genomic data are being generated at an increasingly fast pace. In spite of the high degree of complexity of currently available search routines, the massive number of sequences available virtually prohibits quick and correct identification of large groups of sequences sharing common traits. Hence, there is a need for clustering tool...

Journal: Proceedings of The Royal Society A: Mathematical, Physical and Engineering Sciences 2023

We study the problem of clustering networks whose nodes have imputed or physical positions in a single dimension, such as prestige hierarchies or the similarity dimension of hyperbolic embeddings. Existing algorithms, such as the critical gap method and other greedy strategies, only offer approximate solutions. Here, we introduce a dynamic programming approach that returns provably optimal solutions in polynomial time --...

2000
Terran Lane Carla E. Brodley

We describe the task of user-oriented anomaly detection for computer security. In this domain the goal is to develop a model of a computer user’s normal behavioral patterns and to detect anomalous conditions as deviations from expected behaviors. We present an instance-based learning (IBL) system for profiling users and examine some domain constraints with respect to our approach. In particular...

2014
Syama Sundar Rangapuram Pramod Kaushik Mudrakarta Matthias Hein

Spectral clustering, as a relaxation of the normalized/ratio cut, has become one of the standard graph-based clustering methods. Existing methods for the computation of multiple clusters, corresponding to a balanced k-cut of the graph, are based either on greedy techniques or on heuristics that have a weak connection to the original motivation of minimizing the normalized cut. In this paper we propos...

2012
Glenn Blanchette Richard A. O'Keefe Lubica Benusková

This paper compares the implementations and performance of two computational methods, hierarchical clustering and a genetic algorithm, for inference of phylogenetic trees in the context of the artificial organism Caminalcules. Although these techniques have a superficial similarity, in that they both use agglomeration as their construction method, their origin and approaches are antithetical. F...

Journal: Pattern Recognition 2012
Seyed Salim Tabatabaei Mark Coates Michael G. Rabbat

This paper describes a graph clustering algorithm that aims to minimize the normalized cut criterion and has a model order selection procedure. The performance of the proposed algorithm is comparable to spectral approaches in terms of minimizing normalized cut. However, unlike spectral approaches, the proposed algorithm scales to graphs with millions of nodes and edges. The algorithm consists of...
