Fast Support Vector Data Description Using K-Means Clustering
نویسندگان
چکیده
Support Vector Data Description (SVDD) has a limitation for dealing with a large data set in which computational load drastically increases as training data size becomes large. To handle this problem, we propose a new fast SVDDmethod using K-means clustering method. Our method uses divide-and-conquer strategy; trains each decomposed subproblems to get support vectors and retrains with the support vectors to find a global data description of a whole target class. The proposed method has a similar result to the original SVDD and reduces computational cost. Through experiments, we show efficiency of our method.
منابع مشابه
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
متن کاملFuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition
In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...
متن کاملPosition regularized Support Vector Domain Description
Support Vector Domain Description (SVDD) is an effective method for describing a set of objects. As a basic tool, several application-oriented extensions have been developed, such as support vector clustering (SVC), SVDD-based k-Means (SVDDk-Means) and support vector based algorithm for clustering data streams (SVStream). Despite its significant success, one inherent drawback is that the descri...
متن کاملAn Efficient Predictive Model for Probability of Genetic Diseases Transmission Using a Combined Model
In this article, a new combined approach of a decision tree and clustering is presented to predict the transmission of genetic diseases. In this article, the performance of these algorithms is compared for more accurate prediction of disease transmission under the same condition and based on a series of measures like the positive predictive value, negative predictive value, accuracy, sensitivit...
متن کاملA Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS
Data clustering is the process of partitioning a set of data objects into meaning clusters or groups. Due to the vast usage of clustering algorithms in many fields, a lot of research is still going on to find the best and efficient clustering algorithm. K-means is simple and easy to implement, but it suffers from initialization of cluster center and hence trapped in local optimum. In this paper...
متن کامل