Support vector machines based on K-means clustering for real-time business intelligence systems
Abstract
Support vector machines (SVMs) have been applied to build classifiers that help users make well-informed business decisions. Despite their high generalisation accuracy, the response time of SVM classifiers remains a concern in real-time business intelligence systems such as stock market surveillance and network intrusion detection. This paper speeds up the response of SVM classifiers by reducing the number of support vectors, using the proposed K-means SVM (KMSVM) algorithm. KMSVM combines the K-means clustering technique with SVM and requires one additional input parameter: the number of clusters. The paper gives the criterion and strategy for determining the input parameters of the KMSVM algorithm. Experiments on real-world databases compare KMSVM with SVM, and the results show that KMSVM speeds up classifier response by reducing the number of support vectors while maintaining a testing accuracy similar to that of SVM.
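The idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: K-means is run separately on each class, the cluster centres replace the original points as the SVM training set, and scikit-learn's `KMeans` and `SVC` stand in for the authors' implementation. The helper `compress_by_class`, the synthetic data, and all parameter values are hypothetical. Because a kernel SVM evaluates its decision function over the support vectors, a model with fewer support vectors should respond faster.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def compress_by_class(X, y, n_clusters, seed=0):
    """Replace each class's points with the centres of its K-means clusters.

    Assumption: per-class clustering; the abstract only says that K-means
    is combined with SVM and that the number of clusters is an input.
    """
    centers, labels = [], []
    for label in np.unique(y):
        points = X[y == label]
        k = min(n_clusters, len(points))  # cannot have more clusters than points
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(points)
        centers.append(km.cluster_centers_)
        labels.append(np.full(k, label))
    return np.vstack(centers), np.concatenate(labels)


# Synthetic two-class data (illustrative only, not the paper's databases).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(300, 2)),
               rng.normal(+1.0, 1.0, size=(300, 2))])
y = np.array([0] * 300 + [1] * 300)

n_clusters = 10  # the extra input parameter; here it means clusters per class
X_small, y_small = compress_by_class(X, y, n_clusters)

# Same SVM settings for both models, so only the training set differs.
svm_full = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
svm_small = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_small, y_small)

n_sv_full = int(svm_full.n_support_.sum())
n_sv_small = int(svm_small.n_support_.sum())
acc_full = svm_full.score(X, y)
acc_small = svm_small.score(X, y)  # both models are scored on the full set

print(f"support vectors: full={n_sv_full}, compressed={n_sv_small}")
print(f"accuracy on full set: full={acc_full:.3f}, compressed={acc_small:.3f}")
```

In this sketch the compressed model can never have more support vectors than cluster centres, so the number of clusters directly caps the cost of each prediction. How many clusters are needed to keep accuracy close to the full SVM depends on the data, which is presumably why the paper treats the choice of this parameter as a question of its own.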
Related resources
Support Vector Clustering for Web Usage Mining
This paper applies support vector clustering (SVC) in the domain of web usage mining. In this method, the data points are transformed to a high dimensional space called the feature space, where support vectors are used to define the smallest sphere enclosing the data. A soft-margin constant is used to handle outliers. The paper then performs experiments to compare SVC and the K-Means a...
KBSVM: KMeans-based SVM for Business Intelligence
The goal of business intelligence (BI) is to make decisions based on accurate and succinct information from massive amounts of data. Support vector machine (SVM) has been applied to build the classification model in the field of BI and data mining. To achieve the original goal of BI and speed up the response of real-time systems, the complexity of SVM models should be reduced when it is applied...
Rule Extraction from Support Vector Machines
Support vector machines (SVMs) are learning systems based on statistical learning theory that exhibit good generalization ability on real data sets. Nevertheless, a possible limitation of SVMs is that they generate black-box models. In this work, a procedure for rule extraction from support vector machines is proposed: the SVM+Prototypes method. This method allows to give explanatio...
Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM
Structural repetitive subsequences are the most important portion of biological sequences and play crucial roles in the corresponding sequence’s fold and functionality. The biggest class of repetitive subsequences is “Transposable Elements”, which has its own sub-classes based on the contexts’ structures. Much research has been performed to critically determine the structure and function of repetitiv...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Journal title:
- IJBIDM
Volume 1, Issue
Pages -
Publication date: 2005