A Classification System for Rule Extraction from Support Vector Machines

Author

  • Joachim Diederich

Abstract

In recent years, a number of studies on rule extraction from support vector machines (SVMs) have been published [1-5]. The research strategy in these projects is similar: to explore and develop algorithms for rule extraction based on the perception (or “view”) of the underlying SVM, which is either explicitly or implicitly assumed within the rule extraction technique. In the context of rule extraction from artificial neural networks (ANNs) [6, 7], the notion of “translucency” describes the degree to which the internal representation of the ANN is accessible to the rule extraction technique. More broadly, a taxonomy for rule extraction from neural networks has been introduced [6, 7] which includes five evaluation criteria: translucency, rule quality, expressive power, portability and algorithmic complexity. These evaluation criteria are now commonly used for rule extraction from SVMs.

At this point, it is important to develop new techniques for rule extraction from support vector machines, including those that are based solely on SVMs and do not require any other machine learning technique. In particular, support vector machines that allow the generation of structured outputs [8, 9] can be used to generate rule sets not unlike those extracted from neural networks. This represents a clear advancement, since user explanation is realized by an SVM and not by a technique with a different representational bias. In addition, methods for the extraction of high-quality rule sets from SVMs trained on high-dimensional data are required.

The following briefly describes the first two of the five evaluation criteria for rule extraction from neural networks [6, 7], which are then discussed in the context of rule extraction from SVMs. A new classification schema for rule extraction from SVMs is presented, and an approach is outlined which (1) uses SVMs only, including those generating structured output, and (2) works well for high-dimensional data.
Translucency and rule quality in the context of rule extraction from ANNs

Translucency describes the degree to which the internal representation of the ANN is accessible to the rule extraction technique. At one end of the translucency spectrum are those rule extraction techniques which view the underlying ANN at the maximum level of granularity, i.e. as a set of discrete hidden and output units. Craven and Shavlik [10] categorized such techniques as “decompositional”. The basic strategy of decompositional techniques is to extract rules at the level of each individual hidden and output unit within the trained ANN. In general, decompositional rule extraction techniques incorporate some form of analysis of the weight vector and associated bias (threshold) of each unit in the trained ANN. Then, by treating each unit in the ANN as an isolated entity, decompositional techniques initially generate rules in which the antecedents and consequents are expressed in terms which are local to the unit from which they are derived. A process of aggregation is then required to transform these local rules into a composite rule base for the ANN as a whole [7].

In contrast to the decompositional approaches, the strategy of the pedagogical approaches is to view the trained ANN at the minimum possible level of granularity, i.e. as a single entity or, alternatively, as a “black box”. The focus is on finding rules that map the ANN inputs (i.e. the attribute/value pairs from the problem domain) directly to outputs [7]. In addition to these two main categories, Andrews et al. [6] also proposed a third category, which they labeled “eclectic”, to accommodate those rule extraction techniques which incorporate elements of both the decompositional and pedagogical approaches.

Work on rule extraction from neural networks has also adopted criteria for the quality of the extracted rules.
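The decompositional idea described above, analysing the weight vector and bias of a single unit to obtain local rules, can be illustrated with a minimal sketch. It is not taken from the paper: it assumes binary (0/1) inputs, a hard-threshold unit that fires when the weighted sum plus bias is positive, and a brute-force subset search that is only practical for a handful of inputs. The function name `extract_unit_rules` and the example weights are hypothetical.

```python
from itertools import combinations


def extract_unit_rules(weights, bias, names):
    """Return minimal sets of active inputs that guarantee the unit fires.

    Assumptions (illustrative only): inputs are binary, and the unit fires
    when sum(w_i * x_i) + bias > 0. A subset of inputs forms a rule if the
    unit still fires when all remaining inputs take their worst-case values.
    """
    n = len(weights)
    rules = []
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            # Skip supersets of rules already found, so rules stay minimal.
            if any(set(rule) <= set(subset) for rule in rules):
                continue
            others = [i for i in range(n) if i not in subset]
            # Worst case for inputs not mentioned in the rule: those with
            # negative weights are on, those with positive weights are off.
            worst = sum(min(weights[i], 0.0) for i in others)
            activation = sum(weights[i] for i in subset) + worst + bias
            if activation > 0:
                rules.append(subset)
    return [[names[i] for i in rule] for rule in rules]


# Hypothetical unit with four binary inputs a, b, c, d.
unit_rules = extract_unit_rules([3.0, 2.0, 2.0, -1.0], -2.5, ["a", "b", "c", "d"])
for antecedents in unit_rules:
    print("IF " + " AND ".join(antecedents) + " THEN unit fires")
```

For this hypothetical unit the search yields three local rules (a AND b, a AND c, b AND c). In a full decompositional technique, such local rules for every hidden and output unit would then be aggregated into a rule base for the network as a whole.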
The set of criteria for evaluating rule quality includes [6]: (a) accuracy, (b) fidelity, (c) consistency, and (d) comprehensibility of the extracted rules. A rule set is considered to be accurate if it can correctly classify a set of previously unseen examples from the problem domain [7]. Similarly, a rule set is considered to display a high level of fidelity if it can mimic the behavior of the neural network from which it was extracted by capturing all of the information represented in the ANN. An extracted rule set is deemed to be consistent if, under differing training sessions, the artificial neural network generates rule sets which produce the same classifications of unseen examples. Finally, the comprehensibility of a rule set is determined by measuring the size of the rule set (in terms of the number of rules) and the number of antecedents per rule [7].

Translucency and rule quality applied to rule extraction from SVMs

Most current studies on rule extraction from SVMs focus on decompositional extraction; however, learning-based approaches are also available [4]. The idea is simple: learn what the SVM has learned. For this purpose, a data set is divided into two or more parts. The first part is used to train the SVM to completion. The second part does not include targets: its inputs are presented to the SVM and the corresponding outputs are obtained from the SVM. Inputs and outputs combined form a new data set, which is used for a second machine learning episode with a machine learning system that produces rules as output.
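The learning-based procedure above can be sketched in a few lines. This is an illustration under stated assumptions, not the method of [4]: a small linear SVM trained by subgradient descent on the hinge loss stands in for a real SVM implementation, a single-threshold rule learner stands in for the rule-producing learning system, and the synthetic data and all function names are hypothetical. Fidelity and accuracy are computed as defined above, on a third, held-out part of the data.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Stand-in SVM: subgradient descent on the regularised hinge loss."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            w = [(1.0 - eta * lam) * wj for wj in w]
            if y[i] * score < 1.0:  # example violates the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def svm_predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def learn_rule(X, labels):
    """Stand-in rule learner: best rule 'IF x[f] > thr THEN above ELSE -above'."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for above in (1, -1):
                hits = sum((above if x[f] > thr else -above) == lab
                           for x, lab in zip(X, labels))
                if best is None or hits > best[0]:
                    best = (hits, f, thr, above)
    return best[1:]


def apply_rule(rule, x):
    f, thr, above = rule
    return above if x[f] > thr else -above


def agreement(preds, refs):
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)


# Hypothetical data: two features, class determined by the first feature.
rng = random.Random(42)
points = [[rng.random(), rng.random()] for _ in range(300)]
targets = [1 if p[0] > 0.5 else -1 for p in points]

train_X, train_y = points[:100], targets[:100]   # part 1: trains the SVM
query_X = points[100:200]                        # part 2: targets are not used
test_X, test_y = points[200:], targets[200:]     # part 3: held out for evaluation

svm = train_linear_svm(train_X, train_y)
svm_labels = [svm_predict(svm, x) for x in query_X]  # the SVM supplies the outputs
rule = learn_rule(query_X, svm_labels)               # second learning episode

rule_preds = [apply_rule(rule, x) for x in test_X]
fidelity = agreement(rule_preds, [svm_predict(svm, x) for x in test_X])
accuracy = agreement(rule_preds, test_y)

f, thr, above = rule
print(f"IF x{f} > {thr:.3f} THEN class {above} ELSE class {-above}")
print(f"fidelity={fidelity:.2f} accuracy={accuracy:.2f}")
```

Because the rule learner only ever sees the SVM's outputs, the approach is pedagogical in the sense described earlier: the SVM is treated as a black box, and fidelity measures how closely the extracted rule mimics it, independently of how accurate either model is on the true targets.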

Similar resources

A QUADRATIC MARGIN-BASED MODEL FOR WEIGHTING FUZZY CLASSIFICATION RULES INSPIRED BY SUPPORT VECTOR MACHINES

Recently, tuning the weights of the rules in fuzzy rule-based classification systems has been studied in order to improve classification accuracy. In this paper, a margin-based optimization model, inspired by support vector machine classifiers, is proposed to compute these fuzzy rule weights. This approach not only considers both accuracy and generalization criteria in a single objective fu...

Remote Sensing and Land Use Extraction for Kernel Functions Analysis by Support Vector Machines with ASTER Multispectral Imagery

Land use is considered an element in land change studies, environmental planning and natural resource applications. Studying the Earth’s surface by remote sensing has many benefits, such as continuous acquisition of data, broad regional coverage, cost-effective data, map-accurate data, and large archives of historical data. To study land use / cover, remote sensing as an effic...

Learning-based Rule-Extraction from Support Vector Machines

In recent years, support vector machines (SVMs) have shown good performance in a number of application areas, including text classification. However, the success of SVMs comes at a cost – an inability to explain the process by which a learning result was reached and why a decision is being made. Rule-extraction from SVMs is important for the acceptance of this machine learning technology, espec...

Face Recognition using Eigenfaces, PCA and Support Vector Machines

This paper is based on a combination of principal component analysis (PCA), eigenfaces and support vector machines. Using the N-fold method, and depending on the value of N, each person’s face images are divided into two sections. As a result, vectors of training features and test features are obtained. Classification precision and accuracy were examined with three different types of kernel and...

Fault diagnosis in a distillation column using a support vector machine based classifier

Fault diagnosis has always been an essential aspect of control system design. It is necessary because of the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...

Fuzzy Apriori Rule Extraction Using Multi-Objective Particle Swarm Optimization: The Case of Credit Scoring

Many methods have been introduced to solve the credit scoring problem, such as support vector machines, neural networks and rule-based classifiers. Rule bases are favoured in credit decision making because of their ability to explicitly distinguish between good and bad applicants. In this paper, multi-objective particle swarm optimization is applied to optimize a fuzzy apriori rule base in credit scoring. ...

Publication date: 2006