Prototype selection for dissimilarity-based classifiers
Abstract
A conventional way to discriminate between objects represented by dissimilarities is the nearest neighbor method. A more efficient and sometimes more accurate solution is offered by other dissimilarity-based classifiers. They construct a decision rule based on the entire training set, but they need just a small set of prototypes, the so-called representation set, as a reference for classifying new objects. Such alternative approaches may be especially advantageous for non-Euclidean or even non-metric dissimilarities. The choice of a proper representation set for dissimilarity-based classifiers has not yet been fully investigated. It appears that a random selection may work well. In this paper, a number of experiments have been conducted on various metric and non-metric dissimilarity representations and prototype selection methods. Several procedures, such as traditional feature selection methods (here effectively searching for prototypes), mode seeking and linear programming, are compared to random selection. In general, we find that systematic approaches lead to better results than random selection, especially for a small number of prototypes. Although there is no single winner, as the outcome depends on data characteristics, the k-centres method works well in general. For two-class problems, an important observation is that our dissimilarity-based discrimination functions relying on significantly reduced prototype sets (3–10% of the training objects) offer a similar or much better classification accuracy than the best k-NN rule on the entire training set. This may be reached for multi-class data as well; however, such problems are more difficult. © 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
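The idea of classifying against a small representation set can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the greedy farthest-first heuristic (a simple stand-in for a k-centres style criterion, applied here to all objects at once rather than per class) and the least-squares linear rule are all assumptions made for illustration.

```python
import numpy as np

def farthest_first_prototypes(D, k, start=0):
    """Greedy k-centre heuristic on a square dissimilarity matrix D.

    Hypothetical helper: repeatedly adds the object that is farthest
    from the prototypes chosen so far.
    """
    chosen = [start]
    nearest = D[start].copy()          # dissimilarity to closest prototype
    while len(chosen) < k:
        nxt = int(np.argmax(nearest))  # worst-covered object
        chosen.append(nxt)
        nearest = np.minimum(nearest, D[nxt])
    return chosen

def fit_linear(D_to_R, y):
    """Least-squares linear rule on dissimilarities to the prototypes (assumed classifier)."""
    X = np.hstack([D_to_R, np.ones((D_to_R.shape[0], 1))])  # add bias column
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict(w, D_to_R):
    X = np.hstack([D_to_R, np.ones((D_to_R.shape[0], 1))])
    return np.where(X @ w >= 0, 1, -1)

# Toy two-class data on a line; the dissimilarity is the absolute difference.
x = np.array([0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
D = np.abs(x[:, None] - x[None, :])

prototypes = farthest_first_prototypes(D, k=2)

# The rule is trained on ALL training objects, but each object is
# described only by its dissimilarities to the selected prototypes.
w = fit_linear(D[:, prototypes], y)
pred_train = predict(w, D[:, prototypes])

# A new object needs only k dissimilarities, not one per training object.
x_new = np.array([0.05, 5.25])
D_new = np.abs(x_new[:, None] - x[prototypes][None, :])
pred_new = predict(w, D_new)
```

The point of the sketch is the cost profile the abstract describes: training uses every object, while classifying a new object requires only its dissimilarities to the few prototypes, unlike a k-NN rule over the whole training set.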
Similar resources
On using prototype reduction schemes to optimize dissimilarity-based classification
The aim of this paper is to present a strategy by which a new philosophy for pattern classification, namely that pertaining to dissimilarity-based classifiers (DBCs), can be efficiently implemented. This methodology, proposed by Duin and his co-authors (see Refs. [Experiments with a featureless approach to pattern recognition, Pattern Recognition Lett. 18 (1997) 1159–1166; Relational discriminan...
OPTIMIZED DICTIONARY DESIGN AND CLASSIFICATION USING THE MATCHING PURSUITS DISSIMILARITY MEASURE By RAAZIA MAZHAR A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy OPTIMIZED DICTIONARY DESIGN AND CLASSIFICATION USING THE MATCHING PURSUITS DISSIMILARITY MEASURE By Raazia Mazhar May 2009 Chair: Paul D. Gader Co-chair: Joseph N. Wilson Major: Computer Engineering Discrimination-based classifiers diffe...
A Conformal Classifier for Dissimilarity Data
Current classification algorithms focus on vectorial data, given in Euclidean or kernel spaces. Many real-world data, such as biological sequences, are not vectorial and are often non-Euclidean, given by (dis-)similarities only, calling for efficient and interpretable models. Current classifiers for such data require complex transformations and provide only crisp classification without any measure o...
A new metric for dissimilarity data classification based on Support Vector Machines optimization
Dissimilarities are extremely useful in many real-world pattern classification problems, where the data resides in a complicated, complex space, and it can be very difficult, if not impossible, to find useful feature vector representations. In these cases a dissimilarity representation may be easier to come by. The goal of this work is to provide a new technique based on Support Vector Machines...
Experimental study on prototype optimisation algorithms for prototype-based classification in vector spaces
Prototype-based classification relies on the distances between the examples to be classified and carefully chosen prototypes. A small set of prototypes is of interest to keep the computational complexity low, while maintaining high classification accuracy. An experimental study of some old and new prototype optimisation techniques is presented, in which the prototypes are either selected or gen...
Journal title:
- Pattern Recognition
Volume 39, issue
Pages -
Publication date 2006