Classification on Pairwise Proximity Data
Abstract
We investigate the problem of learning a classification task on data represented in terms of their pairwise proximities. This representation does not refer to an explicit feature representation of the data items and is thus more general than the standard approach of using Euclidean feature vectors, from which pairwise proximities can always be calculated. Our first approach is based on a combined linear embedding and classification procedure resulting in an extension of the Optimal Hyperplane algorithm to pseudo-Euclidean data. As an alternative we present another approach based on a linear threshold model in the proximity values themselves, which is optimized using Structural Risk Minimization. We show that prior knowledge about the problem can be incorporated by the choice of distance measures and examine different metrics with respect to their generalization. Finally, the algorithms are successfully applied to protein structure data and to data from the cat’s cerebral cortex. They show better performance than K-nearest-neighbor classification.
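To make the second approach concrete, the following is a minimal sketch of a linear threshold model that takes an item's proximities to the training items as its inputs. It is an illustration under stated assumptions, not the procedure from the paper: the data are synthetic, the proximity is plain Euclidean distance, ridge-regularized least squares stands in for Structural Risk Minimization, and the function names are hypothetical.

```python
import numpy as np

def pairwise_distances(A, B):
    """Euclidean distances between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def fit_proximity_classifier(D_train, y, ridge=1.0):
    """Fit w, b so that sign(D @ w + b) approximates the labels y.

    D_train[i, j] is the proximity between training items i and j.
    y holds labels in {-1, +1}.
    Assumption: ridge least squares replaces the SRM-based training
    described in the abstract.
    """
    n = D_train.shape[0]
    X = np.hstack([D_train, np.ones((n, 1))])  # last column carries the bias
    A = X.T @ X + ridge * np.eye(n + 1)
    coef = np.linalg.solve(A, X.T @ y)
    return coef[:-1], coef[-1]

def predict(D_new, w, b):
    """Threshold the linear function of the proximity values."""
    return np.where(D_new @ w + b >= 0, 1, -1)

# Synthetic two-class data, used only to produce a proximity matrix.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=2.0, scale=0.5, size=(20, 2))
X_neg = rng.normal(loc=-2.0, scale=0.5, size=(20, 2))
X_train = np.vstack([X_pos, X_neg])
y_train = np.concatenate([np.ones(20), -np.ones(20)])

# From here on the classifier only ever sees proximities.
D_train = pairwise_distances(X_train, X_train)
w, b = fit_proximity_classifier(D_train, y_train)

X_test = np.array([[2.0, 2.0], [-2.0, -2.0]])
D_test = pairwise_distances(X_test, X_train)
pred = predict(D_test, w, b)
```

The classifier never touches the coordinates directly, so `pairwise_distances` could be swapped for any other proximity measure, which is where a choice of distance reflecting prior knowledge would enter.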
Related articles
Application of grey GIS filtration to identify the potential area for cement plants in South Khorasan Province, Eastern Iran
Cement-based materials are fundamental resources used in construction. The increase in demand for and consumption of cement products, especially in Iran, indicates that more cement plants should be established. This study developed a geographical information system using pairwise comparison based on grey numbers to identify potential sites in which to set up cement plants. A group of five exp...
A new classification method based on pairwise SVM for facial age estimation
This paper presents a practical algorithm for facial age estimation from a frontal face image. Facial age estimation generally comprises two key steps: age image representation and age estimation. The anthropometric model used in this study includes computation of eighteen craniofacial ratios and a new accurate skin wrinkles analysis in the first step and a pairwise binary support vector...
Classification on Proximity Data with LP–Machines
We provide a new linear program to deal with classification of data in the case of functions written in terms of pairwise proximities. This avoids the problems inherent in using feature spaces with indefinite metric in Support Vector Machines, since the notion of a margin is needed only in input space, where the classification actually occurs. Moreover in our approach we can enforce s...
The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution
This paper describes a series of experiments in using logistic regression machine learning as a method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as a linked or not-linked pair, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...
Semi-Supervised Vector Quantization for proximity data
Semi-supervised learning (SSL) is focused on learning from labeled and unlabeled data by incorporating structural and statistical information of the available unlabeled data. The amount of data is dramatically increasing, but little of it is fully labeled, due to cost and time constraints. This is even more challenging for non-vectorial, proximity data, given by pairwise proximity values. Only ...