Finding and evaluating sets of nearest neighbours
Abstract
In this paper, we consider two applications of distributional similarity measures: probability estimation and prediction of semantic similarity. We investigate whether high performance in one application area is correlated with high performance in the other. This work also provides an evaluation of two state-of-the-art distributional similarity measures and introduces a variant of one. Further, we overcome statistical biases in the standard pseudo-disambiguation task and look at the effect of word and co-occurrence frequency on the performance of the measures.
Similar resources
Pseudo-Likelihood Inference Underestimates Model Uncertainty: Evidence from Bayesian Nearest Neighbours
When using the K-nearest neighbours (KNN) method, one often ignores the uncertainty in the choice of K. To account for such uncertainty, Bayesian KNN (BKNN) has been proposed and studied (Holmes and Adams 2002; Cucala et al. 2009). We present some evidence to show that the pseudo-likelihood approach for BKNN, even after being corrected by Cucala et al. (2009), still significantly underest...
Incorporating Farthest Neighbours in Instance Space Classification
The nearest neighbour (NN) classifier is often known as a ‘lazy’ approach, but it is still widely used, particularly in systems that require pattern matching. Many algorithms based on NN have been developed in an attempt to improve classification accuracy and to reduce the time taken, especially on large data sets. This paper proposes a new classification technique based on kNearest Neighbour...
SNN: A Supervised Clustering Algorithm
In this paper, we present a new algorithm, based on the nearest neighbours method, for discovering groups and identifying interesting distributions in the underlying data of labelled databases. We introduce the theory of nearest neighbours sets as the basis for the S-NN (Similar Nearest Neighbours) algorithm. Traditional clustering algorithms are very sensitive to the user-defined parameter...
A K-nearest neighbours method based on imprecise probabilities
K-nearest neighbours algorithms are among the most popular existing classification methods, owing to their simplicity and good performance. Over the years, several extensions of the initial method have been proposed. In this paper, we propose a K-nearest neighbours approach that uses the theory of imprecise probabilities, and more specifically lower previsions. We show that the proposed ap...
Concave hull: A k-nearest neighbours approach for the computation of the region occupied by a set of points
This paper describes an algorithm to compute the envelope of a set of points in a plane, which generates convex or non-convex hulls that represent the area occupied by the given points. The proposed algorithm is based on a k-nearest neighbours approach, where the value of k, the only algorithm parameter, is used to control the “smoothness” of the final solution. The obtained results show that t...