A New Self-Organizing Map for Dissimilarity Data
Abstract
The Self-Organizing Map (Kohonen, 1997) is an effective and very popular tool for data clustering and visualization. It projects the input samples into a low-dimensional space while preserving their topology. The samples are described by a set of features, so the input space is generally a high-dimensional space R^d, and 2D or 3D maps are most often used for visualization. In many applications, typically in psychology, biology, genetics, and image and signal processing, no such vector description is available; only pairwise dissimilarity data are provided. For instance, text mining and DNA exploration are important applications in this field, where observations are usually described through their proximities, expressed by the Levenshtein (or string edit) distance (Levenshtein, 1966). A first approach is to transform the dissimilarity matrix into a true Euclidean distance matrix. A straightforward strategy is to use multidimensional scaling techniques (Borg & Groenen, 1997) to provide a feature space, in which the original vector SOM algorithm can then be applied directly. If this transformation involves large distortions, however, the vector model for SOM is no longer valid, and the analysis of dissimilarity data requires specific techniques (Jain & Dubes, 1988; Van Cutsem, 1994); the Dissimilarity Self-Organizing Map (DSOM) is a recent one. Consequently, adapting the Self-Organizing Map (SOM) to dissimilarity data is of growing interest. Over the last decade, several proposals have emerged to extend the vector SOM model to pairwise dissimilarity data, the main motivation being to cope with large proximity databases for data mining. In this article, we present a new adaptation of the SOM algorithm and compare it with two existing ones.
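The first approach mentioned in the abstract (a Levenshtein dissimilarity matrix followed by multidimensional scaling) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `levenshtein`, `dissimilarity_matrix`, and `classical_mds` are hypothetical, and classical (Torgerson) MDS is assumed here as one possible choice among the multidimensional scaling techniques the abstract refers to.

```python
import numpy as np


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def dissimilarity_matrix(items):
    """Symmetric matrix of pairwise Levenshtein distances."""
    n = len(items)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = levenshtein(items[i], items[j])
    return D


def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS, assumed here as the MDS variant.

    Returns the embedded coordinates and all eigenvalues of the
    double-centered matrix, sorted in decreasing order.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    top = np.clip(eigvals[:dim], 0.0, None)  # discard negative components
    coords = eigvecs[:, :dim] * np.sqrt(top)
    return coords, eigvals


if __name__ == "__main__":
    words = ["kitten", "sitting", "mitten", "fitting"]  # illustrative data only
    D = dissimilarity_matrix(words)
    coords, eigvals = classical_mds(D, dim=2)
    print(coords)
    print(eigvals)
```

The resulting `coords` could then be fed to an ordinary vector SOM. Negative eigenvalues, if present, indicate that the dissimilarities are not exactly Euclidean, which is the kind of distortion that motivates working on the dissimilarity matrix directly.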
Similar resources
Landforms identification using neural network-self organizing map and SRTM data
During an 11-day mission in February 2000, the Shuttle Radar Topography Mission (SRTM) collected data over 80% of the Earth's land surface, for all areas between 60 degrees N and 56 degrees S latitude. Since SRTM data became available, many studies have used them for topographic and morphometric landscape analysis. Exploiting SRTM data for recognition and extraction of topographic ...
Classification of Streaming Fuzzy DEA Using Self-Organizing Map
The classification of fuzzy data is considered one of the most challenging areas of data analysis, and the complexity of the procedures has been an obstacle to the development of new methods for fuzzy data analysis. However, there have been significant advances in modeling systems in which fuzzy data are available in the field of mathematical programming. In order to exploit the results of the research on ...
A Modified Self-organizing Map Neural Network to Recognize Multi-font Printed Persian Numerals (RESEARCH NOTE)
This paper proposes a new method to distinguish printed digits, regardless of font and size, using neural networks. Unlike our proposed method, existing neural-network-based techniques are only able to recognize the trained fonts. These methods need a large database containing digits in various fonts. New fonts are often introduced to the public, which may not be truly recognized by the Opti...
Fast algorithm and implementation of dissimilarity self-organizing maps
In many real-world applications, data cannot be accurately represented by vectors. In those situations, one possible solution is to rely on dissimilarity measures that enable a sensible comparison between observations. Kohonen's self-organizing map (SOM) has been adapted to data described only through their dissimilarity matrix. This algorithm provides both nonlinear projection and clustering o...
Speeding Up the Dissimilarity Self-Organizing Maps by Branch and Bound
This paper proposes to apply the branch and bound principle from combinatorial optimization to the Dissimilarity Self-Organizing Map (DSOM), a variant of the SOM that can handle dissimilarity data. A new reference model optimization method is derived from this principle. Its results are strictly identical to those of the original DSOM algorithm by Kohonen and Somervuo, while its running time is...