Creating an Order in Distributed Digital Libraries by Integrating Independent Self-Organizing Maps
Authors
Abstract
Digital document libraries are an almost perfect application arena for unsupervised neural networks. This is because many of the operations computers have to perform on text documents are classification tasks based on "noisy" input patterns. The "noise" arises because of the known inaccuracy of mapping natural language to an indexing vocabulary representing the contents of the documents. A growing number of papers are dedicated to the use of self-organizing maps to organize the contents of such digital libraries. These papers assume the central availability of the data, an assumption that is questionable given the massive amount of available information. In this paper we describe an approach for organizing distributed digital libraries based on a system of independent self-organizing maps, each of which represents just a portion of the complete digital library. Furthermore, we argue in favor of integrating these independent maps in a hierarchical fashion, again by means of self-organizing maps. The integration is based on the trained low-level maps.
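The two-level arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that every library partition is already encoded as numeric document vectors over a shared indexing vocabulary, and that the higher-level map is trained on the weight vectors of the trained low-level maps, which is one plausible reading of "based on the trained low-level maps". The function names (`train_som`, `best_matching_unit`, `build_hierarchy`), the map sizes, and the learning schedule are all hypothetical choices made for illustration.

```python
import numpy as np

def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    n, dim = data.shape
    # Initialize unit weights from randomly chosen input vectors.
    weights = data[rng.integers(0, n, size=rows * cols)].astype(float)
    # Grid coordinates of each unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(n):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood radius
            x = data[idx]
            winner = np.argmin(np.sum((weights - x) ** 2, axis=1))
            dist2 = np.sum((grid - grid[winner]) ** 2, axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return int(np.argmin(np.sum((weights - x) ** 2, axis=1)))

def build_hierarchy(partitions, low_shape=(3, 3), top_shape=(2, 2)):
    """Train one independent SOM per partition, then a top-level SOM.

    Assumption: the top-level map takes the low-level weight vectors as its
    training input, so raw documents never leave their partition.
    """
    low_maps = [train_som(p, *low_shape, seed=i) for i, p in enumerate(partitions)]
    top_input = np.vstack(low_maps)
    top_map = train_som(top_input, *top_shape, seed=len(partitions))
    return low_maps, top_map

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Three hypothetical library partitions, 40 documents each, 6 index terms.
    partitions = [rng.random((40, 6)) for _ in range(3)]
    low_maps, top_map = build_hierarchy(partitions)
    query = partitions[0][0]
    print("top-level unit for query:", best_matching_unit(top_map, query))
```

Under these assumptions only the trained weight vectors travel between sites, so the cost of integration depends on the number of low-level map units rather than on the number of documents.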
Similar resources
Organization of Distributed Digital Libraries: A Neural Network-Based Approach
Self-organizing maps are a popular neural network model for mapping high-dimensional input data onto a lower-dimensional output space. However, as the size of the training data increases, both the necessary computational power as well as the training time required exceed tolerable limits. Still more important, not all training data may be available in one central location but may rather be coll...
SOMLib: A Distributed Digital Library System based on Self-Organizing Maps
We describe an architecture for a distributed digital library system based on an unsupervised neural network model, namely the Self-Organizing Map. The system allows the clustering of text documents forming the basis for intelligent information retrieval. User profiles can be combined with full text queries or sample texts to locate documents within the library system. Contrary to conventional a...
MinervaDL: An Architecture for Information Retrieval and Filtering in Distributed Digital Libraries
We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of MinervaDL is based on the peer-to-peer search engine Minerva, and is able to handle huge amounts of data provided by digital libraries in a distributed and self-organizing way. The two-tier architecture and the us...
A Scalable Self-organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation
The rapid proliferation of textual and multimedia online databases, digital libraries, Internet servers, and intranet services has turned researchers' and practitioners' dream of creating an information-rich society into a nightmare of info-gluts. Many researchers believe that turning an info-glut into a useful digital library requires automated techniques for organizing and categorizing large-...
Text Data Mining
Classification is one of the central issues in any system dealing with text data. The need for effective approaches has increased dramatically due to the advent of massive digital libraries containing free-form documents. What we are looking for are powerful methods for the exploration of such libraries whereby the discovery of similarities between groups of text documents is the overall ...