A Framework To Automatically Categorize The Unstructured Text Documents
Authors
Abstract
Similar resources
Mining criminal networks from unstructured text documents
Digital data collected for forensics analysis often contain valuable information about the suspects’ social networks. However, most collected records are in the form of unstructured textual data, such as e-mails, chat messages, and text documents. An investigator often has to manually extract the useful information from the text and then enter the important pieces into a structured database for...
Automated ontology construction for unstructured text documents
Ontology is playing an increasingly important role in knowledge management and the Semantic Web. This study presents a novel episode-based ontology construction mechanism to extract domain ontology from unstructured text documents. Additionally, fuzzy numbers for conceptual similarity computing are presented for concept clustering and taxonomic relation definitions. Moreover, concept attributes...
AutoMarkup: A Tool for Automatically Marking up Text Documents
In this paper we present a novel system that can automatically mark up text documents into XML. The system uses the Self-Organizing Map (SOM) algorithm to organize marked documents on a map so that similar documents are placed on nearby locations. Then by using the inductive learning algorithm C5, it automatically generates and applies the markup rules from the nearest SOM neighbours of an unma...
A Search/Crawl Framework for Automatically Acquiring Scientific Documents
Despite the advancements in search engine features, ranking methods, technologies, and the availability of programmable APIs, current-day open-access digital libraries still rely on crawl-based approaches for acquiring their underlying document collections. In this paper, we propose a novel search-driven framework for acquiring documents for scientific portals. Within our framework, publicly-ava...
Journal
Journal title: Indian Journal of Science and Technology
Year: 2017
ISSN: 0974-6846, 0974-5645
DOI: 10.17485/ijst/2017/v10i8/10947