Similar resources
Concept-based Text Clustering
Thematic organization of text is a natural practice of humans and a crucial task for today’s vast repositories. Clustering automates this by assessing the similarity between texts and organizing them accordingly, grouping like ones together and separating those with different topics. Clusters provide a comprehensive logical structure that facilitates exploration, search and interpretation of cu...
Effective Concept-Based Mining Model For Text Clustering
The common techniques in text mining are based on the statistical analysis of a term, either word or phrase. Statistical analysis of a term frequency captures the importance of the term within a document only. Two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Usually in text mining techniques the basic me...
Concept Chain Based Text Clustering
Different from familiar clustering objects, text documents have sparse data spaces. A common way of representing a document is as a bag of its component words, but the semantic relations between words are ignored. In this paper, we propose a novel document representation approach to strengthen the discriminative feature of document objects. We replace terms of documents with concepts in WordNet...
Ontology based Text Mining of Concept Definitions in Biomedical Literature
Many developers of biomedical knowledge bases typically validate and update formalized knowledge based on reviews of full-text scientific articles, but finding text relevant to domain concepts can be tedious and prone to errors. Prior methods have automated this process by matching term-based patterns within a single sentence. In our work developing a knowledge base of autism phenotypes, specif...
Clustering Concept Hierarchies from Text
We present a novel approach to learning taxonomies or concept hierarchies from text. The approach is based on Formal Concept Analysis, a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. Our approach is based on the distributional hypothesis, i.e. that nouns or terms are similar to the extent to which they share contexts. F...
Journal
Journal title: International Journal of Computer and Communication Technology
Year: 2016
ISSN: 2231-0371,0975-7449
DOI: 10.47893/ijcct.2016.1331