نتایج جستجو برای: text documents
تعداد نتایج: 222232 فیلتر نتایج به سال:
Hypertext is a generalization of the conventional linear text into a non-linear text formed by adding cross-reference and structural links between different pieces of text. A hypertext can be regarded as an extension of a textual database by adding a link structure among the different text objects it stores. We present a tool for finding information in a distributed hypertext such as the World-...
which determines similarities between text documents. These text documents are indexed with keywords and further background knowledge-terms from an ontology.The representation of the documents and the evaluation of the algorithm are used to let an ontology learn. This is shown to be one way of improving the results of the algorithm by improving the background knowledge.
ÐWe propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely, the Vertical Traverse Density (VTD) and Horizontal Traverse Density (HTD), are extracted. An n-gram based document vector is constructed for each document based on these features. Text similarity between documents is then measured by calcul...
Clustering of text documents is an important technique for documents retrieval. It aims to organize documents into meaningful groups or clusters. Preprocessing text plays a main role in enhancing clustering process of Arabic documents. This research examines and compares text preprocessing techniques in Arabic document clustering. It also studies effectiveness of text preprocessing techniques: ...
Text clustering methods can be used to group large sets of text documents. Most of the text clustering methods do not address the problems of text clustering such as very high dimensionality of the data and understandability of the clustering descriptions. In this paper, a frequent term based approach of clustering has been introduced; it provides a natural way of reducing a large dimensionalit...
We introduce the unsupervised text annotation model UTA, which iteratively populates a document-specific database containing the related symbolic content description. The model identifies the most related documents using the text of documents and the symbolic content description. UTA extends the database of one document with data from related documents without ignoring the precision.
Semantic Text Mining and its Application in Biomedical Domain Illhoi Yoo Xiaohua Hu, Ph.D A huge amount of biomedical knowledge and novel discoveries have been produced and collected in text databases or digital libraries, such as MEDLINE, because the most natural form to store information is text. In order to cope with this pressing text information overload, text mining is employed. However, ...
Clustering text documents into different category groups is an important step in indexing, retrieval, management and mining of abundant text data on the Web or in corporate information systems. Text clustering task can be intuitively described as finding, given a set vectors of some data points in a multi-dimensional space, a partition of text data into clusters such that the points within each...
Document retrieval is the problem of finding stored documents that contain useful information. There exist a set of documents on a range of topics, written by different authors, at different times, and at varying levels of depth, detail, clarity, and precision, and a set of individuals who, at different times and for different reasons, search for recorded information that may be contained in so...
Japanese documents often contain both horizontally and vertically printed text lines in the same page. It has been required for document analysis systems to detect correct orientation of text lines and to select text line candidates of correct orientation. We designed an efficient framework for the procedure and developed some algorithms which reduce text line candidates of incorrect orientatio...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید