Search results for: document weight
Number of results: 499854
A key problem in document classification and clustering is learning the similarity between documents. Traditional approaches include estimating similarity between feature vectors of documents where the vectors are computed using TF-IDF in the bag-of-words model. However, these approaches do not work well when either similar documents do not use the same vocabulary or the feature vectors are not...
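The vocabulary-mismatch weakness described above can be illustrated with a minimal sketch. It assumes whitespace tokenization, length-normalized term frequency, and an unsmoothed `log(N/df)` IDF; the function names `tfidf_vectors` and `cosine` and the toy documents are hypothetical and not taken from the abstract.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: relative term frequency times log(N / df).
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy corpus (assumption): documents 0 and 1 mean the same thing but share
# only terms that occur in every document, which get an IDF of zero.
docs = [
    "the car is fast".split(),
    "the automobile is quick".split(),
    "the car is red".split(),
]
vecs = tfidf_vectors(docs)
sim_synonyms = cosine(vecs[0], vecs[1])  # different vocabulary
sim_shared = cosine(vecs[0], vecs[2])    # shares the term "car"
```

In this toy corpus the two paraphrased documents score zero while the pair sharing one content word scores above zero, which is the behavior the abstract criticizes.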
Traditional document clustering techniques are mostly based on the number of occurrences and the existence of keywords. Term-frequency-based clustering techniques treat documents as bags of words and ignore the relationships between the words. Similarly, phrase-based clustering techniques only capture the order in which the words appear in a sentence instead of determining the semantic...
Traditional index weighting approaches for information retrieval from texts depend on the term frequency based analysis of the text contents. A shortcoming of these indexing schemes, which consider only the occurrences of the terms in a document, is that they have some limitations in extracting semantically exact indexes that represent the semantic content of a document. To address this issue, ...
Inverted files are widely used to index documents in large-scale information retrieval systems. An inverted file consists of posting lists, which can be stored in either a document-identifier ascending order or a document-weight descending order. For an identifier-ascending-order posting list, retrieving ranked documents necessitates traversal of all postings, whereas for the weight-descending-o...
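The two posting-list orderings can be sketched for a single-term query as follows. This is only an illustration under assumptions: the `(doc_id, weight)` tuples, the sample values, and the function names are hypothetical, and the observation that a weight-sorted list can be cut off after `k` postings is a general property rather than a claim recovered from the truncated abstract.

```python
import heapq

# Hypothetical postings for one term: (document identifier, document weight).
postings = [(7, 0.5), (1, 0.2), (9, 0.7), (4, 0.9)]

# The two storage orders named in the abstract.
by_id = sorted(postings, key=lambda p: p[0])                    # identifier ascending
by_weight = sorted(postings, key=lambda p: p[1], reverse=True)  # weight descending

def top_k_id_order(plist, k):
    """Ranked retrieval from an identifier-ordered list: every posting is examined."""
    return heapq.nlargest(k, plist, key=lambda p: p[1])

def top_k_weight_order(plist, k):
    """Ranked retrieval from a weight-ordered list: the first k postings suffice."""
    return plist[:k]

top_from_ids = top_k_id_order(by_id, 2)
top_from_weights = top_k_weight_order(by_weight, 2)
```

Both calls return the same two highest-weight postings; the difference is how much of the list has to be read to get them.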
We introduce a lightweight interlingua for a cross-language document retrieval system in the medical domain. It is composed of equivalence classes of semantically primitive, language-specific subwords which are clustered by interlingual and intralingual synonymy. Each subword cluster represents a basic conceptual entity of the language-independent interlingua. Documents, as well as queries, are...
Persian abstract: Drugs with a short half-life require several doses per day, which causes problems such as missed doses, unstable blood levels, and so on, so the need to develop modified release dosage forms is clear. Among the various routes of drug delivery to the body, the oral route still holds an important place in pharmacy because of its particular advantages. A variety of dosage forms also...
Integrate Document Ranking Information into Confidence Measure Calculation for Spoken Term Detection
This paper proposes an algorithm to improve the calculation of the confidence measure for spoken term detection (STD). Given an input query term, the algorithm first calculates a measurement named the document ranking weight for each document in the speech database, which reflects the document's relevance to the query term by summing the confidence measures of all hypothesized term occurrences in that document. ...
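The first step described above, summing the confidence measures of the hypothesized occurrences per document, can be sketched as follows. The input format (pairs of document identifier and confidence), the function name `document_ranking_weights`, and the sample values are assumptions for illustration; how the paper then uses the weight is not shown in the truncated abstract and is not sketched here.

```python
from collections import defaultdict

def document_ranking_weights(hypotheses):
    """Sum the confidence measures of all hypothesized term occurrences per document.

    `hypotheses` is an assumed input format: an iterable of
    (doc_id, confidence) pairs for a single query term.
    """
    weights = defaultdict(float)
    for doc_id, confidence in hypotheses:
        weights[doc_id] += confidence
    return dict(weights)

# Hypothetical detections of one query term across two documents.
hyps = [("doc_a", 0.5), ("doc_b", 0.25), ("doc_a", 0.25)]
weights = document_ranking_weights(hyps)
```

With these sample detections, `doc_a` accumulates 0.75 and `doc_b` 0.25.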