Search results for: document weight

Number of results: 499854

2011
Pradeep Muthukrishnan Dragomir R. Radev Qiaozhu Mei

A key problem in document classification and clustering is learning the similarity between documents. Traditional approaches include estimating similarity between feature vectors of documents where the vectors are computed using TF-IDF in the bag-of-words model. However, these approaches do not work well when either similar documents do not use the same vocabulary or the feature vectors are not...

2014
Sapna Gupta

Traditional document clustering techniques are mostly based on the number of occurrences and the existence of keywords. Term-frequency-based clustering techniques treat documents as bags of words while ignoring the relationships between the words. Similarly, phrase-based clustering techniques capture only the order in which the words appear in a sentence instead of determining the semantic...

Journal: Inf. Process. Manage. 2005
Bo-Yeong Kang Sang-Jo Lee

Traditional index weighting approaches for information retrieval from texts depend on the term frequency based analysis of the text contents. A shortcoming of these indexing schemes, which consider only the occurrences of the terms in a document, is that they have some limitations in extracting semantically exact indexes that represent the semantic content of a document. To address this issue, ...

Journal: Milletleraras 1977

2003
Wann-Yun Shieh Tien-Fu Chen Chung-Ping Chung

Inverted files are widely used to index documents in large-scale information retrieval systems. An inverted file consists of posting lists, which can be stored in either a document-identifier ascending order or a document-weight descending order. For an identifier-ascending-order posting list, retrieving ranked documents necessitates traversal of all postings, whereas for the weight-descending-o...

2005
Udo HAHN Kornél MARKÓ Stefan SCHULZ

We introduce a light-weight interlingua for a cross-language document retrieval system in the medical domain. It is composed of equivalence classes of semantically primitive, language-specific subwords which are clustered by interlingual and intralingual synonymy. Each subword cluster represents a basic conceptual entity of the language-independent interlingua. Documents, as well as queries, are...

Thesis: Islamic Azad University - Islamic Azad University, Pharmaceutical Sciences Branch - Faculty of Pharmacy 1393

Persian abstract: Drugs with a short half-life require several doses per day, which leads to problems such as forgotten doses, unstable blood levels, and so on; the need for modified release dosage forms is therefore clear. Among the various routes of drug delivery to the body, the oral route still holds an important place in pharmacy because of its particular advantages. A variety of dosage forms also...

Journal: CoRR 2015
Quan Liu Wu Guo Zhen-Hua Ling

This paper proposes an algorithm to improve the calculation of confidence measure for spoken term detection (STD). Given an input query term, the algorithm first calculates a measurement named document ranking weight for each document in the speech database to reflect its relevance with the query term by summing all the confidence measures of the hypothesized term occurrences in this document. ...

Journal: International Journal of Advanced Computer Science and Applications 2016
