Search results for: topic model

Number of results: 2231604

2016
Shuangyin Li, Rong Pan, Yu Zhang, Qiang Yang

It is natural to expect that the documents in a corpus will be correlated, and these correlations are reflected not only by the words but also by the observed tags in each document. Most previous works model this type of corpus, called a semi-structured corpus, without considering the correlations among the tags. In this work, we develop a Correlated Tag Learning (CTL) model for semi-s...

2011
Dae Il Kim, Erik B. Sudderth

Topic models are learned via a statistical model of variation within document collections, but are designed to extract meaningful semantic structure. Desirable traits include the ability to incorporate annotations or metadata associated with documents; the discovery of correlated patterns of topic usage; and the avoidance of parametric assumptions, such as manual specification of the number of topi...

2014
Ang Zhao, Xin Lin, Jing Yang

In this paper, a novel graph-based model (GBM) is proposed for topic detection. Unlike existing statistical methods, our proposed model considers more semantic factors, combining named entities and dependency relations between words derived from a dependency parse tree. In our model, a graph is constructed to represent words and their associations. By utilizing spectral clustering a...

2017
Guangxu Xun, Yaliang Li, Wayne Xin Zhao, Jing Gao, Aidong Zhang

Conventional correlated topic models are able to capture correlation structure among latent topics by replacing the Dirichlet prior with the logistic normal distribution. Word embeddings have been proven to be able to capture semantic regularities in language. Therefore, the semantic relatedness and correlations between words can be directly calculated in the word embedding space, for example, ...
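To make the claim above concrete, semantic relatedness between two words can be read directly off their embedding vectors via cosine similarity. The sketch below is a minimal illustration: the 3-dimensional "embeddings" are toy, hand-written vectors for demonstration only, not trained word vectors.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-d "embeddings" (illustrative only, not trained vectors).
emb = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.95],
}

# Related words should score higher than unrelated ones.
print(cosine(emb["cat"], emb["dog"]) > cosine(emb["cat"], emb["car"]))  # True
```

In a real correlated topic model built on embeddings, these pairwise similarities over trained vectors would feed into the topic correlation structure rather than being computed on hand-made toy vectors.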

2015
Jaimie Murdock, Colin Allen

Topic models remain a black box both for modelers and for end users in many respects. From the modelers’ perspective, many decisions must be made which lack clear rationales and whose interactions are unclear – for example, how many topics the algorithms should find (K), which words to ignore (aka the “stop list”), and whether it is adequate to run the modeling process once or multiple times, p...

2009
Pradipto Das, Rohini K. Srihari

Generating short multi-document summaries has received a lot of focus recently and is useful in many respects including summarizing answers to a question in an online scenario like Yahoo! Answers. The focus of this paper is to attempt to define a new probabilistic topic model that includes the semantic roles of the words in the document generation process. Words always carry syntactic and seman...

Journal: CoRR, 2017
Ramesh Nallapati, Igor Melnyk, Abhishek Kumar, Bowen Zhou

We present a new topic model that generates documents by sampling a topic for one whole sentence at a time, and generating the words in the sentence using an RNN decoder that is conditioned on the topic of the sentence. We argue that this novel formalism will help us not only visualize and model the topical discourse structure in a document better, but also potentially lead to more interpretabl...

2013
Qinjiao Mao, Boqin Feng, Shanliang Pan

In recommender systems, modeling user interest is a basic step toward understanding a user's personal features. Traditional methods mostly just treat the items that the target users navigated as their interests, which leaves the inherent information unclear to the system, so the recommendations are not intelligent enough. In this paper, we investigate the utility of the topic model called LDA for the task...
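For readers unfamiliar with how LDA could be applied here, the sketch below is a minimal collapsed Gibbs sampler for LDA in pure Python, where each "document" is the bag of words from the items a user navigated, and the inferred per-document topic proportions serve as that user's interest profile. Everything here (function names, hyperparameters, toy data) is illustrative, not the authors' implementation.

```python
import random

def lda_gibbs(docs, K, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    wid = {w: i for i, w in enumerate(vocab)}
    ndk = [[0] * K for _ in docs]       # doc-topic counts
    nkw = [[0] * V for _ in range(K)]   # topic-word counts
    nk = [0] * K                        # words assigned to each topic
    z = []                              # current topic of every token
    for d, doc in enumerate(docs):      # random initialization
        zs = []
        for w in doc:
            t = rng.randrange(K)
            zs.append(t)
            ndk[d][t] += 1
            nkw[t][wid[w]] += 1
            nk[t] += 1
        z.append(zs)
    for _ in range(iters):              # resample each token's topic
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t, v = z[d][i], wid[w]
                ndk[d][t] -= 1; nkw[t][v] -= 1; nk[t] -= 1
                weights = [(ndk[d][k] + alpha) * (nkw[k][v] + beta)
                           / (nk[k] + V * beta) for k in range(K)]
                t = rng.choices(range(K), weights=weights)[0]
                z[d][i] = t
                ndk[d][t] += 1; nkw[t][v] += 1; nk[t] += 1
    # Per-document topic proportions = the user's interest profile.
    theta = [[(ndk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
             for d, doc in enumerate(docs)]
    return theta, vocab, nkw

# Toy corpus: each "user" is the bag of words of the items they browsed.
docs = [
    ["camera", "lens", "tripod", "camera"],
    ["lens", "camera", "flash"],
    ["novel", "poetry", "novel", "essay"],
]
theta, vocab, nkw = lda_gibbs(docs, K=2)
```

A recommender could then compare users by the distance between their `theta` rows instead of by raw item overlap, which is the kind of "inherent information" the abstract alludes to.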

2013
Tomonari Masada, Atsuhiro Takasu

In this paper, we provide a revised inference for the correlated topic model (CTM) [3]. CTM was proposed by Blei et al. for modeling correlations among latent topics more expressively than latent Dirichlet allocation (LDA) [2] and has been attracting the attention of researchers. However, we have found that the variational inference of the original paper is unstable due to the near-singularity of the cova...
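For reference, the logistic-normal construction at the heart of CTM maps a Gaussian draw onto the topic simplex via a softmax. The sketch below uses a diagonal covariance purely for simplicity; the full CTM uses a dense covariance matrix, whose near-singularity is what destabilizes the original variational inference. All names and numbers are illustrative.

```python
import math
import random

def logistic_normal_sample(mu, sigma, rng):
    """Draw eta ~ N(mu, diag(sigma)^2), then softmax onto the topic simplex.

    Diagonal covariance is a simplifying assumption here; CTM proper draws
    from a full multivariate Gaussian with a dense covariance matrix.
    """
    eta = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
    mx = max(eta)                       # subtract max for numerical stability
    ex = [math.exp(e - mx) for e in eta]
    z = sum(ex)
    return [e / z for e in ex]

rng = random.Random(0)
theta = logistic_normal_sample([0.0, 1.0, -1.0], [0.5, 0.5, 0.5], rng)
print(all(p > 0 for p in theta), abs(sum(theta) - 1.0) < 1e-9)  # True True
```

Unlike a Dirichlet draw, the components of `eta` can covary through the Gaussian's covariance, which is exactly how CTM expresses correlations among topics.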

2015
Tong Wang, Vish Viswanath, Ping Chen

Topic models such as Latent Dirichlet Allocation (LDA) make the assumption that the topic assignments of different words are conditionally independent. In this paper, we propose a new model, the Extended Global Topic Random Field (EGTRF), to model non-linear dependencies between words. Specifically, we parse sentences into dependency trees and represent them as a graph, and assume the topic assignment of a wor...

[Chart: number of search results per year]