Search results for: topic model

Number of results: 2231604

2015
Lisa Posch Arnim Bleier Philipp Schaer Markus Strohmaier

In this paper, we present the Polylingual Labeled Topic Model, a model which combines the characteristics of the existing Polylingual Topic Model and Labeled LDA. The model accounts for multiple languages with separate topic distributions for each language while restricting the permitted topics of a document to a set of predefined labels. We explore the properties of the model in a two-language...

2012
Hugo Larochelle Stanislas Lauly

We describe a new model for learning meaningful representations of text documents from an unlabeled collection of documents. This model is inspired by the recently proposed Replicated Softmax, an undirected graphical model of word counts that was shown to learn a better generative model and more meaningful document representations. Specifically, we take inspiration from the conditional mean-fie...

2011
Claudiu Cristian Musat Julien Velcin Marian-Andrei Rizoiu Stefan Trausan-Matu

We propose a system which employs conceptual knowledge to improve topic models by removing unrelated words from the simplified topic description. We use WordNet to detect which topical words are not conceptually similar to the others and then test our assumptions against human judgment. Results obtained on two different corpora in different test conditions show that the words detected as unrela...

Journal: CoRR 2017
Wenlin Wang Zhe Gan Wenqi Wang Dinghan Shen Jiaji Huang Wei Ping Sanjeev Satheesh Lawrence Carin

We propose a Topic Compositional Neural Language Model (TCNLM), a novel method designed to simultaneously capture both the global semantic meaning and the local word-ordering structure in a document. The TCNLM learns the global semantic coherence of a document via a neural topic model, and the probability of each learned latent topic is further used to build a Mixture-of-Experts (MoE) language mo...

2016
Arseniy Ashuha Natalia V. Loukachevitch

A probabilistic topic model is a modern statistical tool for document collection analysis that allows extracting a number of topics in the collection and describes each document as a discrete probability distribution over topics. Classical approaches to statistical topic modeling can be quite effective in various tasks, but the generated topics may be too similar to each other or poorly interpr...

2008
Meng-Sung Wu Jen-Tzung Chien

Document modeling is important for document retrieval and categorization. The probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) are popular paradigms of document models where word/document correlations are inferred by latent topics. In PLSA and LDA, the unseen words and documents are not explicitly represented at the same time. Model generalization is constrain...

2012
Seppo Virtanen Yangqing Jia Arto Klami Trevor Darrell

Multi-modal data collections, such as corpora of paired images and text snippets, require analysis methods beyond single-view component and topic models. For continuous observations the current dominant approach is based on extensions of canonical correlation analysis, factorizing the variation into components shared by the different modalities and those private to each of them. For count data,...

2014
Maxim Rabinovich David M. Blei

Taddy (2013) proposed multinomial inverse regression (MNIR) as a new model of annotated text based on the influence of metadata and response variables on the distribution of words in a document. While effective, MNIR has no way to exploit structure in the corpus to improve its predictions or facilitate exploratory data analysis. On the other hand, traditional probabilistic topic models (like la...

2016
YongHeng Chen Yaojin Lin Hao Yue

A large number of electronic documents are labeled using human-interpretable annotations. High-efficiency text mining on such data sets requires a generative model that can flexibly comprehend the significance of observed labels while simultaneously uncovering topics within unlabeled documents. This paper presents a novel and generalized on-line labeled topic model (OLT) tracking the time developm...

Journal: JSW 2010
Xiaoyan Zhang Ting Wang

In topic tracking, a topic is usually described by several stories. How to represent a topic is a persistent and difficult problem in research on topic tracking. To emphasize the topic in stories, we provide an improved topic-based tf*idf weighting method to measure the topical importance of the features in the representation model. To overcome the topic drift problem and filter the nois...
