Dynamic Rank Factor Model for Text Streams
Authors
Abstract
We propose a semi-parametric and dynamic rank factor model for topic modeling, capable of (i) discovering topic prevalence over time, and (ii) learning contemporary multi-scale dependence structures, providing topic and word correlations as a byproduct. The high-dimensional and time-evolving ordinal/rank observations (such as word counts), after an arbitrary monotone transformation, are well accommodated through an underlying dynamic sparse factor model. The framework naturally admits heavy-tailed innovations, capable of inferring abrupt temporal jumps in the importance of topics. Posterior inference is performed through straightforward Gibbs sampling, based on the forward-filtering backward-sampling algorithm. Moreover, an efficient data subsampling scheme is leveraged to speed up inference on massive datasets. The modeling framework is illustrated on two real datasets: the US State of the Union Address and the JSTOR collection from Science.
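As a rough illustration of the forward-filtering backward-sampling step mentioned in the abstract, the sketch below draws one latent state path for a univariate Gaussian local-level model. This is a deliberately simplified stand-in, not the paper's dynamic sparse factor model with rank likelihood: the model, the function name `ffbs_local_level`, and the parameters `q`, `r`, `m0`, `c0` are all assumptions made for this example.

```python
import numpy as np

def ffbs_local_level(y, q, r, m0=0.0, c0=1.0, rng=None):
    """Draw one state path from p(x_1..x_T | y) for the assumed toy model
        x_t = x_{t-1} + N(0, q),   y_t = x_t + N(0, r),   x_0 ~ N(m0, c0).
    Returns the sampled path plus the filtered means and variances."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    T = len(y)
    m = np.empty(T)  # filtered means      E[x_t | y_1..y_t]
    c = np.empty(T)  # filtered variances  Var[x_t | y_1..y_t]

    # Forward filtering: a scalar Kalman filter.
    m_prev, c_prev = m0, c0
    for t in range(T):
        a = m_prev              # one-step-ahead predicted mean
        p = c_prev + q          # one-step-ahead predicted variance
        k = p / (p + r)         # Kalman gain
        m[t] = a + k * (y[t] - a)
        c[t] = (1.0 - k) * p
        m_prev, c_prev = m[t], c[t]

    # Backward sampling: draw x_T, then x_t given x_{t+1} for t = T-1..1.
    x = np.empty(T)
    x[T - 1] = rng.normal(m[T - 1], np.sqrt(c[T - 1]))
    for t in range(T - 2, -1, -1):
        g = c[t] / (c[t] + q)                  # smoothing gain
        mean = m[t] + g * (x[t + 1] - m[t])
        var = c[t] * (1.0 - g)
        x[t] = rng.normal(mean, np.sqrt(var))
    return x, m, c

# Usage: one posterior draw of the latent path for a short series.
path, filt_mean, filt_var = ffbs_local_level(
    [3.0, 2.5, 4.0, 3.5], q=1.0, r=1.0, rng=np.random.default_rng(0)
)
```

Inside a Gibbs sampler, a draw like this would alternate with updates of the remaining parameters conditional on the sampled path; in this toy setting those would be `q` and `r`.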
Similar resources
A Bio-inspired Clustering Approach for Dynamic Document Distributed Analysis
Document clustering is a fundamental operation used in unsupervised document organization, automatic topic extraction and information retrieval. But most clustering technologies are limited in their application to static document collections. Intelligence analysts are currently overwhelmed by the tremendous amount of text information streams generated every day. There is a lack of comprehensive...
Transverse and longitudinal dynamic modeling of bimorph piezoelectric actuators with investigating the effect of vibrational modes
Bimorph piezoelectric cantilevered (BPC) actuators have recently received a great deal of attention in a variety of micro-electromechanical systems (MEMS) applications. Dynamic modeling of such actuators needs to be improved in order to enhance the control performance. Previous works have usually taken transv...
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in the data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Filtered Dynamic Indexing of Text Streams
The identification and indexing of textual features that occur frequently in text streams is vital for many real-world information retrieval applications. Previous research has shown that frequent indexes of text features can be constructed efficiently for static collections. We extend this research to allow the insertion of new documents to the index. The insertion of new documents introduces ...