Text segmentation: A topic modeling perspective
Similar resources
Topic Modeling and Classification of Cyberspace Papers Using Text Mining
The global cyberspace networks provide individuals with platforms to interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
Text Segmentation with Topic Modeling and Entity Coherence
This paper describes a system that uses entity and topic coherence to improve Text Segmentation (TS) accuracy. First, the Latent Dirichlet Allocation (LDA) algorithm was used to obtain topics for sentences in the document. We then performed entity mapping across a window in order to discover the transition of entities within sentences. We used the information obtained to support our LDA-based bo...
Semantic Text Segmentation and Sub-topic Extraction
Semantic text segmentation and sub-topic extraction divide the input text into coherent paragraphs and extract topics from them. This enables applications to extract relevant, meaningful data that could be useful in many text analysis tasks such as information retrieval and summarization. In this project we have combined the techniques of text tiling and latent semantic analysis and have come u...
Discriminative Topic Segmentation of Text and Speech
We explore automated discovery of topically coherent segments in speech or text sequences. We give two new discriminative topic segmentation algorithms which employ a new measure of text similarity based on word co-occurrence. Both algorithms function by finding extrema in the similarity signal over the text, with the latter algorithm using a compact support-vector based description of a window ...
Text Segmentation with Topic Models
This article presents a general method to use information retrieved from the Latent Dirichlet Allocation (LDA) topic model for Text Segmentation: Using topic assignments instead of words in two well-known Text Segmentation algorithms, namely TextTiling and C99, leads to significant improvements. Further, we introduce our own algorithm called TopicTiling, which is a simplified version of TextTil...
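The idea of comparing topic assignments rather than words across sentence gaps can be illustrated with a short sketch. This is not the TopicTiling algorithm from the article above; it is a minimal TextTiling-style illustration under stated assumptions: each sentence is already represented as a list of topic IDs (assumed to come from a separately trained LDA model), and the function names, the `window` size, and the `threshold` value are all hypothetical choices made for this example.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def gap_similarities(sentence_topics, window=2):
    """Similarity of topic counts in the blocks left and right of each gap."""
    sims = []
    for gap in range(1, len(sentence_topics)):
        left_block = sentence_topics[max(0, gap - window):gap]
        right_block = sentence_topics[gap:gap + window]
        left = Counter(t for sent in left_block for t in sent)
        right = Counter(t for sent in right_block for t in sent)
        sims.append(cosine(left, right))
    return sims


def depth_scores(sims):
    """TextTiling-style depth: how far each valley lies below its neighbouring peaks."""
    depths = []
    for i, s in enumerate(sims):
        left_peak = s
        j = i - 1
        while j >= 0 and sims[j] >= left_peak:  # climb to the peak on the left
            left_peak = sims[j]
            j -= 1
        right_peak = s
        j = i + 1
        while j < len(sims) and sims[j] >= right_peak:  # climb to the peak on the right
            right_peak = sims[j]
            j += 1
        depths.append((left_peak - s) + (right_peak - s))
    return depths


def segment(sentence_topics, window=2, threshold=0.5):
    """Return indices of sentences that start a new segment (assumed threshold)."""
    sims = gap_similarities(sentence_topics, window)
    depths = depth_scores(sims)
    return [i + 1 for i, d in enumerate(depths) if d > threshold]


# Hypothetical topic assignments: three sentences about topics 0/1, then three about 2/3.
example = [[0, 0, 1], [0, 0], [0, 1, 0], [2, 2, 3], [2, 3], [2, 2]]
boundaries = segment(example)
print(boundaries)
```

For this toy input the only deep valley in the similarity signal falls between the third and fourth sentences, so the sketch reports a single boundary at sentence index 3. In practice the threshold is usually derived from the distribution of depth scores rather than fixed by hand.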
Journal
Journal title: Information Processing & Management
Year: 2011
ISSN: 0306-4573
DOI: 10.1016/j.ipm.2010.11.008