text segmentation

Search results for: text segmentation

Number of results: 227918

Hierarchical Topic Structuring: From Dense Segmentation to Topically Focused Fragments via Burst Analysis

2015

Anca-Roxana Simon Pascale Sébillot Guillaume Gravier

Topic segmentation traditionally relies on lexical cohesion measured through word re-occurrences to output a dense segmentation, either linear or hierarchical. In this paper, a novel organization of the topical structure of textual content is proposed. Rather than searching for topic shifts to yield dense segmentation, we propose an algorithm to extract topically focused fragments organized in ...
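The traditional approach this abstract contrasts against, dense linear segmentation driven by word re-occurrence, can be illustrated with a minimal sketch. This is not the authors' burst-analysis algorithm. The function names, the window size, and the threshold are all assumptions chosen for illustration: each gap between sentences is scored by the cosine similarity of the word counts on either side, and a boundary is placed wherever that cohesion score falls below the threshold.

```python
import math
import re
from collections import Counter


def bag_of_words(sentences):
    """Count lowercase word tokens over a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z]+", sentence.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cohesion_scores(sentences, window=2):
    """Score the gap before each sentence i >= 1 by comparing the
    `window` sentences on its left with the `window` on its right."""
    scores = []
    for gap in range(1, len(sentences)):
        left = bag_of_words(sentences[max(0, gap - window):gap])
        right = bag_of_words(sentences[gap:gap + window])
        scores.append(cosine(left, right))
    return scores


def segment(sentences, window=2, threshold=0.1):
    """Return the indices i such that a topic boundary is placed
    immediately before sentence i (hypothetical threshold rule)."""
    scores = cohesion_scores(sentences, window)
    return [gap for gap, score in enumerate(scores, start=1)
            if score < threshold]


sentences = [
    "The cat sat on the mat.",
    "The cat chased the mouse.",
    "Stocks fell sharply today.",
    "Stocks and bonds fell again.",
]
print(segment(sentences))  # [2]: a boundary before the third sentence
```

A fixed threshold keeps the sketch short; practical systems usually smooth the score curve and pick boundaries at its deepest local minima instead.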

語料庫統計值與全球資訊網統計值之比較：以中文斷詞應用為例 (Comparison of Corpus Statistics and Web Statistics: An Application to Chinese Word Segmentation) [In Chinese]

2004

Hsiao-Ching Lin Hsin-Hsi Chen

"Less is More" in Bayesian Word Segmentation: When cognitively plausible learners outperform the ideal

2012

Lawrence Phillips Lisa Pearl

Purely statistical models have accounted for infants' early ability to segment words out of fluent speech, with Bayesian models performing best (Goldwater et al. 2009). Yet these models often incorporate unlikely assumptions, such as infants having unlimited processing and memory resources and knowing the full inventory of phonemes in their native language. Following Pearl, et al. (2011), we ex...

Developing the modelling of Swedish prosody in spontaneous dialogue

1996

Gösta Bruce Marcus Filipsson Johan Frid Björn Granström Kjell Gustafson Merle Horne David House Birgitta Lastow Paul Touati

The main goal of our current research is the development of the Swedish prosody model. In our analysis of discourse and dialogue intonation we are exploiting model-based resynthesis. By comparing synthesized default and fine-tuned pitch contours for dialogues under study we are able to isolate relevant intonation patterns. This analysis of intonation is related to an independent modelling of to...

Detection of Laughter-in-Interaction in Multichannel Close-Talk Microphone Recordings of Meetings

2008

Kornel Laskowski Tanja Schultz

Laughter is a key element of human-human interaction, occurring surprisingly frequently in multi-party conversation. In meetings, laughter accounts for almost 10% of vocalization effort by time, and is known to be relevant for topic segmentation and the automatic characterization of affect. We present a system for the detection of laughter, and its attribution to specific participants, which re...

An iterative topic segmentation algorithm with intra-content term weighting (Segmentation thématique : processus itératif de pondération intra-contenu) [in French]

2013

Abdessalam Bouchekif Géraldine Damnati Delphine Charlet

© ATALA

CoMiC: Exploring Text Segmentation and Similarity in the English Entrance Exams Task

2015

Ramon Ziai Björn Rudzewitz

This paper describes our contribution to the English Entrance Exams task of CLEF 2015, which requires participating systems to automatically solve multiple choice reading comprehension tasks. We use a combination of text segmentation and different similarity measures with the aim of exploiting two observed aspects of tests: 1) the often linear relationship between reading text and test question...

Second-Order Cohesion

Journal: Computational Intelligence 2000

Stefan Kaufmann

Similarity in contextual behavior between words is considered a source of “lexical cohesion,” which is otherwise hard to measure or quantify. Such contextual similarity is used by an implementation for text segmentation, the VecTile system, which uses precompiled vector representations of words to produce similarity curves over texts. The performance of this system is shown to improve over that...

Building a Word Segmenter for Sanskrit Overnight

Journal: CoRR 2018

Vikas Reddy Amrith Krishna Vishnu Dutt Sharma Prateek Gupta Vineeth M. R Pawan Goyal

There is an abundance of digitised texts available in Sanskrit. However, the word segmentation task in such texts is challenging due to the issue of Sandhi. In Sandhi, words in a sentence often fuse together to form a single chunk of text, where the word delimiter vanishes and sounds at the word boundaries undergo transformations, which is also reflected in the written text. Here, we propose an a...

Discovering Chinese Words from Unsegmented Text

1999

Xianping Ge Wanda Pratt Padhraic Smyth

In English written text, words are separated by spaces, but in written Chinese text, there are no such separators between words. (See Figure 1.) Thus, effective information retrieval of Chinese text first requires good word segmentation. In this paper, we investigate an efficient algorithm to discover the words and their occurrence probabilities from a corpus of unsegmented text without using a...
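To make the role of word occurrence probabilities concrete, here is a minimal sketch of how such probabilities can be used to segment unspaced text once they are known. It does not reproduce the paper's method for discovering the words and probabilities from a corpus. The function name `segment_words`, the `max_len` limit, and the toy lexicon are hypothetical: the code uses dynamic programming to find the split whose words have the highest product of unigram probabilities.

```python
import math


def segment_words(text, word_probs, max_len=4):
    """Return the split of `text` into lexicon words that maximizes
    the product of unigram probabilities, or None if no split exists."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)          # back[i]: start of the last word in that split
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_probs and best[start] > -math.inf:
                score = best[start] + math.log(word_probs[word])
                if score > best[end]:
                    best[end] = score
                    back[end] = start
    if best[n] == -math.inf:
        return None  # the lexicon cannot cover the whole text
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Toy lexicon with made-up probabilities (illustrative only).
lexicon = {"a": 0.1, "b": 0.1, "ab": 0.3, "c": 0.2, "bc": 0.05}
print(segment_words("abc", lexicon))  # ['ab', 'c']
```

Working in log-probabilities avoids numeric underflow on long inputs, and `max_len` bounds the inner loop by the longest word the lexicon is expected to contain.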
