Search results for: text segmentation

Number of results: 227918

2007
Wirote Aroonmanakun

This paper discusses problems of word and sentence segmentation in Thai. Disagreements on word segmentation are caused mostly by compound words. To set a standard resource and tool for word segmentation, we suggest that only simple words and true compound words should be segmented in the process of word segmentation. Other compounds can be grouped later by the same means as multiword identific...

2014
Xiaopei Liu Zhaoyang Lu Jing Li Wei Jiang

This paper presents a new scheme for character detection and segmentation from natural scene images. In the detection stage, stroke edge is employed to detect possible text regions, and some geometrical features are used to filter out obvious non-text regions. Moreover, in order to combine unary properties with pairwise features into one framework, a graph model of candidate text regions is se...

2018
Omri Koshorek Adir Cohen Noam Mor Michael Rotman Jonathan Berant

Text segmentation, the task of dividing a document into contiguous segments based on its semantic structure, is a longstanding challenge in language understanding. Previous work on text segmentation focused on unsupervised methods such as clustering or graph search, due to the paucity in labeled data. In this work, we formulate text segmentation as a supervised learning problem, and present a l...

2015
Bogdan Ludusan Gabriel Synnaeve Emmanuel Dupoux

It is well known that prosodic information is used by infants in early language acquisition. In particular, prosodic boundaries have been shown to help infants with sentence and word-level segmentation. In this study, we extend an unsupervised method for word segmentation to include information about prosodic boundaries. The boundary information used was either derived from oracle data (handanno...

2001
Gurpreet Singh Lehal Chandan Singh

This paper describes a technique for text segmentation of machine-printed Gurmukhi script documents. Research on the segmentation of Gurmukhi script faces major problems, mainly related to unique characteristics of the script such as connectivity of characters on the headline, two or more characters in a word having intersecting minimum bounding rectangles, multicomponent characters, t...

2013
Xiaoyong Wang

With the use of massive data, retrieving the target text quickly becomes a technical limitation. According to research, sentence segmentation is the key to obtaining Chinese target information. Segmenting English text is relatively simple because spaces serve as delimiters, whereas Chinese segmentation is much more difficult. So in this ...

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing 2019
