Search results for: text segmentation

Number of results: 227918

2012
Ashu Kumar Simpel Rani Jindal Bruno Taconet Manivannan Arivazhagan Harish Srinivasan Sargur Srihari Xiaojun Du Wumo Pan Tien. D. Bui Stéphane Nicolas Thierry Paquet Laurent Heutte Rajiv K. Sharma Amardeep Singh Naresh Garg M. K. Jindal

Text line segmentation is an essential pre-processing stage for handwriting recognition in many Optical Character Recognition (OCR) systems. It is an important step because inaccurately segmented text lines will cause errors in the recognition stage. Text line segmentation of handwritten documents is still one of the most complicated problems in developing a reliable OCR system. The nature of hand...

2008
Doina Tatar Andreea Diana Mihis Dana Lupsa

Summarization is the process of condensing a source text into a shorter version that preserves its information content ([2]). This paper presents some original methods for text summarization by extraction from a single source document, based on a particular intuition that has not been explored until now: the logical structure of a text. The summarization relies on an original linear segmentation algorithm w...

Journal: CoRR 2013
Grzegorz Chrupala

Learning word representations has recently seen much success in computational linguistics. However, assuming sequences of word tokens as input to linguistic analysis is often unjustified. For many languages word segmentation is a non-trivial task and naturally occurring text is sometimes a mixture of natural language strings and other character data. We propose to learn text representations dir...

2010
M Swamy Das

Segmentation is an important task in any OCR system. It separates text document images into lines, words and characters. The accuracy of an OCR system depends mainly on the segmentation algorithm used. Segmenting Telugu text is difficult compared with Latin-based languages because of its structural complexity and larger character set. It contains vowels, consonants and compound...

2011
Darko BRODIC

Text segmentation is a key element of the optical character recognition process. Hence, the testing procedure for text segmentation algorithms is of significant importance. Previous work deals mainly with a text database as a template, which is used both for testing and for evaluating the text segmentation algorithm. However, because of inconsistencies in this process, some meth...

2000
Andrea Weber

This study investigates the influence of both phonotactic and acoustic cues on the segmentation of spoken English. Listeners detected embedded English words in nonsense sequences (word spotting). Words aligned with phonotactic boundaries were easier to detect than words without such alignment. Acoustic cues to boundaries could also have signaled word boundaries, especially when word onsets lack...

2017
Daisuke Kawahara Yuta Hayashibe Hajime Morita Sadao Kurohashi

This paper presents a joint model for morphological and dependency analysis based on automatically acquired lexical knowledge. This model takes advantage of rich lexical knowledge to simultaneously resolve word segmentation, POS, and dependency ambiguities. In our experiments on Japanese, we show the effectiveness of our joint model over conventional pipeline models.

2004
Laurianne Sitbon Patrice Bellot

This paper presents an empirical comparison between different methods and tools for segmenting texts. After presenting segmentation tools, and more specifically linear segmentation algorithms, we present a comparison of these methods on both French and English text corpora. This evaluation points out that the performance of each method heavily relies on the topic of the documents, and the numb...

2012
Christopher C. Heffner Laura C. Dilley J. Devin McAuley Mark A. Pitt

Christopher C. Heffner, Laura C. Dilley, J. Devin McAuley, and Mark A. Pitt Department of Linguistics and Germanic, Slavic, Asian and African Languages, Michigan State University, East Lansing, MI, USA Department of Psychology, Michigan State University, East Lansing, MI, USA Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI, USA Department of Psych...

2006
Richard Forsyth Shaaron Ainsworth David Clarke Pat Brundell Claire O’Malley

Social scientists face an overload of digitized information. In particular, they must often spend inordinate amounts of time coding and analyzing transcribed speech. This paper describes a study, in the field of learning science, of the feasibility of semi-automatically coding and scoring verbal data. Transcripts from 48 individual learners comprising 2 separate data sets of 44,000 and 23,000 w...
