Search results for: text segmentation

Number of results: 227918

2011
Naresh Kumar Garg Lakhwinder Kaur M. K. Jindal

Optical Character Recognition (OCR) is the process of recognizing handwritten or printed scanned text with the help of a computer. Segmentation is a very important stage of any text recognition system. Problems in segmentation can lead to a decrease in the segmentation rate and hence the recognition rate. A good segmentation technique can improve the recognition rate. This paper deals with the hazards ...

Journal: IEICE Transactions 2008
Bilan Zhu Masaki Nakagawa

This paper describes a method of producing segmentation point candidates for on-line handwritten Japanese text by a support vector machine (SVM) to improve text recognition. This method extracts multi-dimensional features from on-line strokes of handwritten text and applies the SVM to the extracted features to produce segmentation point candidates. We incorporate the method into the segmentati...

2006
Bilan Zhu Junko Tokuno Masaki Nakagawa

This paper describes a method of producing segmentation point candidates for on-line handwritten Japanese text by a support vector machine (SVM) to improve text recognition. This method extracts multi-dimensional features from on-line strokes of handwritten text and applies the SVM to the extracted features to produce segmentation point candidates. We incorporate the method into the segmentati...

2011
D. Brodic

Text line segmentation is a key element of the optical character recognition process. Hence, testing text line segmentation algorithms has substantial relevance. All previously proposed testing methods deal mainly with a text database as a template. They are used for testing as well as for the evaluation of the text segmentation algorithm. In this manuscript, methodology for the eval...

Journal: JASIST 2005
Christopher C. Yang Kar Wing Li

The authors propose a heuristic method for Chinese automatic text segmentation based on a statistical approach. This method is developed based on statistical information about the association among adjacent characters in Chinese text. Mutual information of bi-grams and significant estimation of tri-grams are utilized. A heuristic method with six rules is then proposed to determine the segmentat...
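The association measure mentioned in this abstract can be illustrated with a minimal sketch. It estimates pointwise mutual information (PMI) for adjacent characters from a small corpus and places a boundary wherever the score falls below a threshold. This is not the authors' six-rule heuristic, whose details are cut off above. The function names `adjacent_pmi` and `segment`, the threshold of 0.0, and the toy Latin-letter corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def adjacent_pmi(corpus):
    """Build a PMI scorer for adjacent character pairs from an iterable of strings."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(a, b):
        # Pairs never seen together get the lowest possible score.
        if bigrams[(a, b)] == 0:
            return float("-inf")
        p_ab = bigrams[(a, b)] / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        return math.log2(p_ab / (p_a * p_b))

    return pmi


def segment(text, pmi, threshold=0.0):
    """Split text wherever the PMI of two adjacent characters is below threshold."""
    if not text:
        return []
    words = [text[0]]
    for a, b in zip(text, text[1:]):
        if pmi(a, b) >= threshold:
            words[-1] += b      # strong association: extend the current word
        else:
            words.append(b)     # weak association: start a new word
    return words


# Toy corpus (hypothetical); Latin letters stand in for characters.
pmi = adjacent_pmi(["ab", "ab", "cd", "cd"])
print(segment("abcd", pmi))
```

In this toy corpus "a" is always followed by "b" and "c" by "d", so those pairs score high and stay together, while the unseen pair ("b", "c") triggers a boundary. A real segmenter would need smoothing for unseen pairs and, as the abstract indicates, additional statistics such as tri-gram estimates.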

2010
R. Allen Wilkinson

Image segmentation, the partitioning of an image into meaningful parts, is a major concern of any computer vision system. The meaningful parts of a text image are lines of text, words and characters. In this paper, the segmentation of pages of text into lines of text and lines of text into characters on a parallel machine will be examined. Using a parallel machine for text image segmentation al...

2015
Dakui Zhang Yu Mao Yang Liu Hanshi Wang Chuyuan Wei Shiping Tang

A human-labeled corpus is indispensable for training supervised word segmenters. However, labeling a corpus manually is time-consuming and labor-intensive. While typing Chinese text with Pinyin, people usually need to type "space" or numeric keys to choose words because of homophones, which can be viewed as a cue for segmentation. We argue that such a process can be used to ...

Journal: Pattern Recognition Letters 1996
Stefano Messelodi Carla Maria Modena

We present an algorithm for text segmentation and recognition mainly suited for complex problems where many merged characters are present. The basic idea is to define a distance, between lines of text and strings, which allows us to postpone the final decision about text segmentation and character classification until the contextual analysis is performed. The distance takes into account both th...

2010
Huidan Liu Weina Zhao Minghua Nuo Li Jiang Jian Wu Yeping He

Tibetan word segmentation is essential for Tibetan information processing. At present, people mainly segment Tibetan words with the basic dictionary-based machine matching method, because no segmented Tibetan corpus is available for training. However, the dictionary-based method is not well suited to identifying Tibetan numbers. This paper studies...

2014
Shinsuke Mori Graham Neubig

In this paper, we investigate the relative effect of two strategies of language resource additions to the word segmentation problem and part-of-speech tagging problem in Japanese. The first strategy is adding entries to the dictionary and the second is adding annotated sentences to the training corpus. The experimental results showed that the annotated sentence addition to the training corpus is...
