Context driven text segmentation and recognition
Authors
Abstract
We present an algorithm for text segmentation and recognition suited mainly to difficult problems in which many merged characters are present. The basic idea is to define a distance between lines of text and strings that lets us postpone the final decision about text segmentation and character classification until the contextual analysis is performed. The distance takes into account both the segmentation hypotheses generated by a text segmentation module and the character classification hypotheses produced by a probabilistic classifier. The algorithm has been tested by reading text on book covers; the experimental results highlight the quality of the proposed solution.
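The abstract does not give the definition of the distance, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes the line has been over-segmented into primitive pieces, that each segmentation hypothesis is a span of consecutive pieces with a segmentation cost, and that the classifier returns a probability for each character on each span. The cost formula (segmentation cost minus log probability) and the names `line_to_string_distance`, `best_match`, and `hypotheses` are hypothetical choices made for illustration.

```python
import math


def line_to_string_distance(n_pieces, hypotheses, target):
    """Cheapest way to read `target` from a lattice of segmentation hypotheses.

    `hypotheses` maps a span (start, end) of primitive pieces, start < end,
    to a pair (segmentation_cost, {char: probability}).
    Assumed cost of reading one character on one span:
        segmentation_cost - log(probability)
    Returns math.inf if no segmentation can produce `target`.
    """
    m = len(target)
    # best[i][k]: minimum cost of covering pieces [0, i) with target[:k]
    best = [[math.inf] * (m + 1) for _ in range(n_pieces + 1)]
    best[0][0] = 0.0
    # Spans sorted by start: every span ending at `start` is processed
    # before any span beginning there, so best[start] is already final.
    for (start, end), (seg_cost, probs) in sorted(hypotheses.items()):
        for k in range(m):
            if best[start][k] == math.inf:
                continue
            p = probs.get(target[k], 0.0)
            if p <= 0.0:
                continue
            cost = best[start][k] + seg_cost - math.log(p)
            if cost < best[end][k + 1]:
                best[end][k + 1] = cost
    return best[n_pieces][m]


def best_match(n_pieces, hypotheses, lexicon):
    """Contextual step: choose the lexicon string closest to the line."""
    return min(
        lexicon,
        key=lambda s: line_to_string_distance(n_pieces, hypotheses, s),
    )


# Toy line with two pieces that may be two characters ("rn") or one merged
# character ("m"). All costs and probabilities are made up.
hypotheses = {
    (0, 1): (0.3, {"r": 0.8, "t": 0.2}),
    (1, 2): (0.3, {"n": 0.9, "h": 0.1}),
    (0, 2): (0.1, {"m": 0.7, "w": 0.3}),
}

for word in ["rn", "m", "tn"]:
    print(word, line_to_string_distance(2, hypotheses, word))
```

In this sketch neither the segmentation nor the character labels are fixed in advance: both are chosen jointly, per candidate string, only when the lexicon is consulted in `best_match`. That is one way to realize the postponed decision the abstract describes.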
Related resources
Effective Training of a Neural Network Character Classifier for Word Recognition
We have combined an artificial neural network (ANN) character classifier with context-driven search over character segmentation, word segmentation, and word recognition hypotheses to provide robust recognition of hand-printed English text in new models of Apple Computer's Newton MessagePad. We present some innovations in the training and use of ANNs as character classifiers for word recognition...
Chinese Word Segmentation and Named Entity Recognition Based on a Context-Dependent Mutual Information Independence Model
This paper briefly describes our system in the third SIGHAN bakeoff on Chinese word segmentation and named entity recognition. This is done via a word chunking strategy using a context-dependent Mutual Information Independence Model. Evaluation shows that our system performs well on all the word segmentation closed tracks and achieves very good scalability across different corpora. It also show...
Segmentation of On-Line Freely Written Japanese Text Using SVM for Improving Text Recognition
This paper describes a method of producing segmentation point candidates for on-line handwritten Japanese text by a support vector machine (SVM) to improve text recognition. This method extracts multi-dimensional features from on-line strokes of handwritten text and applies the SVM to the extracted features to produce segmentation point candidates. We incorporate the method into the segmentati...
Segmentation of On-Line Handwritten Japanese Text Using SVM for Improving Text Recognition
This paper describes a method of producing segmentation point candidates for on-line handwritten Japanese text by a support vector machine (SVM) to improve text recognition. This method extracts multi-dimensional features from on-line strokes of handwritten text and applies the SVM to the extracted features to produce segmentation point candidates. We incorporate the method into the segmentati...
Combining Neural Networks and Context-Driven Search for On-line, Printed Handwriting Recognition in the Newton
MESSAGEPAD and EMATE. Combining an artificial neural network (ANN) as a character classifier with a context-driven search over segmentation and word-recognition hypotheses provides an effective recognition system. Long-standing issues relative to training, generalization, segmentation, models of context, probabilistic formalisms, and so on, need to be resolved, however, to achieve excellent per...
Journal title:
- Pattern Recognition Letters
Volume: 17
Publication date: 1996