Efficient Word Segmentation Driven by Unconstrained Handwritten Phrase Recognition

Authors

  • Jaehwa Park
  • Venu Govindaraju
  • Sargur N. Srihari
Abstract

An efficient system that finds the best match between an input image and a lexicon is presented. To capture the writing style of spacing between words and characters, prime stroke analysis based on statistical methods is introduced. A method for estimating a bound on the number of characters without actual recognition is also presented. For system efficiency, classified groups of word segments and an eligible subset of the lexicon are generated as hypotheses before actual recognition. The hypotheses are verified and ordered by a lexicon-driven word recognizer. We have tested our approach on street name recognition/interpretation for the US mail stream. Experimental results are encouraging.

Similar resources

A lexicon-driven approach for optimal segment combination in off-line recognition of unconstrained handwritten Korean words

We propose a new method for off-line recognition of unconstrained handwritten words consisting of Korean and numeric characters. To overcome the difficulty in separating touching characters, we adopt an over-segmentation strategy. Given a slice of the input word image, we find the optimal segment combination using a lexicon-driven word scoring technique and a nearest-neighbor classifier. The optimal...

A Lexicon Driven Approach for Off-line Recognition of Unconstrained Handwritten Korean Words

We propose a new method for the recognition of unconstrained handwritten words consisting of Korean and numeric characters. To overcome the difficulty in separating touching characters, we adopt an oversegmentation technique and we find the optimal segment combination using a lexicon-driven word scoring technique and a nearest neighbor classifier. The optimal combination gives the final segment...

Word segmentation of off-line handwritten documents

Word segmentation is the most critical pre-processing step for any handwritten document recognition/retrieval system. This paper describes an approach to separate a line of unconstrained (written in a natural manner) handwritten text into words. When the writing style is unconstrained, recognition of individual components may be unreliable so they must be grouped together into word hypotheses, ...

Offline handwritten Amharic word recognition

This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of constituent characters; in the second, HMMs of constituent characters are concatenated to form a word model. In both cases, the features used for training and recognition are a set of primitive strokes and their...

Review: A Literature Survey on Text Segmentation in Handwritten Punjabi Documents

Gurumukhi script, used for the Punjabi language, is a two-dimensional composition of symbols with connected and disconnected diacritics. Handwritten Gurumukhi script has complexities such as connected and overlapped text lines, words and characters, which are among the foremost sources of errors during the recognition process. Text segmentation is a challenging job in unconstrained writer indep...

Publication date: 1999