Ph.D. Dissertation Proposal: Handwriting Recognition for Document Images Captured by Portable Cameras

Authors

  • Sijun Kang
  • Guizhen Yang
Abstract

The increasing availability of high-performance, low-priced, portable digital imaging devices has created a tremendous opportunity in document image acquisition, supplementing traditional scanning with flatbed scanners and mounted cameras. However, the portability of the camera presents new challenges for document image analysis and recognition in general, and for handwriting recognition in particular. Problems are posed by low resolution, focus blur, uneven lighting, and warping distortion. Perspective distortion is one major factor that can cause the recognition accuracy of algorithms designed under the assumption of traditional scanning to drop significantly. Features considered robust under the traditional scanning scenario often lose distinguishing power, while some features that were not considered before become attractive. In this dissertation, the problems of perspective distortion caused by camera-based document imaging will be systematically studied, and new algorithms will be developed for feature evaluation and classifier construction. We will study, both theoretically and by implementation, the underpinnings of the design of a high-performance, lexicon-driven offline handwriting recognizer that can adjust automatically for perspective distortion. Specifically, (i) a new feature evaluation measure will be introduced to quantify the distinguishing power of features, (ii) a training methodology for automatically learning the distortion parameters will be proposed, and (iii) a dynamic feature selection strategy based on perplexity and correlation will be proposed to select only the subset of features (from a large set) that exhibit high discriminative power given the automatically computed parameters of perspective distortion. A prototype “perspective distortion independent” recognizer will be built and tested on a dataset of historical document images, a choice motivated by two considerations. First, historical documents are usually fragile and must be subjected to minimal handling during the digitization process; this constraint makes portable digital cameras the digitization device of choice. Second, historical documents are predominantly handwritten, thus offering a rich source of data to test the algorithms. It is expected that the proposed research will make unique contributions to the document image processing, image enhancement, and dynamic feature selection parts of handwriting recognition.
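As an illustration of why perspective distortion can degrade features tuned for flatbed scans, the sketch below models the distortion as a planar homography, which is a standard model for a camera viewing a flat page. It is not the method proposed in the dissertation: the function names and the matrix H_TILT are hypothetical, and its values are chosen only for illustration. It shows that a horizontal span of fixed width on the page shrinks in the image as it moves toward the far edge, so width-based features stop being comparable across a tilted page.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to image coordinates.
    return (xh / w, yh / w)


# Hypothetical homography: identity plus a small perspective term in the
# bottom row, mimicking a camera tilted about the horizontal axis.
H_TILT = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.001, 1.0],
]


def span_width(H, y, x_left=0.0, x_right=100.0):
    """Image-space width of a horizontal span located at page row y."""
    left = apply_homography(H, (x_left, y))
    right = apply_homography(H, (x_right, y))
    return right[0] - left[0]


# The same 100-unit span measured at two different rows of the page.
top_width = span_width(H_TILT, y=0.0)
bottom_width = span_width(H_TILT, y=500.0)
print(top_width, bottom_width)
```

Under this assumed matrix the span at row 500 comes out narrower than the span at row 0, even though both have the same width on the page. A recognizer that learns the distortion parameters could in principle invert such a mapping or select features that are less sensitive to it.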

Similar resources

Recognition of Sequence of Print and Ink Strokes: Investigation the Effect of Handwriting Pressure, Hue of Ink, Printer and Paper Type

With the introduction of digital techniques, forensic document examiners have been encouraged to work with better accuracy in non-destructive ways. The aim of this study was to present a non-destructive, accessible, affordable, user-friendly, portable, and easy technique for determining the order of crossing lines of ink strokes and printed text. The intersections of LaserJet and In...

A New Method for Shading Removal and Binarization of Documents Acquired with Portable Digital Cameras

Photo documents, that is, documents digitized with portable digital cameras, are often affected by non-uniform shading. This paper proposes a new method to remove shading from document images captured with digital cameras, followed by a new binarization algorithm. The method works automatically with images of different resolutions and lighting patterns without any parameter adjustment. The pro...

Skew Detection from Natural Scene Images: A Review

Natural scene images are generally captured with portable devices such as mobile phone cameras. Scene images contain text information as part of the captured scene. Scene image text is more difficult to process than document text because of the complexity of the scene and open environment conditions. Scene images usually suffer from skew deformation due to inherent nature of portable capturing de...

PhotoDoc: A Toolbox for Processing Document Images Acquired Using Portable Digital Cameras

This paper introduces PhotoDoc, a software toolbox designed to process document images acquired with portable digital cameras. PhotoDoc was developed as an ImageJ plug-in. It performs border removal, perspective and skew correction, and image binarization. PhotoDoc interfaces with Tesseract, an open-source optical character recognizer originally developed by HP and distributed by Google.

Progress in Camera-Based Document Image Analysis

The increasing availability of high performance, low priced, portable digital imaging devices has created a tremendous opportunity for supplementing traditional scanning for document image acquisition. Digital cameras attached to cellular phones, PDAs, or as standalone still or video devices are highly mobile and easy to use; they can capture images of any kind of document including very thick ...

Publication date: 2005