Recognizing Degraded Handwritten Characters

Authors

  • Markus Diem
  • Robert Sablatnig
Abstract

This report proposes a character recognition system for degraded manuscript documents discovered at St. Catherine’s Monastery. In contrast to state-of-the-art OCR systems, it requires no early decision in the form of image binarization. Instead, an object recognition methodology is adapted to the recognition of ancient manuscripts. Interest points are extracted, and local descriptors are computed at them. These descriptors are classified directly using an SVM with one-against-all tests. To localize characters, interest points that represent characters are selected by means of a scale distribution histogram. The remaining interest points are then clustered using k-means, which is initialized with the previously selected interest points. Finally, a voting scheme accumulates the local descriptors’ class probabilities into a probability histogram for each character cluster. This histogram does not only allow a hard decision; it can also be presented to human experts, who can decide the character class of hardly readable characters according to the probabilities obtained. The system was evaluated on three datasets: a synthetic dataset with Latin script, degraded characters, and real-world data. It achieves an F0.5 score of 0.77 on the real-world data.
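The clustering and voting steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes interest points are plain 2-D positions, that the per-descriptor class probabilities have already been produced by some classifier (hard-coded here in place of the SVM), and that the seed points have already been selected (the scale distribution histogram step is not modelled). All function and variable names are hypothetical. The F-beta helper uses the standard definition, with beta = 0.5 weighting precision more heavily than recall.

```python
from math import dist

def seeded_kmeans(points, seeds, iterations=10):
    """Cluster 2-D points with k-means, starting from given seed centers."""
    centers = [tuple(s) for s in seeds]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

def vote(labels, probabilities, n_clusters):
    """Sum per-descriptor class probabilities into one normalized
    probability histogram per cluster."""
    n_classes = len(probabilities[0])
    sums = [[0.0] * n_classes for _ in range(n_clusters)]
    for label, probs in zip(labels, probabilities):
        for c, p in enumerate(probs):
            sums[label][c] += p
    histograms = []
    for h in sums:
        total = sum(h)
        histograms.append([v / total for v in h] if total else h)
    return histograms

def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; returns 0.0 when it is undefined."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0

# Hypothetical toy data: six interest points forming two characters.
points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0), (10, 1)]
seeds = [(0, 0), (10, 0)]  # assumed to be pre-selected character points
# Assumed classifier output: P(class "a"), P(class "b") per descriptor.
probabilities = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7],
                 [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]

labels, centers = seeded_kmeans(points, seeds)
histograms = vote(labels, probabilities, n_clusters=len(seeds))
for k, h in enumerate(histograms):
    print(f"cluster {k}: class probabilities {h}")
```

Keeping the full histogram per cluster, rather than only the winning class, mirrors the point made in the abstract: the probabilities themselves can be shown to a human expert when a character is hard to read.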

Similar resources

Robust Feature Extraction Based on Run-Length Compensation for Degraded Handwritten Character Recognition

Conventional features are robust for recognizing either deformed or degraded characters. This paper proposes a feature extraction method that is robust for both of them. Run-length compensation is introduced for extracting approximate directional run-lengths of strokes from degraded handwritten characters. This technique is applied to the conventional feature vector based on directional runleng...

Category-Dependent Feature Extraction for Recognition of Degraded Handwritten Characters

Conventional methods for recognizing multiple fonts and handwriting are generally robust against deformation but are weak against degradation. This paper proposes a category-dependent feature extraction method that resists both deformation and degradation. Our proposed method compares an input pattern with the template of each category and estimates the degree of degradation of the input patter...

Machine Recognition of Hand Written Characters using Neural Networks

Even today, in the twenty-first century, handwritten communication retains its place and is used globally in daily life as a means of communicating and of recording information to be shared with others. The challenges in handwritten character recognition lie wholly in the variation and distortion of handwritten characters, since different people may use different style of hand...

Optimizing Feature Selection for Recognizing Handwritten Arabic Characters

Recognition of characters greatly depends upon the features used. Several features of the handwritten Arabic characters are selected and discussed. An off-line recognition system based on the selected features was built. The system was trained and tested with realistic samples of handwritten Arabic characters. Evaluation of the importance and accuracy of the selected features is made. The recog...

Performance Comparison of Different Image Sizes for Recognizing Unconstrained Handwritten Tamil Characters using SVM

This study describes a system for recognizing offline handwritten Tamil characters using a Support Vector Machine (SVM). Data samples are collected from different writers on A4-sized documents. They are scanned using a flatbed scanner at a resolution of 300 dpi and stored as grayscale images. Various preprocessing operations are performed on the digitized image to enhance the quality of the ima...

Publication date: 2010