Introducing a very large dataset of handwritten Farsi digits and a study on their varieties
نویسندگان
چکیده
A very large dataset of handwritten Farsi digits is introduced. Binary images of 102,352 digits were extracted from about 12,000 registration forms of two types, filled by B.Sc. and senior high school students. These forms were scanned at 200 dpi with a high speed scanner. A method for finding variety of handwritten digits in a typical dataset is proposed. Based on this method, training and test subsets are provided to facilitate sharing of results among researchers as well as performance comparison. 2007 Elsevier B.V. All rights reserved.
منابع مشابه
The biologically inspired Hierarchical Temporal Memory
It is herein proposed a handwritten digit recognition system which biologically inspired of the large-scale structure of the mammalian neocortex. Hierarchical Temporal Memory (HTM) is a memory-prediction network model that takes advantage of the Bayesian belief propagation and revision techniques. In this article a study has been conducted to train a HTM network to recognize handwritten digits ...
متن کاملPersian Handwritten Digit Recognition Using Particle Swarm Probabilistic Neural Network
Handwritten digit recognition can be categorized as a classification problem. Probabilistic Neural Network (PNN) is one of the most effective and useful classifiers, which works based on Bayesian rule. In this paper, in order to recognize Persian (Farsi) handwritten digit recognition, a combination of intelligent clustering method and PNN has been utilized. Hoda database, which includes 80000 P...
متن کاملConnected Component Based Word Spotting on Persian Handwritten image documents
Word spotting is to make searchable unindexed image documents by locating word/words in a doc-ument image, given a query word. This problem is challenging, mainly due to the large numberof word classes with very small inter-class and substantial intra-class distances. In this paper, asegmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
متن کاملLearning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...
متن کاملA Study on Farsi Handwriting Styles for Online Recognition
Knowing varieties of writing a letter in a word or a subword in different handwriting styles is very beneficial in recognition specifically for online recognition. In this paper, TMU-OFS dataset consisting of 1000 frequent Farsi subwords is employed to study Farsi handwriting styles. The subwords are grouped based on their delayed strokes and their main bodies, separately. The handwriting style...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Pattern Recognition Letters
دوره 28 شماره
صفحات -
تاریخ انتشار 2007