A New Segmentation Algorithm for Online Handwritten Word Recognition in Persian Script
Authors
Abstract
The cursive nature of the Persian alphabet and the complex, convoluted rules of the script pose major challenges to both segmentation and recognition of Persian words. We propose a new segmentation algorithm for the main stroke of online Persian handwritten words. Using this segmentation, we present a perturbation method that generates artificial samples from handwritten words. Our recognition system is composed of three modules. The first module preprocesses the data: we propose a wavelet-based smoothing technique that improves recognition performance compared with the conventional, widely used technique. The second module segments each word into convex portions of its global shape, which we call Convex Curve Sectors (CCSs). The third module analyzes those CCSs and uses the resulting information for recognition, which is performed with the Dynamic Time Warping (DTW) technique. Using CCSs provides the DTW-based classifier with a compact word representation which makes comparison much
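To make the DTW-based matching step more concrete, here is a minimal sketch of textbook Dynamic Time Warping with nearest-template classification. It is not the authors' implementation: the function names `dtw_distance` and `classify`, the use of raw 2-D pen points as features (the paper compares CCS-based representations instead), and the Euclidean local cost are all assumptions made for illustration.

```python
from math import hypot, inf


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of (x, y) points.

    Assumption: Euclidean distance between points is the local cost.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template closest to `sample` under DTW.

    `templates` is a hypothetical dict mapping a label to a point sequence.
    """
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```

The table filled by `dtw_distance` has one cell per pair of sequence elements, so its cost grows with the product of the two sequence lengths. That is consistent with the abstract's point that a compact word representation benefits the DTW-based classifier.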
Similar resources
Online Persian Handwriting Recognition Using a Language Model and Reduced User Writing Rules
The joined-up, cursive form of Persian words, the immense variety of its scripts, and the different shapes that Persian letters take depending on their position in a word have made Persian handwriting recognition an intense challenge. The major weakness of most recognition methods is their inattention to sentence context, which leads to the use of a word with correct appea...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Dataset and Ground Truth for Handwritten Text in Four Different Scripts
In document image analysis (DIA), especially in handwritten document recognition, standard databases play significant roles in evaluating the performance of algorithms and comparing results obtained by different groups of researchers. The field of DIA with regard to Indo-Persian documents is still in its infancy compared to Latin script-based documents; as such, standard datasets are still not available in ...
Mixture of Experts for Persian handwritten word recognition
This paper presents the results of Persian handwritten word recognition based on Mixture of Experts technique. In the basic form of ME the problem space is automatically divided into several subspaces for the experts, and the outputs of experts are combined by a gating network. In our proposed model, we used Mixture of Experts Multi Layered Perceptrons with Momentum term, in the classification ...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
Automatic recognition of printed texts and manuscripts facilitates the entry of data into the computer and its digitization, and is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...