Attention-based CNN-ConvLSTM for Handwritten Arabic Word Extraction
Abstract
Word extraction is one of the most critical steps in handwriting recognition systems. It is challenging for many reasons, such as the variability of writing styles, touching and overlapping characters, skew, and the presence of diacritics, ascenders, and descenders. In this work, we propose a deep-learning-based approach for Arabic word extraction. We used an Attention-based CNN-ConvLSTM (Convolutional Long Short-Term Memory) followed by a CTC (Connectionist Temporal Classification) function. First, the essential features of the text-line input image are extracted using Convolutional Neural Networks (CNN). The extracted features and the text line's transcription are then passed to the ConvLSTM to learn a mapping between them. Finally, the CTC function automatically aligns the text-line images with their transcriptions. We tested the proposed model on a complex dataset known as KFUPM Handwritten Text (KHATT \cite{khatt}). It consists of patterns of text-lines. The experimental results show the apparent efficiency of this combination, where we ended up with a success rate of 91.7%.
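As a rough illustration of the final CTC step, the sketch below applies the standard CTC best-path rule (merge repeated labels, then drop blanks) to a sequence of per-frame labels, and then reads off which frames each word covers. It is not the authors' implementation: the blank symbol, the use of a space label as the word separator, the Latin placeholder characters, and the function names `ctc_greedy_collapse` and `word_frame_spans` are all assumptions made for this example.

```python
from itertools import groupby

BLANK = "-"  # assumed blank symbol (hypothetical choice for this sketch)
SPACE = " "  # assumed word-separator label (hypothetical choice)


def ctc_greedy_collapse(frame_labels):
    """Merge consecutive repeated labels, then drop blanks (CTC best-path rule)."""
    return "".join(label for label, _ in groupby(frame_labels) if label != BLANK)


def word_frame_spans(frame_labels):
    """Return (word, first_frame, last_frame) for each word in the label sequence."""
    spans = []
    chars, start, end = [], None, None
    prev = None
    for t, label in enumerate(frame_labels):
        if label == SPACE:
            # A separator closes the word collected so far, if any.
            if chars:
                spans.append(("".join(chars), start, end))
            chars, start, end = [], None, None
        elif label != BLANK:
            # A label repeated on adjacent frames counts as one character.
            if label != prev:
                chars.append(label)
            if start is None:
                start = t
            end = t
        prev = label
    if chars:
        spans.append(("".join(chars), start, end))
    return spans


# Per-frame labels as a recognizer might emit them (made-up placeholder data).
frames = list("-kk-ii-t-  --a-bb-")
print(ctc_greedy_collapse(frames))
print(word_frame_spans(frames))
```

In a setup like this, the frame indices of each span could be mapped back to horizontal positions in the text-line image to crop individual words. Whether the paper derives word boundaries this way is not stated in the abstract.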
Similar resources
Word Extraction and Recognition in Arabic Handwritten Text
Segmenting Arabic manuscripts into text-lines and words is an important step to make recognition systems more efficient and accurate. The major problem making this task crucial is the word extraction process: first, words are often a succession of sub-words where the space value between these sub-words does not respect any rules. Second, the presence of connections even between non-adjacent sub-w...
Arabic word descriptor for handwritten word indexing and lexicon reduction
Word recognition systems use a lexicon to guide the recognition process in order to improve the recognition rate. However, as the lexicon grows, the computation time increases. In this paper, we present the Arabic word descriptor (AWD) for Arabic word shape indexing and lexicon reduction in handwritten documents. It is formed in two stages. First, the structural descriptor (SD) is computed for ...
Arabic Handwritten Word Recognition based on Bernoulli Mixture HMMs
This thesis presents new approaches in off-line Arabic Handwriting Recognition based on conventional Bernoulli Hidden Markov models. Until now, off-line handwriting recognition, and Arabic handwriting recognition in particular, is still far from being perfect. Hidden Markov Models (HMMs) are now widely used for off-line handwriting recognition in many languages and, in particular, in A...
Word-Based Handwritten Arabic Scripts Recognition Using Dynamic Bayesian Network
In this paper, a multi-class classification system for handwritten Arabic words using a Dynamic Bayesian Network (DBN) is proposed, and its technical details are presented in three stages: preprocessing, feature extraction, and classification. First, words are segmented from the input scripts and normalized in size. Then, features are extracted from each normalized word, where...
Arabic handwritten word recognition based on dynamic bayesian network
Distinguishing an Arabic handwritten text is a hard task because the Arabic word is morphologically complex and the writing style from one model is highly variable, like the recognition of words representing the names of Tunisian cities. Actually, this is the first work based on the Dynamic Hierarchical Bayesian Network (DHBN). Its objective is to get the best model by learning the structure an...
Journal
Journal title: Electronic Letters on Computer Vision and Image Analysis
Year: 2022
ISSN: 1577-5097
DOI: https://doi.org/10.5565/rev/elcvia.1433