South Indian Tamil Language Handwritten Document Text Line Segmentation Technique with Aid of Sliding Window and Skewing Operations

Author

  • SUNANDA DIXIT
Abstract

Text line segmentation is one of the key components of document image analysis. It supplies essential information for skew correction, zone segmentation, and character recognition. Segmenting printed document images into text lines has received numerous contributions from researchers, yet there is still considerable scope for improvement. Handwritten documents pose further challenges because of the writer's hand movement, variable inter-line spacing, and inconsistent distances between components. Text lines may be straight, made up of straight segments, or curved. Few models exist for handwritten segmentation, and very few papers propose a text line skew segmentation model, which motivates work on handwritten South Indian languages. This paper therefore proposes an improved text line segmentation technique for the South Indian Tamil language. Processing Tamil is difficult because Tamil letters have intricate shapes, which makes it harder to separate touching lines and letters in Tamil document images. The proposed method addresses these challenges and improves on existing text line segmentation methods by using two major techniques: sliding window and adaptive histogram equalization. The technique first preprocesses the document image and then passes the preprocessed image to adaptive histogram equalization, which enhances the text characters so that they can be seen more clearly. The text lines of the enhanced image are then segmented with the sliding window operation. For accurate line segmentation, a skewing operation is performed on the line-segmented result images. The implementation results show the effectiveness of the proposed technique in segmenting handwritten text lines from the input document.
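The abstract does not give the details of the sliding window operation, so the following is only a minimal sketch of one common way such a step can work: count ink pixels per row of a binarized image, average those counts over a window that slides down the page, and treat each run of rows above a threshold as one text line. The function names, the `window` and `threshold` parameters, and the list-of-lists image format are assumptions made for illustration, not the paper's method.

```python
def row_ink_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def smooth(profile, window):
    """Average the profile over a window of `window` rows centred on each row."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo = max(0, i - half)
        hi = min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


def segment_lines(image, window=3, threshold=0.5):
    """Return (start_row, end_row) pairs, end exclusive, one per text line."""
    smoothed = smooth(row_ink_profile(image), window)
    lines, start = [], None
    for y, value in enumerate(smoothed):
        if value > threshold and start is None:
            start = y                      # entering a band of ink
        elif value <= threshold and start is not None:
            lines.append((start, y))       # leaving the band
            start = None
    if start is not None:                  # band runs to the bottom edge
        lines.append((start, len(smoothed)))
    return lines


# Toy page: 12 rows, 10 columns, ink in rows 1-3 and rows 7-9.
page = [[1] * 10 if y in (1, 2, 3, 7, 8, 9) else [0] * 10 for y in range(12)]
print(segment_lines(page, window=1))
```

A wider window bridges small gaps inside a line at the cost of widening each detected band by about half the window, so in this sketch the window size trades robustness to noise against the ability to separate closely spaced or touching lines.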
The performance of the proposed technique is evaluated by comparing its results with those of a conventional text line segmentation technique. The results show that the proposed technique achieves higher text line segmentation DR, RA, and F-measure values than the conventional technique across the test documents.
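The abstract does not define DR and RA. Assuming they carry the meaning common in text line segmentation evaluation (detection rate and recognition accuracy, both based on the number of one-to-one matches between detected and ground-truth lines), the three scores could be computed as in this sketch. The function name and its parameters are hypothetical.

```python
def segmentation_scores(one_to_one, ground_truth_lines, detected_lines):
    """Return (DR, RA, F-measure) under the assumed definitions.

    DR = one-to-one matches / number of ground-truth lines
    RA = one-to-one matches / number of detected lines
    F-measure = harmonic mean of DR and RA
    """
    dr = one_to_one / ground_truth_lines if ground_truth_lines else 0.0
    ra = one_to_one / detected_lines if detected_lines else 0.0
    fm = 2 * dr * ra / (dr + ra) if (dr + ra) else 0.0
    return dr, ra, fm


# Example: 10 ground-truth lines, 8 detected lines, all 8 matched one-to-one.
dr, ra, fm = segmentation_scores(8, 10, 8)
print(f"DR={dr:.3f} RA={ra:.3f} F-measure={fm:.3f}")
```

Under these definitions DR penalizes missed lines and RA penalizes spurious detections, so the F-measure is high only when both are high.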


Similar resources

An Improved Handwritten Tamil Character Recognition System using Octal Graph

Problem Statement: Handwriting recognition has attracted voluminous research in recent times. The segmentation and recognition of characters from handwritten scripts incurs considerable overhead. Almost all existing handwritten character recognition techniques use a neural network approach, which requires a lot of preprocessing and hence accomplishing these problems using neural netwo...


Performance of Statistics Based Line Segmentation System for Unconstrained Handwritten Text

Handwritten character recognition is a technique by which a computer system could recognize characters and other symbols written in natural handwriting. Segmentation decomposes the document image into subcomponents like lines, words and characters. To achieve greater accuracy, segmentation and recognition could not be treated independently. Most of the existing line segmentation methods have li...


Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...


A new scheme for unconstrained handwritten text-line segmentation

Variations in inter-line gaps and skewed or curled text-lines are some of the challenging issues in segmentation of handwritten text-lines. Moreover, overlapping and touching text-lines that frequently appear in unconstrained handwritten text documents significantly increase segmentation complexities. In this paper, we propose a novel approach for unconstrained handwritten text-line segmentatio...


Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...



Publication date: 2013