Stroke Width-Based Contrast Feature for Document Image Binarization

نویسندگان

  • Le Thi Khue Van
  • Gueesang Lee
چکیده

Automatic segmentation of foreground text from the background in degraded document images is very much essential for the smooth reading of the document content and recognition tasks by machine. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of text. First, a pre-processing method is carried out for noise removal. Text boundary detection is then performed on the image constructed from the contrast feature. Then local estimation follows to extract text from the background. Finally, a refinement procedure is applied to the binarized image as a post-processing step to improve the quality of the final results. Experiments and comparisons of extracting text from degraded handwriting and machine-printed document image against some well-known binarization algorithms demonstrate the effectiveness of the proposed method. Keywords—Degraded Document Image, Binarization, Stroke Width, Contrast Feature, Text Boundary

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Adaptively Iterative Method of Document Image Binarization

Document image binarization is difficult when the image is affected by background noise or nonuniformly illuminated. In this paper, an adaptively method is proposed to address the above problems and thus get expected binarization results. This method begins by defining a filter window length with initialized stroke width, and then transforms the document image into a feature space with Gaussian...

متن کامل

Shape based local thresholding for binarization of document images

This paper presents a novel local threshold algorithm for the binarization of document images. Stroke width of handwritten and printed characters in documents is utilized as the shape feature. As a result, in addition to the intensity analysis, the proposed algorithm introduces the stroke width as shape information into local thresholding. Experimental results for both synthetic and practical d...

متن کامل

Directional Stroke Width Transform to Separate Text and Graphics in City Maps

One of the complex documents in the real world is city maps. In these kinds of maps, text labels overlap by graphics with having a variety of fonts and styles in different orientations. Usually, text and graphic colour is not predefined due to various map publishers. In most city maps, text and graphic lines form a single connected component. Moreover, the common regions of text and graphic lin...

متن کامل

Binarization of Document Image

Documents Image Binarization is performed in the preprocessing stage for document analysis and it aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for the ensuing document image processing tasks such as optical character recognition (OCR). Though document image binarization has been studied for many years, t...

متن کامل

Document Analysis And Classification Based On Passing Window

In this paper we present Document analysis and classification system to segment and classify contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • JIPS

دوره 10  شماره 

صفحات  -

تاریخ انتشار 2014