Color reduction for complex document images
Abstract
A new technique for color reduction of complex document images is presented in this article. It significantly reduces the number of colors in the document image (to fewer than 15 in most cases) so that characters are solid and local backgrounds are uniform. The technique can therefore be used as a preprocessing step by text information extraction applications. Specifically, using the edge map of the document image, a representative set of samples is chosen and used to construct a 3D color histogram. From these samples in the 3D color space, a relatively large number of colors (usually no more than 100) is obtained with a simple clustering procedure. The final colors are obtained by applying a mean-shift-based procedure. In addition, an edge-preserving smoothing filter is used as a preprocessing stage, which significantly enhances the quality of the initial image. Experimental results demonstrate the method's capability of producing correctly segmented complex color documents in which the character elements can be easily extracted as connected components. © 2009 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 19, 14–26, 2009; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ima.20174
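The stages named in the abstract (edge map, sample selection, 3D color histogram, simple clustering, mean shift) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the sampling rule (keep low-gradient pixels), the fixed-grid histogram clustering, the flat mean-shift kernel, and every function name and parameter value (`edge_thresh`, `levels`, `bandwidth`) are assumptions made for illustration, and the edge-preserving smoothing stage is omitted.

```python
import numpy as np

def edge_magnitude(rgb):
    """Gradient magnitude of the channel mean, used as a crude edge map."""
    gray = rgb.astype(float).mean(axis=2)
    gy, gx = np.gradient(gray)
    return np.hypot(gx, gy)

def sample_flat_pixels(rgb, edge_thresh=10.0):
    """Assumed sampling rule: keep pixels whose edge response is low."""
    mask = edge_magnitude(rgb) < edge_thresh
    return rgb[mask].astype(float)

def coarse_clusters(samples, levels=4):
    """Bin samples into a levels^3 RGB histogram; return the mean color
    and the sample count of every occupied bin."""
    step = 256 // levels
    idx = (samples // step).astype(int)
    keys = (idx[:, 0] * levels + idx[:, 1]) * levels + idx[:, 2]
    uniq, inv, counts = np.unique(keys, return_inverse=True, return_counts=True)
    sums = np.zeros((len(uniq), 3))
    np.add.at(sums, inv.ravel(), samples)  # accumulate colors per bin
    return sums / counts[:, None], counts

def mean_shift(points, weights, bandwidth=40.0, iters=50):
    """Weighted mean shift with a flat kernel; nearby modes are merged."""
    modes = points.copy()
    for _ in range(iters):
        dist = np.linalg.norm(modes[:, None, :] - points[None, :, :], axis=2)
        w = (dist <= bandwidth) * weights[None, :]
        total = w.sum(axis=1, keepdims=True)
        shifted = np.where(total > 0, (w @ points) / np.maximum(total, 1e-12), modes)
        converged = np.allclose(shifted, modes, atol=1e-3)
        modes = shifted
        if converged:
            break
    final = []
    for m in modes:
        # Keep a mode only if no already-kept mode lies close to it.
        if not any(np.linalg.norm(m - f) < bandwidth / 2 for f in final):
            final.append(m)
    return np.array(final)

def reduce_colors(rgb, edge_thresh=10.0, levels=4, bandwidth=40.0):
    """Return the recolored image and the final palette."""
    samples = sample_flat_pixels(rgb, edge_thresh)
    centers, counts = coarse_clusters(samples, levels)
    palette = mean_shift(centers, counts.astype(float), bandwidth)
    flat = rgb.reshape(-1, 3).astype(float)
    dist = np.linalg.norm(flat[:, None, :] - palette[None, :, :], axis=2)
    labels = dist.argmin(axis=1)  # nearest palette color per pixel
    out = palette[labels].round().astype(np.uint8).reshape(rgb.shape)
    return out, palette
```

Calling `reduce_colors(img)` on an `H x W x 3` `uint8` array returns the recolored image and the palette. The nearest-color step builds a pixels-by-palette distance matrix, so this sketch is only suitable for small images.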
Similar resources
Complex Background and Foreground Extraction in Color Document Images using Interval Type-2 Fuzzy
This paper deals with the problem of extracting text information from complex backgrounds in color document images. Developing a general framework for separating the foreground text from the background information in complex document images is still a challenging problem because of its high unpredictability and complexity. In this paper a new interval type-2 fuzzy based thresholding method is propos...
A maximal-information color to gray conversion method for document images: Toward an optimal grayscale representation for document image binarization
A novel method to convert color/multi-spectral images to gray-level images is introduced to increase the performance of document binarization methods. The method uses the distribution of the pixel data of the input document image in a color space to find a transformation, called the dual transform, which balances the amount of information on all color channels. Furthermore, in order to reduce t...
A qualitative study analyzing how health experts make use of medical images
Introduction: In the health sector, an image functions as a form of document that can convey a considerable amount of information. Employing this type of information can increase the effectiveness of medical experts' performance. This study aimed to survey how health experts use medical images in their practice. Methods: This applied qualitative study was carried out in 1392 (2013). The study p...
Text Extraction in Complex Color Document Images for Enhanced Readability
Often we encounter documents with text printed on complex color background. Readability of textual contents in such documents is very poor due to complexity of the background and mix up of color(s) of foreground text with colors of background. Automatic segmentation of foreground text in such document images is very much essential for smooth reading of the document contents either by human or b...
Face Detection with methods based on color by using Artificial Neural Network
Face detection methods are used to provide security. The problem with these methods is that faces cannot be easily categorized because of the great differences and variety among individuals' faces. In this paper, a face detection method based on skin color data is presented to overcome these problems. The researcher gathered a face database of 30 individuals consisting of ov...
Journal title:
- Int. J. Imaging Systems and Technology
Volume 19, Issue
Pages -
Publication date 2009