document image analysis

Search results for: document image analysis

Number of results: 3209544

Document Image Matching Techniques

2004

Jonathan J. Hull, John Cullen, Mark Peairs

Several techniques for document image matching developed at the Ricoh California Research Center are presented. These methods are given a document image as input and locate visually similar or identical copies of the same image in a large database.

Document Image Applications

2007

Dan S. Bloomberg

The analysis of document images is a difficult and ill-defined task. Unlike the graphics operation of rendering a document into a pixmap, using a structured page-level description such as pdf, the analysis problem starts with the pixmap and attempts to generate a structured description. This description is hierarchical, and typically consists of two interleaved trees, one giving the physical la...

Generating paper texture of historical documents using statistical moments

2000

Carlos A. B. Mello, Rafael Dueire Lins

This paper presents a scheme for generating paper texture of historical documents. A new entropy based segmentation algorithm is used to decompose the image of documents into the image of the paper background and the printing of the document. Statistical analysis allows filling in the gaps from the printing, yielding a blank sheet of paper with similar texture to the original document.

Indus Image Segmentation Using Watershed and Histogram Projections

2017

Character segmentation is a major step in document image analysis and optical character recognition (OCR). It is needed to detect all the character regions in the document image. The proposed method preprocesses the document image with edge detection techniques to enhance the character edges. Further, the watershed algorithm is implemented to identify the regions of...
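
As a rough illustration of the histogram-projection idea named in the title above, the sketch below counts foreground pixels per column of a binary image and splits it at empty columns. It is a minimal sketch, not the authors' method: the edge-detection and watershed steps are omitted, and the function names, the 0/1 pixel convention, and the `min_count` parameter are assumptions made for this example.

```python
def column_projection(binary):
    """Count foreground (1) pixels in each column of a 0/1 image (list of rows)."""
    if not binary:
        return []
    return [sum(column) for column in zip(*binary)]


def split_on_gaps(profile, min_count=1):
    """Return (start, end) column ranges, end exclusive, where profile >= min_count."""
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_count and start is None:
            start = x  # a run of foreground columns begins
        elif count < min_count and start is not None:
            segments.append((start, x))  # the run ended at an empty column
            start = None
    if start is not None:
        segments.append((start, len(profile)))  # run reaches the right edge
    return segments


# Toy image with three foreground runs separated by empty columns (hypothetical data).
image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
profile = column_projection(image)
segments = split_on_gaps(profile)
```

Touching or overlapping characters leave no empty column between them, which is one reason a projection profile is usually combined with another region-finding step.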

Multi-component Document Image Coding Using Regions-of-Interest

2004

Xiao Wei Yin, Andy C. Downton, Martin Fleury, Jingyu He

Region-of-Interest (ROI) techniques are often utilized in natural still-image coding standards such as JPEG2000 [1]. In contrast, document image coding typically adopts multi-layer methods [2], using a carefully selected algorithm for each layer to optimize overall performance. In this paper, an ROI-based method is proposed for multi-component document image coding, where rectangular textual ROI...

Document image content inventories

2007

Henry S. Baird, Michael A. Moll, Chang An, Matthew R. Casey

We report an investigation into strategies, algorithms, and software tools for document image content extraction and inventory, that is, the location and measurement of regions containing handwriting, machine-printed text, photographs, blank space, etc. We have developed automatically trainable methods, adaptable to many kinds of documents represented as bilevel, greylevel, or color images, tha...

Degraded document image enhancement

2007

Gady Agam, G. Bal, Gideon Frieder, Ophir Frieder

Poor quality documents are obtained in various situations such as historical document collections, legal archives, security investigations, and documents found in clandestine locations. Such documents are often scanned for automated analysis, further processing, and archiving. Due to the nature of such documents, degraded document images are often hard to read, have low contrast, and are corrup...

Document Image Binarization

2013

Konstantinos Ntirogiannis

A principal stage of the document image analysis procedure is binarization, in which the pixels are classified into text and background. It is a crucial stage that can affect further stages including the final character recognition stage. This thesis is focused on document image binarization, including both binarization techniques and evaluation methodologies. Specifically, accordin...

Adaptive document image binarization

Journal: Pattern Recognition

2000

Jaakko J. Sauvola, Matti Pietikäinen

A new method is presented for adaptive document image binarization, where the page is considered as a collection of subcomponents such as text, background and picture. The problems caused by noise, illumination and many source type-related degradations are addressed. Two new algorithms are applied to determine a local threshold for each pixel. The performance evaluation of the algorithm utilize...
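
To make "a local threshold for each pixel" concrete, here is a minimal pure-Python sketch. The truncated abstract does not give a formula, so the sketch assumes the threshold commonly attributed to this paper, T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of a window around the pixel. The values k = 0.5 and R = 128, the window radius, and all function names are assumptions for illustration.

```python
import math


def sauvola_threshold(window, k=0.5, r=128.0):
    """Threshold from the mean and standard deviation of a window of grey values."""
    n = len(window)
    mean = sum(window) / n
    variance = sum((v - mean) ** 2 for v in window) / n
    std = math.sqrt(variance)
    return mean * (1.0 + k * (std / r - 1.0))


def binarize(gray, radius=1, k=0.5, r=128.0):
    """Label each pixel 1 (text, dark) or 0 (background) using its local threshold."""
    height, width = len(gray), len(gray[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            threshold = sauvola_threshold(window, k, r)
            row.append(1 if gray[y][x] <= threshold else 0)
        result.append(row)
    return result


# Toy 3x3 greyscale patch: bright paper with one dark pixel (hypothetical data).
patch = [
    [200, 200, 200],
    [200, 20, 200],
    [200, 200, 200],
]
labels = binarize(patch)
```

In a flat region the standard deviation is near zero, so the threshold drops well below the local mean and uniform paper stays background. Recomputing window statistics at every pixel is slow on real pages, where integral images are a common speed-up.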

Anveshak - A Groundtruth Generation Tool for Foreground Regions of Document Images

2016

Soumyadeep Dey, Jayanta Mukherjee, Shamik Sural, Amit Vijay Nandedkar

We propose a graphical user interface based groundtruth generation tool in this paper. Here, annotation of an input document image is done based on the foreground pixels. Foreground pixels are grouped together with user interaction to form labeling units. These units are then labeled by the user with the user defined labels. The output produced by the tool is an image with an XML file containin...
