Search results for: document image analysis

Number of results: 3209544

2004
Jonathan J. Hull John Cullen Mark Peairs

Several techniques for document image matching developed at the Ricoh California Research Center are presented. These methods are given a document image as input and locate visually similar or identical copies of the same image in a large database.

2007
Dan S. Bloomberg

The analysis of document images is a difficult and ill-defined task. Unlike the graphics operation of rendering a document into a pixmap from a structured page-level description such as PDF, the analysis problem starts with the pixmap and attempts to generate a structured description. This description is hierarchical, and typically consists of two interleaved trees, one giving the physical la...

2000
Carlos A. B. Mello Rafael Dueire Lins

This paper presents a scheme for generating the paper texture of historical documents. A new entropy-based segmentation algorithm is used to decompose the document image into the paper background and the printed content. Statistical analysis allows filling in the gaps left by the printing, yielding a blank sheet of paper with a texture similar to that of the original document.

2017

Character segmentation is a major step of document image analysis and optical character recognition (OCR). It is necessary for detecting all the character regions in the document image. The proposed method preprocesses the document image with edge detection techniques to enhance the character edges. Further, the watershed algorithm is implemented to identify the regions of...

2004
Xiao Wei Yin Andy C. Downton Martin Fleury Jingyu He

Region-of-Interest (ROI) techniques are often utilized in natural still-image coding standards such as JPEG2000 [1]. In contrast, document image coding typically adopts multi-layer methods [2], using a carefully selected algorithm for each layer to optimize overall performance. In this paper, an ROI-based method is proposed for multi-component document image coding, where rectangular textual ROI...

2007
Henry S. Baird Michael A. Moll Chang An Matthew R. Casey

We report an investigation into strategies, algorithms, and software tools for document image content extraction and inventory, that is, the location and measurement of regions containing handwriting, machine-printed text, photographs, blank space, etc. We have developed automatically trainable methods, adaptable to many kinds of documents represented as bilevel, greylevel, or color images, tha...

2007
Gady Agam G. Bal Gideon Frieder Ophir Frieder

Poor quality documents are obtained in various situations such as historical document collections, legal archives, security investigations, and documents found in clandestine locations. Such documents are often scanned for automated analysis, further processing, and archiving. Due to the nature of such documents, degraded document images are often hard to read, have low contrast, and are corrup...

2013
Konstantinos Ntirogiannis

Binarization is a principal stage of the document image analysis procedure, in which pixels are classified into text and background. It is a crucial stage that can affect further stages, including the final character recognition stage. This thesis is focused on document image binarization, including both binarization techniques and evaluation methodologies. Specifically, accordin...

Journal: Pattern Recognition 2000
Jaakko J. Sauvola Matti Pietikäinen

A new method is presented for adaptive document image binarization, where the page is considered as a collection of subcomponents such as text, background and picture. The problems caused by noise, illumination and many source type-related degradations are addressed. Two new algorithms are applied to determine a local threshold for each pixel. The performance evaluation of the algorithm utilize...

2016
Soumyadeep Dey Jayanta Mukherjee Shamik Sural Amit Vijay Nandedkar

In this paper, we propose a ground-truth generation tool based on a graphical user interface. Annotation of an input document image is done based on the foreground pixels. Foreground pixels are grouped together with user interaction to form labeling units. These units are then labeled by the user with user-defined labels. The output produced by the tool is an image with an XML file containin...
