Similar resources
Retrieval of historical documents by word spotting
The implementation of word spotting is not an easy procedure and it gets even worse in the case of historical documents since it requires character recognition and indexing of the document images. A general technique for word spotting is presented, independent of OCR, using automatic representation of the text queries of the user by word images and comparing them with the word images extracted ...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting is to make searchable unindexed image documents by locating word/words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
On the Influence of Word Representations for Handwritten Word Spotting in Historical Documents
Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models, namely...
Features for Word Spotting in Historical Manuscripts
For the transition from traditional to digital libraries, the large number of handwritten manuscripts that exist pose a great challenge. Easy access to such collections requires an index, which is currently created manually at great cost. Because automatic handwriting recognizers fail on historical manuscripts, the word spotting technique has been developed: the words in a collection are matche...
Script Independent Word Spotting in Multilingual Documents
This paper describes a method for script independent word spotting in multilingual handwritten and machine printed documents. The system accepts a query in the form of text from the user and returns a ranked list of word images from document image corpus based on similarity with the query word. The system is divided into two main components. The first component known as Indexer, performs indexi...
Journal
Journal title: International Journal of Document Analysis and Recognition (IJDAR)
Year: 2006
ISSN: 1433-2833,1433-2825
DOI: 10.1007/s10032-006-0027-8