TextMatcher: Cross-Attentional Neural Network to Compare Image and Text
Authors
Abstract
We study a multimodal-learning problem where, given an image containing a single-line (printed or handwritten) text and a candidate text transcription, the goal is to assess whether the text represented in the image corresponds to the candidate text. This problem, which we dub text matching, is primarily motivated by a real industrial application scenario of automated cheque processing, whose goal is to automatically assess whether the information in a bank cheque (e.g., issue date) matches the data that have been entered by the customer while depositing the cheque at an automated teller machine (ATM). The problem finds more general application in several other scenarios too, e.g., personal-identity-document processing in user-registration procedures. We devise a machine-learning model specifically designed for the text-matching problem. The proposed model, termed TextMatcher, compares the two inputs by applying a novel cross-attention mechanism over the embedding representations of image and text, and it is trained in an end-to-end fashion on the desired distribution of errors to be detected. We demonstrate the effectiveness of TextMatcher on the automated-cheque-processing use case, where TextMatcher is shown to generalize well to future unseen dates, unlike existing models designed for related problems. We further assess the performance of TextMatcher on different distributions of errors on the public IAM dataset. Results attest that, compared to a naïve model and existing models for related problems, TextMatcher achieves higher performance on a variety of configurations.
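The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea of cross-attention between text and image embeddings, not the TextMatcher model itself. It assumes the text has already been embedded as one vector per character and the image as one vector per horizontal slice. The function names (`cross_attention`, `match_score`), the pooling step, and the output weights `w_out` are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys.

    Returns the attended vectors (one per query) and the attention weights.
    """
    d = len(keys[0])
    attended, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        attended.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return attended, weights


def match_score(text_emb, image_emb, w_out, bias=0.0):
    """Hypothetical matching head: text positions query the image slices,
    the attended image vectors are compared with the text vectors, and the
    mean-pooled result is mapped to a probability with a sigmoid."""
    attended, _ = cross_attention(text_emb, image_emb, image_emb)
    dim = len(text_emb[0])
    pooled = [
        sum(t[c] * a[c] for t, a in zip(text_emb, attended)) / len(text_emb)
        for c in range(dim)
    ]
    logit = sum(p * w for p, w in zip(pooled, w_out)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy inputs: 5 character embeddings and 7 image-slice embeddings, dimension 4.
# Random values stand in for the outputs of trained text and image encoders.
rng = random.Random(0)
text_emb = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
image_emb = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(7)]
w_out = [rng.uniform(-1, 1) for _ in range(4)]

score = match_score(text_emb, image_emb, w_out)
print(f"match probability (untrained, illustrative only): {score:.3f}")
```

In a trained system the encoders, the attention projections, and the output weights would all be learned jointly from matching and non-matching image-text pairs, which is what the abstract means by end-to-end training on the desired distribution of errors.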
Similar resources
Text Extraction and Recognition from Image using Neural Network
Extraction and recognition of text from images is an important step in building efficient indexing and retrieval systems for multimedia databases. Our primary objective is to make an unconstrained image indexing and retrieval system using a neural network. We adopt HSV-based approaches for color reduction. This approach shows impressive results. We extract a set of features from each ROI for that s...
Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches
We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set wit...
Text-Attentional Convolutional Neural Networks for Scene Text Detection
Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature computed globally from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this wor...
Text identification for document image analysis using a neural network
A new bottom-up method is described that clusters the content of a mixed type document into text or non-text areas. The proposed approach is based on a new set of features combined with a self-organized neural network classifier. The set of features corresponds to the contents and the relationship of 3 × 3 masks, is selected by using a statistical reduction procedure, and provides texture infor...
Monte Carlo Simulation to Compare Markovian and Neural Network Models for Reliability Assessment in Multiple AGV Manufacturing System
We compare two approaches for a Markovian model in flexible manufacturing systems (FMSs) using Monte Carlo simulation. The model, which is a development of Fazlollahtabar and Saidi-Mehrabad (2013), considers two features of automated flexible manufacturing systems equipped with automated guided vehicles (AGVs), namely, the reliability of machines and the reliability of AGVs in a multiple AGV jobsho...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2022
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-18840-4_25