New baseline correction algorithm for text-line recognition with bidirectional recurrent neural networks

Authors

  • Olivier Morillot
  • Laurence Likforman-Sulem
  • Emmanuèle Grosicki
Abstract

Many preprocessing techniques have been proposed for isolated word recognition. However, recent recognition systems deal with text blocks and their constituent text lines. In this paper, we propose a new preprocessing approach to efficiently correct baseline skew and fluctuations. Our approach is based on a sliding window within which the vertical position of the baseline is estimated. Segmentation of text lines into subparts is thus avoided. Experiments conducted on a large publicly available database (Rimes), with a BLSTM (bidirectional long short-term memory) recurrent neural network recognition system, show that our baseline correction approach substantially improves performance. © 2013 SPIE and IS&T [DOI: 10.1117/1.JEI.22.2.023028]
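The sliding-window idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the baseline is estimated inside each window, so the projection-profile heuristic below is an assumption for illustration, not the authors' method. The function names (`estimate_baseline`, `local_baselines`, `correct_baseline`) and the parameters `half_width` and `frac` are likewise hypothetical.

```python
from typing import List, Optional

# A binary text-line image: rows of 0/1 pixels, 1 = ink, row 0 = top.
Image = List[List[int]]


def estimate_baseline(image: Image, x0: int, x1: int, frac: float) -> Optional[int]:
    """Assumed heuristic: the lowest row whose ink count inside columns
    [x0, x1) reaches frac * (peak row count). Returns None for an empty window."""
    counts = [sum(row[x0:x1]) for row in image]
    peak = max(counts)
    if peak == 0:
        return None
    threshold = frac * peak
    for y in range(len(image) - 1, -1, -1):  # scan from the bottom row upward
        if counts[y] >= threshold:
            return y
    return None


def local_baselines(image: Image, half_width: int, frac: float) -> List[int]:
    """Slide a window centered on each column and estimate a baseline row there."""
    width = len(image[0])
    estimates = []
    for x in range(width):
        x0 = max(0, x - half_width)
        x1 = min(width, x + half_width + 1)
        estimates.append(estimate_baseline(image, x0, x1, frac))
    valid = [e for e in estimates if e is not None]
    if not valid:
        return [len(image) - 1] * width  # blank image: nothing to correct
    # Fill empty windows with the most recent valid estimate.
    last = valid[0]
    filled = []
    for e in estimates:
        if e is not None:
            last = e
        filled.append(last)
    return filled


def correct_baseline(image: Image, half_width: int = 2, frac: float = 0.5) -> Image:
    """Shift every column vertically so its local baseline lands on one common row.
    No segmentation of the line into subparts is needed."""
    height, width = len(image), len(image[0])
    baselines = local_baselines(image, half_width, frac)
    target = sorted(baselines)[width // 2]  # median row as the common baseline
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        shift = target - baselines[x]
        for y in range(height):
            ny = y + shift
            if image[y][x] and 0 <= ny < height:
                out[ny][x] = 1
    return out
```

Calling `correct_baseline(image, half_width=...)` on a skewed or wavy line returns an image of the same size in which each column has been moved up or down by its own offset. A larger `half_width` gives smoother baseline estimates at the cost of following fast fluctuations less closely.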

Similar resources

A Novel Approach to On-Line Handwriting Recognition Based on Bidirectional Long Short-Term Memory Networks

In this paper we introduce a new connectionist approach to on-line handwriting recognition and address in particular the problem of recognizing handwritten whiteboard notes. The approach uses a bidirectional recurrent neural network with long short-term memory blocks. We use a recently introduced objective function, known as Connectionist Temporal Classification (CTC), that directly trains the ...

Pattern Recognition in Control Chart Using Neural Network based on a New Statistical Feature

Today, to expedite the identification and timely correction of process deviations, advanced techniques are needed to minimize the cost of producing defective products. To this end, control charts, one of the important tools of statistical process control, have been used in combination with modern tools such as artificial neural networks. The artificial neural netw...

Bengali character recognition using Bidirectional Associative Memories (BAM) neural network

This paper presents the recognition of Bengali text using a BAM (Bidirectional Associative Memories) neural network, together with a proposed feature extraction procedure for Bengali characters. To do this, conventional methods are used for every step from text scanning through segmentation of a text line into single characters. In this paper an efficient procedure is proposed for boundary extraction, scaling of...

Using deep bidirectional recurrent neural networks for prosodic-target prediction in a unit-selection text-to-speech system

Deeply-stacked Bidirectional Recurrent Neural Networks (BiRNNs) are able to capture complex, short- and long-term, context dependencies between predictors and targets due to the non-linear dependency they introduce on the entire observation when predicting a target, thanks to the use of recurrent hidden layers that accumulate information from all preceding and future observations. This aspect of ...

Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition

BACKGROUND: Previous state-of-the-art systems on Drug Name Recognition (DNR) and Clinical Concept Extraction (CCE) have focused on a combination of text "feature engineering" and conventional machine learning algorithms such as conditional random fields and support vector machines. However, developing good features is inherently heavily time-consuming. Conversely, more modern machine learning ap...

Journal title:
  • J. Electronic Imaging

Volume 22  Issue

Pages  -

Publication date 2013