sequence recognition

Optical Music Recognition with Convolutional Sequence-to-Sequence Models

2017

Eelco van der Wel Karen Ullrich

Optical Music Recognition (OMR) is an important technology within Music Information Retrieval. Deep learning models show promising results on OMR tasks, but symbol-level annotated data sets of sufficient size to train such models are not available and difficult to develop. We present a deep learning architecture called a Convolutional Sequence-to-Sequence model to both move towards an end-to-en...

Sequence to sequence learning for unconstrained scene text recognition

Journal: CoRR 2016

Ahmed Mamdouh A. Hassanien

In this work we present a state-of-the-art approach for unconstrained natural scene text recognition. We propose a cascade approach that incorporates a convolutional neural network (CNN) architecture followed by a long short-term memory (LSTM) model. The CNN learns visual features for the characters and uses them with a softmax layer to detect sequences of characters. While the CNN gives very go...

An online sequence-to-sequence model for noisy speech recognition

Journal: CoRR 2017

Chung-Cheng Chiu Dieterich Lawson Yuping Luo George Tucker Kevin Swersky Ilya Sutskever Navdeep Jaitly

Generative models have long been the dominant approach for speech recognition. The success of these models, however, relies on the use of sophisticated recipes and complicated machinery that is not easily accessible to non-practitioners. Recent innovations in deep learning have given rise to an alternative: discriminative models called Sequence-to-Sequence models that can almost match the accur...

Sequence-specific endonuclease BamHI: relaxation of sequence recognition.

Journal: Proceedings of the National Academy of Sciences of the United States of America 1982

J George J G Chirikjian

The effect of glycerol on the specificity of DNA cleavage by the restriction endonuclease BamHI has been examined. In addition to the canonical G↓G-A-T-C-C site, BamHI cuts DNA at several sites that we have named noncanonical BamHI.1 sites. The number of BamHI.1 sites in simian virus 40 and pBR322 was determined to be 13 for each DNA. Cutting sites determined by DNA sequence anal...

A Comparison of Sequence-to-Sequence Models for Speech Recognition

2017

Rohit Prabhavalkar Kanishka Rao Tara N. Sainath Bo Li Leif Johnson Navdeep Jaitly

In this work, we conduct a detailed evaluation of various all-neural, end-to-end trained, sequence-to-sequence models applied to the task of speech recognition. Notably, each of these systems directly predicts graphemes in the written domain, without using an external pronunciation lexicon or a separate language model. We examine several sequence-to-sequence models including connectionist tempo...

In silico screening of G-Quadruplex Structures in Wilms tumor 1 Gene Promoter

Journal: Journal of North Khorasan University of Medical Sciences 2019

Ghazaey Zidanloo, Saeedeh; Jafarzadeh Hesari, Mahsa

Introduction: X-ray diffraction studies have revealed that guanines in a DNA strand may be arranged in quartets and form a structure called a G-quadruplex. Bioinformatics studies have suggested the formation of G-quadruplex structures in crucial human genes, including Wilms tumor 1 (WT1). The aim of this study was an in silico analysis of the guanine-rich sequence in the promoter region of the WT1 gene...

Probabilistic sequence models for image sequence processing and recognition

2012

Philippe Dreuw

This PhD thesis investigates three image sequence labeling problems: optical character recognition (OCR), object tracking, and automatic sign language recognition (ASLR). To address them, we investigate which concepts and ideas can be adopted from speech recognition. For each of these tasks we propose an approach that is centered around the approaches known from speech r...

Sequence to Sequence Learning for Optical Character Recognition

Journal: CoRR 2015

Devendra K. Sahu Mohak Sukhwani

We propose an end-to-end recurrent encoder-decoder based sequence learning approach for printed text Optical Character Recognition (OCR). In contrast to the existing state-of-the-art OCR solution [Graves et al. (2006)], which uses a CTC output layer, our approach makes minimal assumptions about the structure and length of the sequence. We use a two-step encoder-decoder approach: (a) A recur...

Automatic spoken language identification using statistical methods

Thesis: Ministry of Science, Research and Technology - Amirkabir University of Technology (Tehran Polytechnic) - Faculty of Electrical Engineering 1387

Ali Ziaei, Mohammad Ahadi

Language identification systems are of two kinds: systems that use high-level language information, such as phonemes and words, to identify the language, and systems that use low-level language information, such as sub-phonemic units or spectral features of speech. The problem with high-accuracy systems, such as phoneme-based language identification systems that require phoneme extraction, is that they need phonetically transcribed data, and given this...

Streaming End-to-End Multi-Talker Speech Recognition

Journal: IEEE Signal Processing Letters 2021

End-to-end multi-talker speech recognition is an emerging research trend in the community due to its vast potential applications such as conversation and meeting transcription. To the best of our knowledge, all existing work is constrained to the offline scenario. In this work, we propose the Streaming Unmixing and Recognition Transducer (SURT) for end-to-end recognition. Our model employs Recurrent Neural Net...
