A CYK+ Variant for SCFG Decoding Without a Dot Chart

Author

  • Rico Sennrich
Abstract

While CYK+ and Earley-style variants are popular algorithms for decoding unbinarized SCFGs, in particular for syntax-based Statistical Machine Translation, the algorithms rely on a so-called dot chart, which suffers from high memory consumption. We propose a recursive variant of the CYK+ algorithm that eliminates the dot chart without incurring an increase in time complexity for SCFG decoding. In an evaluation on a string-to-tree SMT scenario, we empirically demonstrate substantial improvements in memory consumption and translation speed.
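
To make the idea of matching unbinarized rules without storing dotted items more concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it is a simplified recognizer for the monolingual (source) side of a grammar only, with no scores, no target side, and no claim about matching the paper's time complexity. The names `TrieNode`, `build_trie`, and `recognize` are hypothetical and chosen for illustration. Rules are stored in a trie over their right-hand sides, and partial matches live only on the recursion stack instead of in a dot chart.

```python
from collections import defaultdict


class TrieNode:
    """Node of a trie over rule right-hand sides (hypothetical helper)."""

    def __init__(self):
        self.children = {}  # symbol -> TrieNode
        self.lhs = set()    # left-hand sides of rules whose RHS ends here


def build_trie(rules):
    """Build a trie from (lhs, rhs) pairs, where rhs is a tuple of symbols."""
    root = TrieNode()
    for lhs, rhs in rules:
        node = root
        for sym in rhs:
            node = node.children.setdefault(sym, TrieNode())
        node.lhs.add(lhs)
    return root


def recognize(words, rules, nonterminals):
    """Fill a chart mapping (start, end) to the nonterminals covering that span.

    Assumption: no rule has an empty right-hand side.
    """
    n = len(words)
    root = build_trie(rules)
    chart = defaultdict(set)

    def match(node, pos, end, found):
        # A partial rule match is just (node, pos) on the call stack;
        # nothing is stored in a separate dot chart.
        if pos == end:
            found |= node.lhs
            return
        for sym, child in node.children.items():
            if sym in nonterminals:
                # Try every completed sub-span starting at pos.
                for mid in range(pos + 1, end + 1):
                    if sym in chart[(pos, mid)]:
                        match(child, mid, end, found)
            elif words[pos] == sym:
                match(child, pos + 1, end, found)

    # Shorter spans first, so every sub-span is complete when it is read.
    for length in range(1, n + 1):
        for start in range(n - length + 1):
            end = start + length
            while True:  # repeat until stable so unary rules are handled
                found = set()
                match(root, start, end, found)
                if found <= chart[(start, end)]:
                    break
                chart[(start, end)] |= found
    return chart


rules = [
    ("S", ("NP", "saw", "NP")),  # unbinarized rule mixing terminals and nonterminals
    ("NP", ("the", "N")),
    ("N", ("dog",)),
    ("N", ("cat",)),
]
chart = recognize("the dog saw the cat".split(), rules, {"S", "NP", "N"})
print("S" in chart[(0, 5)])
```

The sketch trades speed for brevity: it restarts matching from the trie root for every span, whereas a practical decoder would share work across spans and apply pruning.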

Similar resources

Contrasting objective functions for CYK chart decoding

Context-free inference is a standard part of many NLP pipelines. Most approaches use a variant of the CYK dynamic programming algorithm to populate a chart structure with predicted nonterminals over each span. We can extract a parse tree from this chart in several ways. In this work, we compare two commonly used decoding approaches (Viterbi and max-rule) with a minimum-Bayes-risk (MBR) method w...

Beam-Width Prediction for Efficient Context-Free Parsing

Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log linear model, we learn the optimal beam-search pruning parameters for each CY...

An Efficient Shift-Reduce Decoding Algorithm for Phrased-Based Machine Translation

In statistical machine translation, decoding without any reordering constraint is an NP-hard problem. Inversion Transduction Grammars (ITGs) exploit linguistic structure and can balance the needed flexibility against complexity constraints well. Currently, translation models with ITG constraints usually employ the cube-time CYK algorithm. In this paper, we present a shift-reduce decoding algor...

Technical Report: An n-free-passes CYK algorithm for error-correction and the prediction of non-canonical base-pairs in RNA secondary structure

Background: The prediction of non-canonical base-pairs in RNA secondary structure prediction has become increasingly important with the advent of next-generation sequencing technologies, where sequencing errors can introduce artificial non-canonical base-pairs in RNA secondary structure. These base-pairs are not appropriately accounted for by the currently existing models. Results: Here we focu...

SCFG latent annotation for machine translation

We discuss learning latent annotations for synchronous context-free grammars (SCFG) for the purpose of improving machine translation. We show that learning annotations for nonterminals results in not only more accurate translation, but also faster SCFG decoding.

Publication date: 2014