Word Alignment via Quadratic Assignment
نویسندگان
چکیده
Recently, discriminative word alignment methods have achieved state-of-the-art accuracies by extending the range of information sources that can be easily incorporated into aligners. The chief advantage of a discriminative framework is the ability to score alignments based on arbitrary features of the matching word tokens, including orthographic form, predictions of other models, lexical context and so on. However, the proposed bipartite matching model of Taskar et al. (2005), despite being tractable and effective, has two important limitations. First, it is limited by the restriction that words have fertility of at most one. More importantly, first order correlations between consecutive words cannot be directly captured by the model. In this work, we address these limitations by enriching the model form. We give estimation and inference algorithms for these enhancements. Our best model achieves a relative AER reduction of 25% over the basic matching formulation, outperforming intersected IBM Model 4 without using any overly compute-intensive features. By including predictions of other models as features, we achieve AER of 3.8 on the standard Hansards dataset.
منابع مشابه
An Optimal Quadratic Approach to Monolingual Paraphrase Alignment
We model the problem of monolingual textual alignment as a Quadratic Assignment Problem (QAP) which simultaneously maximizes the global lexicosemantic and syntactic similarities of two sentence-level texts. Because QAP is an NP-complete problem, we propose a branch-and-bound approach to efficiently find an optimal solution. When compared with other methods and studies, our results are competitive.
متن کاملInstance-Based Parameter Tuning via Search Trajectory Similarity Clustering
This paper is concerned with automated tuning of parameters in local-search based meta-heuristics. Several generic approaches have been introduced in the literature that returns a ”one-size-fits-all” parameter configuration for all instances. This is unsatisfactory since different instances may require the algorithm to use very different parameter configurations in order to find good solutions....
متن کاملSpectral Alignment of Networks
Network alignment refers to the problem of finding a bijective mapping across vertices of two or more graphs to maximize the number of overlapping edges and/or to minimize the number of mismatched interactions across networks. This paper introduces a network alignment algorithm inspired by eigenvector analysis which creates a simple relaxation for the underlying quadratic assignment problem. Ou...
متن کاملPermA and Balloon: Tools for string alignment and text processing
Two online research tools are presented in this paper: PermA, a general-purpose string aligner which can for example be used for grapheme-to-phoneme and phonemeto-phoneme alignment, and Balloon, a text processing toolkit for German and English providing components for part-of-speech tagging, morphological analyses, and grapheme-to-phoneme conversion including syllabification and word-stress ass...
متن کاملA Honey Bee Algorithm To Solve Quadratic Assignment Problem
Assigning facilities to locations is one of the important problems, which significantly is influence in transportation cost reduction. In this study, we solve quadratic assignment problem (QAP), using a meta-heuristic algorithm with deterministic tasks and equality in facilities and location number. It should be noted that any facility must be assign to only one location. In this paper, first o...
متن کامل