Levenshtein algorithm

Pruned Convolutional Codes and Viterbi Decoding Using the Levenshtein Distance Metric Applied to Asynchronous Noisy Channels

2007

L. Cheng, H. C. Ferreira

For a convolutional encoding and Viterbi decoding system, two insertion/deletion/substitution (IDS) error correcting techniques are presented in this paper. In the first means, by using the pruned convolutional codes, a rate compatible encoding system can adapt the transmission according to the state of the channel having IDS errors. In the second means, a convolutional encoded sequence is deco...

Adaptive String Distance Measures for Bilingual Dialect Lexicon Induction

2007

Yves Scherrer

This paper compares different measures of graphemic similarity applied to the task of bilingual lexicon induction between a Swiss German dialect and Standard German. The measures have been adapted to this particular language pair by training stochastic transducers with the Expectation-Maximisation algorithm or by using handmade transduction rules. These adaptive metrics show up to 11% F-measure ...

Lattice Codes for the Binary Deletion Channel

Journal: CoRR, 2014

Lin Sok, Patrick Solé, Aslan Tchamkerten

The construction of deletion codes for the Levenshtein metric is reduced to the construction of codes over the integers for the Manhattan metric by run-length coding. The latter codes are constructed by expurgation of translates of lattices. These lattices, in turn, are obtained from Construction A applied to binary codes and Z4-codes. A lower bound on the size of our codes for the Manhattan di...
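The reduction named in this abstract rests on two basic ingredients, run-length coding of a binary string and the Manhattan distance between integer vectors. The following is a minimal sketch of those two ingredients only, not the paper's lattice construction; the function names `run_lengths` and `manhattan` are illustrative assumptions.

```python
from itertools import groupby


def run_lengths(bits: str) -> list[int]:
    """Return the lengths of the maximal runs of equal symbols in `bits`."""
    return [len(list(group)) for _, group in groupby(bits)]


def manhattan(u: list[int], v: list[int]) -> int:
    """Manhattan (L1) distance between two integer vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of runs")
    return sum(abs(a - b) for a, b in zip(u, v))


# Deleting one symbol from a run of length > 1 shrinks that run by one,
# so the run-length vectors differ by 1 in the Manhattan metric.
original = run_lengths("0011101")  # runs: 00, 111, 0, 1
shortened = run_lengths("001101")  # one '1' deleted from the middle run
print(original, shortened, manhattan(original, shortened))
```

The sketch only compares strings with the same number of runs; a deletion that removes a run of length one merges its neighbours and changes the vector length, a case this example does not handle.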

Linking Task: Identifying Authors and Book Titles in Verbose Queries

2016

Anaïs Ollagnier, Sébastien Fournier, Patrice Bellot

In this paper, we present our contribution in INEX 2016 Social Book Search Track. This year, we participate in a new track called Mining track. This track focuses on detecting and linking book titles in online book discussion forums. We propose a supervised approach based on Support Vector Machine (SVM) classification process combined with Conditional Random Fields (CRF) to detect book titles. ...

Extracting Common Motifs under the Levenshtein Measure: Theory and Experimentation

2002

Ezekiel F. Adebiyi, Michael Kaufmann

Using our techniques for extracting approximate non-tandem repeats [1] on well-constructed maximal models, we derive an algorithm to find common motifs of length P that occur in N sequences with at most D differences under the edit distance metric. We compare the effectiveness of our algorithm with the more involved algorithm of Sagot [17] for edit distance on some real sequences. Her method has ...

Using Distributional Semantic Models and Levenshtein Distance Normalization

2014

Lisa Tengstrand, Beáta Megyesi, Martin Duneld

In the medical domain, especially in clinical texts, non-standard abbreviations are prevalent, which impairs readability for patients. To ease the understanding of the physicians’ notes, abbreviations need to be identified and expanded to their original forms. This thesis presents a distributional semantic approach to find candidates of the original form of the abbreviation, which is combined w...
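The abstract is cut off before it names the second component, but the title points to Levenshtein distance. As a hedged illustration of how candidate expansions could be ordered by edit distance, here is a minimal sketch of the standard dynamic-programming computation; the helper `rank_candidates` and the sample words are hypothetical and are not taken from the thesis.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def rank_candidates(abbreviation: str, candidates: list[str]) -> list[str]:
    """Hypothetical helper: order candidates by distance to the abbreviation."""
    return sorted(candidates, key=lambda c: (levenshtein(abbreviation, c), c))


print(levenshtein("kitten", "sitting"))
print(rank_candidates("pat", ["pathology", "paternal", "patient"]))
```

Ties are broken alphabetically here purely to make the ordering deterministic.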

Modeling and Solving Bi-Objective Flexible Job Shop Scheduling with Parallel Machines and Dual Human-Machine Resources

2022

This study examines the flexible job shop scheduling problem with parallel machines, taking into account a cleaner-production criterion, dual human-machine resources, job release times, and processing times that depend on machine speed. The objectives include minimizing the total tardiness and earliness penalties [...] so that completion [...] is reduced. Since noise pollution in the production environment leads to [...], and given that the cleaner-production approach is a preventive one, an attempt is made here to minimize...

Designing a Mathematical Model of a Collaborative Production System Based on Make to Order under Uncertainty

2022

In this paper, a mathematical model is designed for a collaborative make-to-order production system that observes fairness in the allocation of production loads. The main objectives of the model are minimizing total costs and maximizing resource utilization for the purpose of fair [...] under uncertainty. To control the uncertain parameters, a fuzzy programming method [...]. The results show that as the uncertainty rate increases, [...]. Since the capacity of the factories is fixed, [...] the amount of demand, each factory also...

Comparative Quality Estimation: Automatic Sentence-Level Ranking of Multiple Machine Translation Outputs

2012

Eleftherios Avramidis

A machine learning mechanism is learned from human annotations in order to perform preference ranking. The mechanism operates on a sentence level and ranks the alternative machine translations of each source sentence. Rankings are decomposed into pairwise comparisons so that binary classifiers can be trained using black-box features of automatic linguistic analysis. In order to re-compose the p...

Non-interactive OCR Post-correction for Giga-Scale Digitization Projects

2008

Martin Reynaert

This paper proposes a non-interactive system for reducing the level of OCR-induced typographical variation in large text collections, contemporary and historical. Text-Induced Corpus Clean-up or ticcl (pronounce ’tickle’) focuses on high-frequency words derived from the corpus to be cleaned and gathers all typographical variants for any particular focus word that lie within the predefined Leven...
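Gathering the variants of a focus word that lie within a fixed Levenshtein distance can be sketched with a bounded edit-distance check. This is a naive vocabulary scan under assumed names (`within_distance`, `gather_variants`) and made-up sample words; it is not how ticcl itself is implemented, which the truncated abstract does not describe.

```python
def within_distance(a: str, b: str, limit: int) -> bool:
    """True if the Levenshtein distance between a and b is at most `limit`."""
    if abs(len(a) - len(b)) > limit:
        return False  # the length difference alone already exceeds the limit
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,              # deletion
                               current[j - 1] + 1,           # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        if min(current) > limit:
            return False  # no alignment can get back under the limit
        previous = current
    return previous[-1] <= limit


def gather_variants(focus: str, vocabulary: list[str], limit: int = 2) -> list[str]:
    """Collect vocabulary words, other than `focus`, within `limit` edits of it."""
    return [w for w in vocabulary if w != focus and within_distance(focus, w, limit)]


words = ["government", "govemment", "governrnent", "goverment", "garden"]
print(gather_variants("government", words, limit=2))
```

The early exit is safe because the final distance can never be smaller than the minimum of any intermediate row. A scan like this is linear in the vocabulary size per focus word, so a corpus-scale system would need an indexing scheme instead.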
