Efficient generation of super condensed neighborhoods

Authors

  • Luís M. S. Russo
  • Arlindo L. Oliveira
Abstract

Indexing methods for the approximate string matching problem spend considerable effort generating condensed neighborhoods. Condensed neighborhoods, however, are not a minimal representation of a pattern neighborhood. Super condensed neighborhoods, proposed in this work, are smaller and provably minimal, and can be used to locate approximate matches that can later be extended by on-line search. We present an algorithm for generating super condensed neighborhoods. The algorithm can be implemented using either dynamic programming or nondeterministic automata. The complexity is O(ms) in the first case and O(kms) in the second, where m is the pattern size, s is the size of the super condensed neighborhood and k is the number of errors. Previous algorithms depended on the size of the condensed neighborhood instead. These algorithms can be implemented using Bit-Parallelism and Increased Bit-Parallelism techniques. Our experimental results show that the resulting algorithms are fast and achieve significant speedups compared with existing proposals that use condensed neighborhoods.
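
To make the three sets concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm from the paper, whose details are not given in the abstract. It assumes the usual definitions: the k-neighborhood of a pattern is the set of words within edit distance k of it, the condensed neighborhood keeps only the words with no proper prefix in the neighborhood, and the super condensed neighborhood keeps only the words with no proper substring in the neighborhood. The function names and the toy alphabet are illustrative assumptions.

```python
from itertools import product


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def neighborhood(pattern: str, k: int, alphabet: str) -> set[str]:
    """All words over `alphabet` within edit distance k of `pattern`.

    Exhaustive enumeration: exponential, suitable for toy inputs only.
    """
    m = len(pattern)
    words = set()
    for n in range(max(0, m - k), m + k + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if edit_distance(pattern, w) <= k:
                words.add(w)
    return words


def condensed(u: set[str]) -> set[str]:
    """Words of u that have no proper prefix in u (assumed definition)."""
    return {w for w in u if not any(w[:i] in u for i in range(len(w)))}


def super_condensed(u: set[str]) -> set[str]:
    """Words of u that have no proper substring in u (assumed definition)."""
    def proper_substrings(w: str) -> set[str]:
        n = len(w)
        return {w[i:j] for i in range(n + 1) for j in range(i, n + 1)
                if j - i < n}
    return {w for w in u if not any(s in u for s in proper_substrings(w))}


if __name__ == "__main__":
    u = neighborhood("abc", 1, "abc")
    print(sorted(condensed(u)))
    print(sorted(super_condensed(u)))
```

Under these assumptions, for the pattern "abc" with k = 1 over the alphabet {a, b, c}, the condensed neighborhood contains nine words while the super condensed neighborhood contains only "ab", "ac" and "bc". This is the kind of size gap the abstract refers to when it says the new algorithm depends on s rather than on the size of the condensed neighborhood.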

Similar resources

An Efficient Algorithm for Generating Super Condensed Neighborhoods

Indexing methods for the approximate string matching problem spend a considerable effort generating condensed neighborhoods. Here, we point out that condensed neighborhoods are not a minimal representation of a pattern neighborhood. We show that we can restrict our attention to super condensed neighborhoods which are minimal. We then present an algorithm for generating Super Condensed Neighborh...

Faster Generation of Super Condensed Neighbourhoods Using Finite Automata

We present a new algorithm for generating super condensed neighbourhoods. Super condensed neighbourhoods have recently been presented as the minimal set of words that represent a pattern neighbourhood. These sets play an important role in the generation phase of hybrid algorithms for indexed approximate string matching. An existing algorithm for this purpose is based on a dynamic programming ap...

Image Transformer

Image generation has been successfully cast as an autoregressive sequence generation or transformation problem. Recent work has shown that self-attention is an effective way of modeling textual sequences. In this work, we generalize a recently proposed model architecture based on self-attention, the Transformer, to a sequence modeling formulation of image generation with a tractable likelihood....

Extending the Super Efficiency Method to Rank the Non-Extreme Efficient Units

This article will address the extension of super efficiency method to rank the non-extreme efficient decision making units. Many methodologies have introduced methods that can rank efficient units, amongst which, the super efficiency method due to its ability to provide meaningful geometrical as well as economic analyses has a significant place. But the common problem with all the super efficie...

A generalized super-efficiency model for ranking extreme efficient DMUs in stochastic DEA

In this current study a generalized super-efficiency model is first proposed for ranking extreme efficient decision making units (DMUs) in stochastic data envelopment analysis (DEA) and then, a deterministic (crisp) equivalent form of the stochastic generalized super-efficiency model is presented. It is shown that this deterministic model can be converted to a quadratic programming model. So fa...

Journal:
  • J. Discrete Algorithms

Volume 5

Publication date: 2007