Synchronization from Insertions and Deletions
Authors
Abstract
We study the problem of synchronizing two files X and Y at two distant nodes A and B that are connected through a two-way communication channel. We assume that file Y at node B is obtained from file X at node A by inserting and deleting a small fraction of the symbols in X. More specifically, we consider the case where X is a non-binary, non-uniform string, and deletions and insertions happen uniformly with rates $\beta_d$ and $\beta_i$, respectively. We propose a synchronization protocol between node A and node B that needs to transmit $O\left(C_X(\beta_d+\beta_i)\, n \log\frac{1}{\beta_d+\beta_i}\right)$ bits (where $n$ is the length of X and $C_X$ is a constant that depends on the statistical properties of X) and reconstructs X at node B with an error probability that is exponentially small in $n$. This protocol readily generalizes the recent result by Tabatabaei Yazdi and Dolecek, which addressed synchronization for a binary uniform source under deletion errors only.
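To make the setting concrete, the following is a minimal sketch of the edit model and of the order term in the communication bound. It is not the protocol from the paper. The function names `edit_channel` and `bound_order`, the four-letter alphabet, the source weights, and the choice to insert a uniformly random symbol before each position are all assumptions made for illustration. The constant $C_X$ is not given in the abstract, so it is left out, and the logarithm is taken base 2 on the assumption that the bound counts bits.

```python
import math
import random


def edit_channel(x, beta_d, beta_i, alphabet, rng):
    """Pass x through an i.i.d. deletion/insertion channel (assumed model)."""
    y = []
    for symbol in x:
        # With probability beta_i, insert a uniformly random symbol first.
        if rng.random() < beta_i:
            y.append(rng.choice(alphabet))
        # With probability beta_d, drop the symbol; otherwise keep it.
        if rng.random() >= beta_d:
            y.append(symbol)
    return y


def bound_order(n, beta_d, beta_i):
    """Evaluate (beta_d + beta_i) * n * log2(1 / (beta_d + beta_i)).

    The source-dependent constant C_X is omitted because the abstract
    does not specify it.
    """
    beta = beta_d + beta_i
    return beta * n * math.log2(1.0 / beta)


rng = random.Random(0)
alphabet = "ACGT"               # hypothetical non-binary alphabet
weights = [0.4, 0.3, 0.2, 0.1]  # hypothetical non-uniform source statistics
n = 10_000

x = rng.choices(alphabet, weights=weights, k=n)
y = edit_channel(x, beta_d=0.01, beta_i=0.01, alphabet=alphabet, rng=rng)

print("length of X:", len(x))
print("length of Y:", len(y))
print("order term (C_X omitted):", round(bound_order(n, 0.01, 0.01)))
```

With $\beta_d = \beta_i = 0.01$ and $n = 10{,}000$, the order term is $200 \log_2 50$, roughly 1129, which is far smaller than the cost of retransmitting X in full. That gap is what makes interactive synchronization attractive when the edit rates are small.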
Similar resources
Bounds on the Optimal Rate for Synchronization from Deletions and Insertions
Consider two remotely located binary sources X and Y, where Y is mis-synchronized from X due to deletions and insertions. The distribution of X is known, and Y is obtained from X through a process of i.i.d. deletions and insertions. What is the minimum rate of information X needs to send in order to synchronize Y to X? This is a distributed source coding problem, so the optimal rate is the condi...
Codes for Data Synchronization and Timing
This paper proposes and analyzes data synchronization techniques that not only resynchronize after encoded bits are corrupted by insertions, deletions or substitution errors, but also produce estimates of the time indices of the decoded data.
Guess & Check Codes for Deletions, Insertions, and Synchronization
We consider the problem of constructing codes that can correct δ deletions occurring in an arbitrary binary string of length n bits. Varshamov-Tenengolts (VT) codes are zero-error single-deletion (δ = 1) correcting codes and have an asymptotically optimal redundancy. Finding similar codes for δ ≥ 2 deletions is an open problem. We propose a new family of codes that we call Guess & Check (GC) c...
String editing under a combination of constraints
Let X and Y be any two strings of finite lengths N and M, respectively, over a finite alphabet. An edit distance between X and Y is defined as the minimum sum of elementary edit distances associated with edit operations of substitutions, deletions, and insertions needed to transform X to Y. In this paper, the problem of efficient computation of such a distance is considered under the assumpti...
Synchronization Strings: List Decoding for Insertions and Deletions
We study codes that are list-decodable under insertions and deletions (“insdel codes”). Specifically, we consider the setting where, given a codeword x of length n over some finite alphabet Σ of size q, δ · n codeword symbols may be adversarially deleted and γ · n symbols may be adversarially inserted to yield a corrupted word w. A code is said to be list-decodable if there is an (efficient) al...