Exact Distribution of a Spaced Seed Statistic for Applications in DNA Repeat Detection

Authors

  • Gary Benson
  • Denise Y.F. Mak
Abstract

Let a seed, S, be a string over the alphabet {1, ∗} which starts and ends with a 1, for example S = 11∗1. S occurs in a binary string B at position k if S can be positioned so that the last letter in S aligns with the kth letter in B, and each 1 in S aligns with a 1 in B. A 1 in B is covered by S if there exists some occurrence of S in B such that the 1 in B aligns with a 1 in that occurrence of S. We show how to compute the exact probability distribution for the number of 1s covered by a seed S in an i.i.d. Bernoulli string of length n with probability of 1 equal to p. We refer to the new probability distribution as CnSp, for covered, with S being the seed. When S consists entirely of 1s, for example S = 111, this reduces to the familiar Rnkp, the probability distribution for the number of 1s which occur in runs of length k or longer (k is the number of 1s in S). Importantly, our method is probability-independent: the calculation yields a formula in terms of the probability parameter p and does not require fixing the value of p in advance. The CnSp distribution has applications in the detection of approximate DNA repeats using spaced seeds.
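The definitions of occurrence and coverage can be made concrete with a small brute-force sketch. This is not the authors' method, which the abstract does not describe; it simply enumerates all 2^n binary strings, so it is only usable for small n. The function names `covered_ones`, `covered_counts` and `distribution` are hypothetical, and the ASCII `*` stands in for the wildcard ∗. Counting strings by (covered 1s, total 1s) keeps p symbolic until the final step, which mirrors the probability-independent idea in a naive way.

```python
from itertools import product
from collections import defaultdict


def covered_ones(seed: str, bits: tuple) -> int:
    """Count the 1s in `bits` covered by at least one occurrence of `seed`."""
    k = len(seed)
    # Offsets inside the seed that must align with a 1; any other
    # character (here '*') is treated as a wildcard.
    ones = [i for i, ch in enumerate(seed) if ch == "1"]
    covered = set()
    for start in range(len(bits) - k + 1):
        if all(bits[start + i] == 1 for i in ones):
            covered.update(start + i for i in ones)
    return len(covered)


def covered_counts(seed: str, n: int) -> dict:
    """Map (covered 1s, total 1s) -> number of length-n binary strings."""
    table = defaultdict(int)
    for bits in product((0, 1), repeat=n):
        table[(covered_ones(seed, bits), sum(bits))] += 1
    return dict(table)


def distribution(seed: str, n: int, p: float) -> dict:
    """Evaluate P(number of covered 1s = c) for a given p.

    Each string with j ones has probability p**j * (1 - p)**(n - j),
    so the counts from covered_counts() act as polynomial coefficients.
    """
    dist = defaultdict(float)
    for (c, j), count in covered_counts(seed, n).items():
        dist[c] += count * p**j * (1 - p) ** (n - j)
    return dict(dist)
```

For S = 11∗1 and n = 4, only the strings 1101 and 1111 contain an occurrence, and each has three covered 1s, so the probability of covering three 1s is p^3(1 − p) + p^4 = p^3. The table returned by `covered_counts` does not depend on p, so it can be computed once and reused for any value of p.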

Similar resources

Exact Distribution of a Spaced Seed Statistic for DNA Homology Detection

Let a seed, S, be a string from the alphabet {1, ∗}, of arbitrary length k, which starts and ends with a 1. For example, S = 11 ∗ 1. S occurs in a binary string T at position h if the length k substring of T ending at position h contains a 1 in every position where there is a 1 in S. We say that the 1s at the corresponding positions in T are covered. We are interested in calculating the probabi...

Accurate Inference for the Mean of the Poisson-Exponential Distribution

Although the random sum distribution has been well studied in probability theory, inference for the mean of such a distribution is very limited in the literature. In this paper, two approaches are proposed to obtain inference for the mean of the Poisson-Exponential distribution. Both proposed approaches require the log-likelihood function of the Poisson-Exponential distribution, but the exact for...

Track detection on the cells exposed to high LET heavy-ions by CR-39 plastic and terminal deoxynucleotidyl transferase (TdT)

Background: The lethal effect of ionizing radiation on cells depends on the Linear Energy Transfer (LET) level. The distribution of ionizing radiation is sparse and homogeneous for low-LET radiation such as X or γ, but it is dense and concentrated for high-LET radiation such as heavy-ion radiation. Material and Methods: Chinese hamster ovary cells (CHO-K1) were exposed to 4 Gy Fe-ion 2000 keV/...

An Analysis of the Repeated Financial Earthquakes

Since the seismic behavior of the earth’s energy (which follows a power-law distribution) can similarly be seen in the energy realized by the stock markets, in this paper we present a statistical study comparing financial crises and earthquakes. To this end, the TP statistic, proposed by Pisarenko et al. (2004), is employed for estimating the critical point or the lower...

The Lomax-Exponential Distribution, Some Properties and Applications

Abstract: The exponential distribution is a popular model in applications to real data. We propose a new extension of this distribution, called the Lomax-exponential distribution, which gives the model greater flexibility. There is also a simple relation between the Lomax-exponential distribution and the Lomax distribution. Results for moment, limit behavior, hazard function, Shannon entr...

Publication date: 2008