A Distribution Function Arising in Computational Biology

Authors

  • Craig A. Tracy
  • Harold Widom
Abstract

Karlin and Altschul, in their statistical analysis of multiple high-scoring segments in molecular sequences, introduced a distribution function that gives the probability that there are at least r distinct and consistently ordered segment pairs, all with score at least x. For long sequences this distribution can be expressed in terms of the distribution of the length of the longest increasing subsequence in a random permutation. Within the past few years, this last quantity has been extensively studied in the mathematics literature. The purpose of this note is to summarize these new mathematical developments in a form suitable for use in computational biology.

Dedicated to Barry McCoy on the occasion of his sixtieth birthday.

1 The Distribution Function

Karlin and Altschul [8], in their statistical analysis of multiple high-scoring segments in molecular sequences, introduced the following distribution function: let $F(r; y)$ denote the probability that there are at least $r$ distinct and consistently ordered segment pairs, all with score at least $x$. They further introduced a parameter $y = K N e^{-\lambda x}$, where $K$ and $\lambda$ are parameters related to the scoring system; see [8] for details. We use the parameter $y$ without further reference to $x$. For long sequences ($N \to \infty$) this distribution function is well approximated by [8] $F(r; y) = e^{-y} \sum^{\infty}$ ...
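As a small numerical illustration of the parameter y = K N e^(-λx) defined in the text, the following Python sketch evaluates it for given inputs. The function name and the numerical values of K, λ, N, and x are hypothetical placeholders chosen only for illustration; they are not taken from [8].

```python
import math


def karlin_altschul_y(K: float, lam: float, N: float, x: float) -> float:
    """Return y = K * N * exp(-lam * x).

    K and lam stand for the scoring-system parameters K and lambda,
    N for the sequence-length parameter, and x for the score threshold.
    """
    if K <= 0 or lam <= 0 or N <= 0:
        raise ValueError("K, lam and N are assumed to be positive")
    return K * N * math.exp(-lam * x)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    K, lam, N = 0.1, 0.3, 1.0e6
    for x in (30.0, 40.0, 50.0):
        print(f"x = {x:5.1f}  ->  y = {karlin_altschul_y(K, lam, N, x):.6g}")
```

Because y decreases exponentially in x, raising the score threshold by a fixed amount multiplies y by a fixed factor e^(-λ Δx), which is why the text can work with y alone.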

Similar resources

Using Linear Regularization to Predict Multi-Peak Distribution Functions in Heterogeneous Adsorbents

In the present article the energy distribution function of a heterogeneous solid is estimated. The energy distribution function is an important characterization of a heterogeneous adsorbent. The overall adsorption quantity for a heterogeneous solid is usually expressed by a Fredholm equation of the first kind, which contains the unknown distribution function and the local adsorption isotherm as a kernel. The calcu...

A Bayesian approach for image denoising in MRI

Magnetic Resonance Imaging (MRI) is a notable medical imaging technique based on Nuclear Magnetic Resonance (NMR). MRI is a safe imaging method with high contrast between soft tissues, which has made it the most popular imaging technique in clinical applications. An MR image's visual quality plays a vital role in medical diagnostics and can be severely corrupted by existing noise duri...

Higher Order Moments and Recurrence Relations of Order Statistics from the Exponentiated Gamma Distribution

Order statistics arising from the exponentiated gamma (EG) distribution are considered. Closed-form expressions for the single and double moments of order statistics are derived. Measures of skewness and kurtosis of the probability density function of the rth order statistic for different choices of r, n and θ are presented. Recurrence relations between single and double moments of r...

Bounds for CDFs of Order Statistics Arising from INID Random Variables

In recent decades, the study of order statistics arising from independent and not necessarily identically distributed (INID) random variables has been a main concern for researchers. Evaluating the cumulative distribution function (CDF) of these order statistics ($F_{i:n}$) is complex, time-consuming, and software-intensive. Therefore, obtaining approximations a...

On a Distribution Function Arising in Computational Biology

where $R_{k,r}$ is the number of permutations of the integers $\{1, \ldots, k\}$ that contain an increasing subsequence of length at least $r$. Let $X_y$ denote a positive integer valued random variable such that $\mathrm{Prob}(X_y \ge r) = F(r; y)$. If $\bar{R}_{k,r}$ denotes the complement of $R_{k,r}$, i.e. the number of permutations $\sigma \in S_k$ all of whose increasing subsequences have length strictly less than $r$, then clearly $\bar{R}_{k,r} =$ ...
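To make the definition of R_{k,r} concrete, here is a minimal brute-force Python sketch that counts the permutations of {1, ..., k} containing an increasing subsequence of length at least r, together with the count of permutations whose increasing subsequences all have length strictly less than r. The function names are hypothetical, and enumerating all k! permutations is an illustrative choice that is only practical for small k; it is not the method of the paper.

```python
from bisect import bisect_left
from itertools import permutations
from math import factorial


def lis_length(seq) -> int:
    """Length of the longest strictly increasing subsequence of seq."""
    tails = []  # tails[i] = smallest possible last element of an increasing subsequence of length i + 1
    for value in seq:
        i = bisect_left(tails, value)
        if i == len(tails):
            tails.append(value)
        else:
            tails[i] = value
    return len(tails)


def count_R(k: int, r: int) -> int:
    """Number of permutations of {1..k} with an increasing subsequence of length >= r."""
    return sum(1 for p in permutations(range(1, k + 1)) if lis_length(p) >= r)


def count_R_complement(k: int, r: int) -> int:
    """Number of permutations of {1..k} whose increasing subsequences all have length < r."""
    return sum(1 for p in permutations(range(1, k + 1)) if lis_length(p) < r)


if __name__ == "__main__":
    print(count_R(3, 2))  # 5: every permutation of {1, 2, 3} except 3 2 1
    for k in range(1, 7):
        row = [count_R(k, r) for r in range(1, k + 1)]
        print(k, factorial(k), row)
```

Every permutation falls into exactly one of the two counted classes, so for each k and r the two counts add up to k!.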

Publication date: 2000