A profile-based deterministic sequential Monte Carlo algorithm for motif discovery

Authors

  • Kuo-ching Liang
  • Xiaodong Wang
  • Dimitris Anastassiou
Abstract

MOTIVATION: Conserved motifs are often biologically significant, providing insight into gene transcription regulation, biomolecular secondary structure, the presence of non-coding RNAs, and evolutionary history. With the increasing volume of sequenced genomic data, faster and more accurate tools are needed to automate motif discovery.

RESULTS: We propose a deterministic sequential Monte Carlo (DSMC) motif discovery technique based on the position weight matrix (PWM) model to locate conserved motifs in a given set of nucleotide sequences, and we extend the model to search for instances of the motif with insertions/deletions. We show that the proposed method can align the motif when different instances contain insertions and deletions, which cannot be done satisfactorily using other multiple alignment and motif discovery algorithms.

AVAILABILITY: MATLAB code is available at http://www.ee.columbia.edu/~kcliang
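
The PWM model named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' DSMC algorithm (their MATLAB code is at the URL above); it only shows how a position weight matrix might be estimated from already-aligned motif instances and used to score candidate sites by log-odds against a uniform background. The function names, the pseudocount, and the background value are assumptions made for illustration, and insertions/deletions are not handled.

```python
import math

ALPHABET = "ACGT"


def build_pwm(instances, pseudocount=1.0):
    """Estimate per-position nucleotide probabilities from equal-length instances.

    `pseudocount` (an assumed smoothing value) keeps unseen bases from
    getting probability zero.
    """
    width = len(instances[0])
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for inst in instances:
            counts[inst[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm


def best_site(sequence, pwm, background=0.25):
    """Return (start, log-odds score) of the highest-scoring window.

    A uniform background of 0.25 per base is an assumption of this sketch.
    """
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(
            math.log(pwm[i][base] / background) for i, base in enumerate(window)
        )
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

A call such as `best_site("GGCCTATAATGCGC", build_pwm(["TATAAT", "TATGAT", "TACAAT"]))` scans every window of the motif width and reports the start position whose bases best match the matrix. A sequential Monte Carlo method like the one described would additionally have to infer the motif positions and the matrix jointly rather than take aligned instances as input.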

Similar resources

Performance comparison of four commercial GE Discovery PET/CT scanners: A Monte Carlo study using GATE

  Combined PET/CT scanners now play a major role in medicine for in vivo imaging in oncology, cardiology, neurology, and psychiatry. As the performance of a scanner depends not only on the scintillating material but also on the scanner design, with regards to the advent of newer scanners, there is a need to optimize acquisition protocols as well as to compare scanner ...

Sequential Monte Carlo multiple testing

MOTIVATION In molecular biology, as in many other scientific fields, the scale of analyses is ever increasing. Often, complex Monte Carlo simulation is required, sometimes within a large-scale multiple testing setting. The resulting computational costs may be prohibitively high. RESULTS We here present MCFDR, a simple, novel algorithm for false discovery rate (FDR) modulated sequential Monte ...

Bayesian multiple-instance motif discovery with BAMBI: inference of recombinase and transcription factor binding sites

Finding conserved motifs in genomic sequences is one of the essential bioinformatics problems. However, achieving high discovery performance without imposing substantial auxiliary constraints on possible motif features remains a key algorithmic challenge. This work describes BAMBI, a sequential Monte Carlo motif-identification algorithm, which is based on a position weight matrix model that d...

Characteristics of lead glass for radiation protection purposes: A Monte Carlo study

Background: Lead glass has a wide variety of applications in radiation protection. This study aims to investigate some characteristics of lead glass such as the γ-ray energy-dependent mass and linear attenuation coefficients, the half-value layer thickness, and the absorbed dose distribution for specific energy. Materials and Methods: The attenuation parameters of different lead glass types aga...

Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo

Recently there have been conceptually new developments in Monte Carlo methods through the introduction of new MCMC and sequential Monte Carlo (SMC) algorithms which are based on continuous-time, rather than discrete-time, Markov processes. This has led to some fundamentally new Monte Carlo algorithms which can be used to sample from, say, a posterior distribution. Interestingly, continuous-time...

Journal:
  • Bioinformatics

Volume 24, Issue 1

Pages: -

Publication date: 2008