A profile-based deterministic sequential Monte Carlo algorithm for motif discovery
Abstract
MOTIVATION: Conserved motifs are often biologically significant, providing insight into gene transcription regulation, biomolecular secondary structure, the presence of non-coding RNAs and evolutionary history. With the growing volume of sequenced genomic data, faster and more accurate tools are needed to automate motif discovery.
RESULTS: We propose a deterministic sequential Monte Carlo (DSMC) motif discovery technique based on the position weight matrix (PWM) model to locate conserved motifs in a given set of nucleotide sequences, and extend the model to search for instances of the motif that contain insertions or deletions. We show that the proposed method can align a motif whose instances differ by insertions and deletions, which other multiple alignment and motif discovery algorithms cannot do satisfactorily.
AVAILABILITY: MATLAB code is available at http://www.ee.columbia.edu/~kcliang
Similar resources
Performance comparison of four commercial GE Discovery PET/CT scanners: A Monte Carlo study using GATE
Combined PET/CT scanners now play a major role in medicine for in vivo imaging in oncology, cardiology, neurology, and psychiatry. As the performance of a scanner depends not only on the scintillating material but also on the scanner design, with regards to the advent of newer scanners, there is a need to optimize acquisition protocols as well as to compare scanner ...
Sequential Monte Carlo multiple testing
MOTIVATION In molecular biology, as in many other scientific fields, the scale of analyses is ever increasing. Often, complex Monte Carlo simulation is required, sometimes within a large-scale multiple testing setting. The resulting computational costs may be prohibitively high. RESULTS We here present MCFDR, a simple, novel algorithm for false discovery rate (FDR) modulated sequential Monte ...
Bayesian multiple-instance motif discovery with BAMBI: inference of recombinase and transcription factor binding sites
Finding conserved motifs in genomic sequences is one of the essential bioinformatics problems. However, achieving high discovery performance without imposing substantial auxiliary constraints on possible motif features remains a key algorithmic challenge. This work describes BAMBI, a sequential Monte Carlo motif-identification algorithm, which is based on a position weight matrix model that d...
Characteristics of lead glass for radiation protection purposes: A Monte Carlo study
Background: Lead glass has a wide variety of applications in radiation protection. This study aims to investigate some characteristics of lead glass such as the γ-ray energy-dependent mass and linear attenuation coefficients, the half-value layer thickness, and the absorbed dose distribution for specific energy. Materials and Methods: The attenuation parameters of different lead glass types aga...
Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo
Recently there have been conceptually new developments in Monte Carlo methods through the introduction of new MCMC and sequential Monte Carlo (SMC) algorithms which are based on continuous-time, rather than discrete-time, Markov processes. This has led to some fundamentally new Monte Carlo algorithms which can be used to sample from, say, a posterior distribution. Interestingly, continuous-time...
Journal: Bioinformatics
Volume 24, Issue 1
Publication date: 2008