Integrating quality-based clustering of microarray data with Gibbs sampling for the discovery of regulatory motifs
Abstract
In microarray experiments, genes exhibiting a similar expression profile are potentially coregulated. Clustering identifies such groups of coexpressed genes, whose upstream regions can then be searched for putative regulatory elements. We present two algorithms and an interactive web-based user interface that integrate cluster analysis and motif finding for the analysis of microarray data. Starting from the expression data, our adaptive quality-based clustering algorithm defines groups of tightly coexpressed genes. The upstream region of each gene in a cluster is then retrieved based on its accession number and gene name. Once the upstream regions are identified, the sequences are analyzed with a Gibbs sampling motif finder to detect over-represented motifs. Our implementation (called Motif Sampler) allows the use of higher-order models for the sequence background. This methodology can be used through our INCLUSive web interface at the following URL: http://www.esat.kuleuven.ac.be/~dna/BioI/Software.html
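To make the idea of a higher-order background model concrete, the following is a minimal Python sketch of an order-k Markov model estimated from background sequences and used to score a candidate segment. It is an illustration under stated assumptions, not Motif Sampler's actual code: the function names (`train_background`, `conditional_prob`, `background_log_prob`), the add-one pseudocount, and the toy training sequence are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_background(sequences, order=3):
    """Count how often each base follows each context of `order` bases.

    Hypothetical helper for illustration; returns {context: {base: count}}.
    """
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0.0))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip positions that contain ambiguous symbols such as N.
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[context][base] += 1.0
    return dict(counts)


def conditional_prob(counts, context, base, pseudocount=1.0):
    """P(base | context), smoothed with a pseudocount (an assumption here)."""
    row = counts.get(context, {})
    total = sum(row.values())
    return (row.get(base, 0.0) + pseudocount) / (total + pseudocount * len(ALPHABET))


def background_log_prob(counts, segment, context, pseudocount=1.0):
    """Log-probability of `segment` given the bases that precede it.

    The model order is taken from len(context); an empty context gives an
    order-0 (single-nucleotide frequency) model.
    """
    order = len(context)
    history = context
    logp = 0.0
    for base in segment:
        logp += math.log(conditional_prob(counts, history, base, pseudocount))
        # Slide the context window forward by one base.
        history = (history + base)[-order:] if order else ""
    return logp


if __name__ == "__main__":
    counts = train_background(["ACGTACGTACGT"], order=1)
    print(conditional_prob(counts, "A", "C"))
    print(background_log_prob(counts, "CG", "A"))
```

In a Gibbs sampling motif finder, a score of this kind would typically be compared with the probability of the same segment under the current motif model, so that segments resembling the background are less likely to be sampled as motif instances.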
Related resources
BioProspector: Discovering Conserved DNA Motifs in Upstream Regulatory Regions of Co-Expressed Genes
The development of genome sequencing and DNA microarray analysis of gene expression gives rise to the demand for data-mining tools. BioProspector, a C program using a Gibbs sampling strategy, examines the upstream region of genes in the same gene expression pattern group and looks for regulatory sequence motifs. BioProspector uses zero to third-order Markov background models whose parameters ar...
INCLUSive: INtegrated Clustering, Upstream sequence retrieval and motif Sampling
INCLUSive allows automatic multistep analysis of microarray data (clustering and motif finding). The clustering algorithm (adaptive quality-based clustering) groups together genes with highly similar expression profiles. The upstream sequences of the genes belonging to a cluster are automatically retrieved from GenBank and can be fed directly into Motif Sampler, a Gibbs sampling algorithm that ...
Biclustering microarray data by Gibbs sampling
MOTIVATION: Gibbs sampling has become a method of choice for the discovery of noisy patterns, known as motifs, in DNA and protein sequences. Because handling noise in microarray data presents similar challenges, we have adapted this strategy to the biclustering of discretized microarray data. RESULTS: In contrast with standard clustering that reveals genes that behave similarly over all the con...
Simultaneous alignment and clustering of peptide data using a Gibbs sampling approach
MOTIVATION: Proteins recognizing short peptide fragments play a central role in cellular signaling. As a result of high-throughput technologies, peptide-binding protein specificities can be studied using large peptide libraries at dramatically lower cost and time. Interpretation of such large peptide datasets, however, is a complex task, especially when the data contain multiple receptor binding...
Gene expression module discovery using Gibbs sampling
Recent advances in high throughput profiling of gene expression have catalyzed an explosive growth in functional genomics aimed at the elucidation of genes that are differentially expressed in various tissue or cell types across a range of experimental conditions. These studies can lead to the identification of diagnostic genes, classification of genes into functional categories, association of...