ProtEST: protein multiple sequence alignments from expressed sequence tags

Authors

James A. Cuff

Ewan Birney

Michele E. Clamp

Geoffrey J. Barton

Abstract

MOTIVATION: An automatic sequence searching method (ProtEST) is described which constructs multiple protein sequence alignments from protein sequences and translated expressed sequence tags (ESTs). ProtEST is more effective than a simple TBLASTN search of the query against the EST database, as the sequences are automatically clustered, assembled, made non-redundant, checked for sequence errors, translated into protein and then aligned and displayed.

RESULTS: A ProtEST search found a non-redundant, translated, error- and length-corrected EST sequence for > 58% of sequences when single sequences from 1407 Pfam-A seed alignments were used as the probe. On average, the resulting alignments of translated EST sequences contained > 10 sequences. In a cross-validated test of protein secondary structure prediction, alignments from the new procedure led to an improvement of 3.4% in average Q3 prediction accuracy over single sequences.

AVAILABILITY: The ProtEST method is available as an Internet World Wide Web service at http://barton.ebi.ac.uk/servers/protest.html. The Wise2 package for protein and genomic comparisons and the ProtESTWise script can be found at http://www.sanger.ac.uk/Software/Wise2

CONTACT: [email protected]
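
The Q3 figure quoted in the results is the standard three-state per-residue accuracy: the percentage of residues whose predicted secondary structure state (helix H, strand E or coil C) matches the observed state. A minimal sketch in Python is given below. The function name `q3_accuracy` and the toy strings are illustrative assumptions only; they are not taken from the paper or from the ProtEST code, and the numbers they produce are not the paper's results.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the three-state labels agree.

    Both arguments are strings over the alphabet H (helix), E (strand)
    and C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Made-up toy data (hypothetical, for illustration only).
observed = "CHHHHCEEEC"
from_single_sequence = "CCHHHCCEEC"  # two positions disagree with observed
from_alignment = "CHHHHCCEEC"        # one position disagrees with observed

gain = q3_accuracy(from_alignment, observed) - q3_accuracy(from_single_sequence, observed)
print(f"Q3 improvement on the toy example: {gain:.1f} percentage points")
```

An averaged Q3, as reported in the abstract, would be the mean of this per-protein value over all proteins in the test set.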

Similar resources

Molecular cloning of adenylate kinase from the human filarial parasite Onchocerca volvulus

Adenylate kinases (ADK) are ubiquitous enzymes that contribute to the homeostasis of adenine nucleotides in living cells. In this study, the cloning of a cDNA encoding an adenylate kinase from the filaria Onchocerca volvulus is described. Using PCR, a 281 bp cDNA fragment encoding part of an adenylate kinase was isolated from an O. volvulus cDNA library. Use of this fragment as a p...

A hierarchical model for incomplete alignments in phylogenetic inference

MOTIVATION Full-length DNA and protein sequences that span the entire length of a gene are ideally used for multiple sequence alignments (MSAs) and the subsequent inference of their relationships. Frequently, however, MSAs contain a substantial amount of missing data. For example, expressed sequence tags (ESTs), which are partial sequences of expressed genes, are the predominant source of seque...

SEAN: SNP prediction and display program utilizing EST sequence clusters

SEAN is an application that predicts single nucleotide polymorphisms (SNPs) using multiple sequence alignments produced from expressed sequence tag (EST) clusters. The algorithm uses rules of sequence identity and SNP abundance to determine the quality of the prediction. A Java viewer is provided to display the EST alignments and predicted SNPs.

Computational gene prediction using multiple sources of evidence.

This article describes a computational method to construct gene models by using evidence generated from a diverse set of sources, including those typical of a genome annotation pipeline. The program, called Combiner, takes as input a genomic sequence and the locations of gene predictions from ab initio gene finders, protein sequence alignments, expressed sequence tag and cDNA alignments, splice...

Cloning and expression of the immunotoxin Ontak as a hybrid with an intein tag

Introduction: Inteins (INT) are internal parts of a number of proteins in yeast and some other unicellular eukaryotes, which can be separated from the immature protein during the protein splicing process. After the mechanism of intein action was identified, applications of these sequences were considered for the single-step purification of recombinant proteins, and different intein tags were develope...

Journal: Bioinformatics

Volume 16, Issue 2

Publication date: 2000
