AGMIAL: implementing an annotation strategy for prokaryote genomes as a distributed system

Authors

  • K. Bryson
  • V. Loux
  • R. Bossy
  • P. Nicolas
  • S. Chaillou
  • M. van de Guchte
  • S. Penaud
  • E. Maguin
  • M. Hoebeke
  • P. Bessières
  • J-F Gibrat
Abstract

We have implemented a genome annotation system for prokaryotes called AGMIAL. Our approach embodies a number of key principles. First, expert manual annotators are seen as a critical component of the overall system; user interfaces were cyclically refined to satisfy their needs. Second, the overall process should be orchestrated in terms of a global annotation strategy; this facilitates coordination between a team of annotators and automatic data analysis. Third, the annotation strategy should allow progressive and incremental annotation from a time when only a few draft contigs are available, to when a final finished assembly is produced. The overall architecture employed is modular and extensible, being based on the W3 standard Web services framework. Specialized modules interact with two independent core modules that are used to annotate, respectively, genomic and protein sequences. AGMIAL is currently being used by several INRA laboratories to analyze genomes of bacteria relevant to the food-processing industry, and is distributed under an open source license.

Similar resources

GenDB--an open source genome annotation system for prokaryote genomes.

The flood of sequence data resulting from the large number of current genome projects has increased the need for a flexible, open source genome annotation system, which so far has not existed. To account for the individual needs of different projects, such a system should be modular and easily extensible. We present a genome annotation system for prokaryote genomes, which is well tested and rea...

GeneTrees: a phylogenomics resource for prokaryotes

The GeneTrees phylogenomics system pursues comparative genomic analyses from the perspective of gene phylogenies for individual genes. The GeneTrees project has the goal of providing detailed evolutionary models for all protein-coding gene components of the fully sequenced genomes. Currently, a database of alignments and trees for all protein sequences for 325 fully sequenced and annotated prok...

A "polyORFomic" analysis of prokaryote genomes using disabled-homology filtering reveals conserved but undiscovered short ORFs.

Prokaryote gene annotation is complicated by large numbers of short open reading frames (ORFs) that arise naturally from genetic code design. Historically, many hypothetical ORFs have been annotated as genes in microbes, usually with an arbitrary length threshold (e.g. greater than 100 codons). Given the use of such thresholds, what is the extent of genuine undiscovered short genes in the curre...

CRISPRdigger: detecting CRISPRs with better direct repeat annotations

Clustered regularly interspaced short palindromic repeats (CRISPRs) are important genetic elements in many bacterial and archaeal genomes, and play a key role in prokaryote immune systems' fight against invasive foreign elements. The CRISPR system has also been engineered to facilitate target gene editing in eukaryotic genomes. Using the common features of mis-annotated CRISPRs in prokaryotic g...

Comprehensive Phylogenetic Analysis of Bacterial Reverse Transcriptases

Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank an...

Journal:
  • Nucleic Acids Research

Volume 34

Publication date: 2006