SUS-BAR: a database of pig proteins with statistically validated structural and functional annotation
Abstract
Given the relevance of the pig proteome to many studies, including those of complex human diseases, a statistical validation of its annotation is required to better understand the role of specific genes and proteins in the complex networks underlying biological processes in the animal. At present, approximately 80% of the pig proteome is still poorly annotated, and the existence of protein sequences is routinely inferred automatically by sequence alignment against pre-existing sequences. In this article, we introduce SUS-BAR, a database that derives information mainly from the UniProt Knowledgebase and includes 26 206 pig protein sequences. In SUS-BAR, 16 675 of the pig protein sequences are endowed with statistically validated functional and structural annotation. The statistical validation relies on a cluster-centric annotation procedure that allows the transfer of different types of annotation, including structure and function. Each sequence in the database can be associated with a set of statistically validated Gene Ontology (GO) terms from the three main sub-ontologies (Molecular Function, Biological Process and Cellular Component), with Pfam functional domains and, when possible, with a cluster Hidden Markov Model that allows modelling of the 3D structure of the protein. A database search provides statistics demonstrating the enrichment in both GO and Pfam annotations of the pig proteins as compared with the UniProt Knowledgebase annotation. Searching SUS-BAR allows retrieval of the pig protein annotation for further analysis. A search can also be based on specific GO terms, which retrieves all the pig sequences that, after annotation with our system, participate in a given biological process. Alternatively, a search can be based on structural information, which retrieves all the pig sequences with the same structural characteristics.
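The abstract does not state which statistic underlies the cluster-centric validation. As a rough illustration only, the sketch below assumes a one-sided hypergeometric over-representation test with Bonferroni correction, a common choice for this kind of analysis and not necessarily the one used by SUS-BAR. The function names, the significance threshold `alpha`, and the toy protein identifiers are all hypothetical.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N proteins from M, of which n carry the term."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)

def validated_terms(cluster, background, alpha=0.01):
    """Return GO terms over-represented in a cluster (assumed test, see lead-in).

    cluster, background: dict mapping protein id -> set of GO terms;
    the cluster is expected to be a subset of the background.
    """
    M, N = len(background), len(cluster)
    terms = set().union(*cluster.values()) if cluster else set()
    validated = {}
    for term in terms:
        k = sum(term in gos for gos in cluster.values())
        n = sum(term in gos for gos in background.values())
        # Bonferroni correction over the number of terms tested in the cluster
        p_adj = min(1.0, hypergeom_sf(k, M, n, N) * len(terms))
        if p_adj < alpha:
            validated[term] = p_adj
    return validated

# Toy data (hypothetical): 20 proteins, 6 of them in one cluster.
background = {f"P{i}": {"GO:0002"} for i in range(20)}
for i in range(5):
    background[f"P{i}"] = {"GO:0001", "GO:0002"}
background["P5"] = set()  # stands in for an unannotated pig sequence
cluster = {f"P{i}": background[f"P{i}"] for i in range(6)}

result = validated_terms(cluster, background)
# Terms validated for the cluster would then be transferred to its
# unannotated members, here "P5".
inherited = set(result)
```

In this toy case only the term shared by the annotated cluster members and absent elsewhere passes the threshold; the term carried by nearly every protein in the background does not, so it is not transferred.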
Similar resources
Database tool SUS-BAR: a database of pig proteins with statistically validated structural and functional annotation
Bologna Biocomputing Group, University of Bologna, via S. Giacomo 9/2, I-40126, Bologna, Italy, Department of Biological, Geological and Environmental Sciences (BIGEA), University of Bologna, via Selmi 3, I-40126, Bologna, Italy, Department of Computer Science and Engineering, University of Bologna, Mura A. Zamboni 7, I-40126, Bologna, Italy, Health Science and Technologies-ICIR, University of ...
Annotation of hypothetical proteins orthologous in Pongo abelii and Sus scrofa
A hypothetical protein is predicted to be expressed from an open reading frame without known experimental evidence of translation. Hypothetical proteins constitute a substantial fraction of proteomes. Domain extraction from these hypothetical sequences helps to search for protein coding genes for protein structural and functional annotation. We describe the analysis of prediction data in a sequence d...
Functional Annotation of Two Hypothetical Proteins Reveals Valuable Proteins Involved in Response to Salinity: An in silico Approach
With the exponential growth in the determination of protein sequences and structures by genome sequencing and structural genomics approaches, there is a growing demand for valid bioinformatics methods to define the function of these proteins. In this study, our objective is to identify the function of unknown proteins from UCB-1 pistachio rootstock and specify their class...
BAR-PLUS: the Bologna Annotation Resource Plus for functional and structural annotation of protein sequences
We introduce BAR-PLUS (BAR(+)), a web server for functional and structural annotation of protein sequences. BAR(+) is based on a large-scale genome cross comparison and a non-hierarchical clustering procedure characterized by a metric that ensures a reliable transfer of features within clusters. In this version, the method takes advantage of a large-scale pairwise sequence comparison of 13,495,...
Protein Sequence Annotation by means of Community Detection
The improvement of sequencing technologies is increasing the volume of biosequences in databases. Experimental validation of genomes and proteomes is however far too slow compared to the pace at which data are being produced and electronic annotation is the current solution to this problem. The annotation of a new sequence is inferred from experimentally validated reference proteins using diffe...
Volume: 2013
Publication date: 2013