Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions
Abstract
Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to >13,000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.
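The screening step described in the abstract, finding protein regions matched by two or more families that are not annotated as related, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the `Hit` record, the `clan_of` mapping (standing in for Pfam's annotation of related families), the `min_overlap` parameter, and the function names are all hypothetical, and hits are assumed to have been parsed already from some search output.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Hit(NamedTuple):
    """One family match on one sequence (hypothetical record layout)."""
    seq_id: str
    family: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlap_length(a: Hit, b: Hit) -> int:
    """Number of residues shared by two hits on the same sequence."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def conflicting_family_pairs(hits, clan_of, min_overlap=1):
    """Count overlapping hits between families not annotated as related.

    clan_of maps a family name to a group identifier; families missing
    from the mapping are treated as unrelated to every other family.
    """
    by_seq = defaultdict(list)
    for hit in hits:
        by_seq[hit.seq_id].append(hit)

    counts = defaultdict(int)
    for seq_hits in by_seq.values():
        for a, b in combinations(seq_hits, 2):
            if a.family == b.family:
                continue
            clan_a = clan_of.get(a.family)
            if clan_a is not None and clan_a == clan_of.get(b.family):
                continue  # annotated as related, so overlap is expected
            if overlap_length(a, b) >= min_overlap:
                counts[tuple(sorted((a.family, b.family)))] += 1
    return dict(counts)
```

Family pairs with high counts would be candidates for closer inspection; whether a given overlap reflects true homology, overextension, or compositional bias still has to be judged separately.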
Similar resources
Evolutionary Patterns in Coiled-Coils
Models of protein evolution are used to describe evolutionary processes, for phylogenetic analyses and homology detection. Widely used general models of protein evolution are biased toward globular domains and lack resolution to describe evolutionary processes for other protein types. As three-dimensional structure is a major constraint to protein evolution, specific models have been proposed f...
Convergent Evolution of Disease Resistance Gene Specificity in Two Flowering Plant Families
Plant disease resistance (R) genes that mediate recognition of the same pathogen determinant sometimes can be found in distantly related plant families. This observation implies that some R gene alleles may have been conserved throughout the diversification of land plants. To address this question, we have compared R genes from Glycine max (soybean), Rpg1-b, and Arabidopsis thaliana, RPM1, that...
Secondary structure of component 8c-1 of alpha-keratin. An analysis of the amino acid sequence.
The amino acid sequence of component 8c-1 from alpha-keratin was analysed by using secondary-structure prediction techniques, homology search methods, fast Fourier-transform techniques to detect regularities in the linear disposition of amino acids, interaction counts to assess possible modes of chain aggregation and assessment of hydrophilicity distribution. The analyses show the following. Th...
LearnCoil-VMF: computational evidence for coiled-coil-like motifs in many viral membrane-fusion proteins.
Crystallographic studies have shown that the coiled-coil motif occurs in several viral membrane-fusion proteins, including HIV-1 gp41 and influenza virus hemagglutinin. Here, the LearnCoil-VMF program was designed as a specialized program for identifying coiled-coil-like regions in viral membrane-fusion proteins. Based upon the use of LearnCoil-VMF, as well as other computational tools, we repo...
I-28: Role of Mevalonate-Ras Homology (Rho)/Rho-Associated Coiled-Coil-Forming Protein Kinase-Mediated Signaling Pathway in The Pathogenesis of Endometriosis-Associated Fibrosis
Background: Endometriosis, a disease affecting 3-10% of women of reproductive age, is characterized by the ectopic growth of endometrial glands and stroma surrounded by dense fibrous tissue. In contrast, normal eutopic endometrium shows scarless tissue repair during menstrual cycles, which suggests that endometriotic tissues have distinct mechanisms of fibrogenesis. During the development of en...