Overcoming sequence misalignments with weighted structural superposition.
Abstract
An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and a seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can then generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD's robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay yields corrected sequence alignments that agree well with HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, secondary-structure matching, combinatorial extension, and DaliLite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structure-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on the initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs.
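The idea of Gaussian weighting can be illustrated with a minimal sketch. It is not the authors' implementation: the weight form exp(-d²/c), the scale `c = 5.0` Å², the function names, and the distance values are all assumptions chosen for illustration. The sketch only scores a fixed overlay; it does not perform the superposition itself. In an iterative scheme, weights like these would presumably feed back into a weighted least-squares fit.

```python
import math

def gaussian_weights(distances, c=5.0):
    """Weight each residue pair by exp(-d^2 / c).

    c (in Å^2) is an assumed scale, not a value taken from the abstract.
    """
    return [math.exp(-(d * d) / c) for d in distances]

def weighted_rmsd(distances, weights):
    """Weighted RMSD: sqrt(sum(w * d^2) / sum(w))."""
    total = sum(weights)
    return math.sqrt(sum(w * d * d for w, d in zip(weights, distances)) / total)

def count_within(distances, cutoff):
    """Number of residue pairs placed within `cutoff` Å of each other."""
    return sum(1 for d in distances if d <= cutoff)

# Hypothetical residue-pair distances (Å) after some overlay; the last pair
# stands in for a misaligned or highly flexible region.
distances = [0.3, 0.6, 0.9, 1.5, 1.9, 6.0]

weights = gaussian_weights(distances)
plain = weighted_rmsd(distances, [1.0] * len(distances))  # standard RMSD
gauss = weighted_rmsd(distances, weights)                 # outlier down-weighted

print(f"standard RMSD: {plain:.2f} Å, Gaussian-weighted RMSD: {gauss:.2f} Å")
print(f"pairs within 1 Å: {count_within(distances, 1.0)}, "
      f"within 2 Å: {count_within(distances, 2.0)}")
```

Because the weight of a pair falls off rapidly with its distance, a single badly paired residue contributes little to the weighted score, whereas it dominates the unweighted RMSD. The two cutoff counts mirror the 1 Å and 2 Å comparison described in the abstract.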
Journal: Proteins
Volume 80, Issue 11
Publication date: 2012