Density Peak clustering of protein sequences associated to a Pfam clan reveals clear similarities and interesting differences with respect to manual family annotation

Abstract

Background: The identification of protein families is of outstanding practical importance for in silico protein annotation and is at the basis of several bioinformatic resources. Pfam is possibly the most well known protein family database, built in many years of work by domain experts with extensive use of manual curation. This approach is generally very accurate, but it is quite time consuming and may suffer from a bias generated by the hand-curation itself, which is often guided by the available experimental evidence.

Results: We introduce a procedure that aims to identify automatically putative protein families. The procedure is based on Density Peak Clustering and uses as input only local pairwise alignments between protein sequences. In the experiment we present here, we ran the algorithm on about 4000 full-length proteins with at least one domain classified by Pfam as belonging to the Pseudouridine synthase and Archaeosine transglycosylase (PUA) clan. We obtained 71 automatically-generated sequence clusters with at least 100 members. While our clusters were largely consistent with the Pfam classification, showing good overlap with either single or multi-domain Pfam family architectures, we also observed some inconsistencies. The latter were inspected using structural evidence, which suggested that the automatic classification captured evolutionary signals reflecting non-trivial features of protein family architectures. Based on this analysis we identified a novel pre-PUA domain and alternative boundaries for a few PUA and PUA-associated families. As a first indication that our approach was unlikely to be clan-specific, we performed the same analysis on the P53 clan, obtaining comparable results.

Conclusions: The clustering procedure described in this work takes advantage of the information contained in a large set of pairwise alignments and successfully identifies domains and domain architectures in an unsupervised manner. Comparison with the Pfam classification highlights significant points of overlap and interesting differences, suggesting that the new procedure could have potential applications related to protein classification. Testing this hypothesis, however, will require further experiments on diverse datasets.
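
The clustering idea named in the abstract can be illustrated with a minimal sketch of generic Density Peak Clustering. This is not the authors' pipeline: it assumes a precomputed symmetric distance matrix (how distances would be derived from local pairwise alignments is not shown), and the function name `density_peak_clustering`, the cutoff `d_c`, and the `n_clusters` parameter are hypothetical choices made for illustration only.

```python
def density_peak_clustering(dist, d_c, n_clusters):
    """Toy Density Peak Clustering on a symmetric distance matrix.

    dist       : list of lists, dist[i][j] is the distance between i and j
    d_c        : cutoff used to count neighbours (assumed, user-chosen)
    n_clusters : number of density peaks to keep as centers (assumed)
    """
    n = len(dist)
    # Local density: number of other points closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest point ranked denser
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The top-ranked point has no denser point: use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: the top-ranked point plus the points with the largest rho * delta.
    others = sorted(order[1:], key=lambda i: (-rho[i] * delta[i], i))
    centers = [order[0]] + others[:n_clusters - 1]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every other point inherits the label of its nearest denser point,
    # which has already been labelled because of the visiting order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Toy usage: six points on a line forming two well-separated groups.
points = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in points] for a in points]
print(density_peak_clustering(toy_dist, d_c=1.5, n_clusters=2))
# prints [0, 0, 0, 1, 1, 1]
```

In this sketch a point is a good center candidate when it is both locally dense and far from any denser point, which is why the product of the two quantities is used for ranking; other selection rules are possible.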

Journal

Journal title: BMC Bioinformatics

Year: 2021

ISSN: 1471-2105

DOI: https://doi.org/10.1186/s12859-021-04013-x