Rapid 3D protein structure database searching using information retrieval techniques

Authors

  • Zeyar Aung
  • Kian-Lee Tan
Abstract

MOTIVATION As three-dimensional (3D) protein structure databases grow rapidly, exhaustive database searching, in which a 3D query structure is compared with every structure in the database, becomes inefficient. We propose a rapid 3D protein structure retrieval system named 'ProtDex2', which adopts techniques from information retrieval systems to search the database rapidly without accessing every 3D structure in it. The retrieval process is based on an inverted-file index built on the feature vectors of the relationships between the secondary structure elements (SSEs) of all the 3D protein structures in the database. ProtDex2 is a significant improvement, in both speed and accuracy, over its predecessor system, ProtDex.

RESULTS The experimental results show that ProtDex2 is far faster than two well-known protein structure comparison methods, DALI and CE, without sacrificing the accuracy of the comparison. Compared with a similar SSE-based method, TopScan, ProtDex2 is much faster with a comparable degree of accuracy.

AVAILABILITY The software is available at: http://xena1.ddns.comp.nus.edu.sg/~genesis/PD2.htm
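The inverted-file idea described in the abstract can be illustrated with a minimal Python sketch. It is not the ProtDex2 implementation: the feature keys (an SSE-pair type plus two quantized bins), the function names `build_index` and `search`, the toy structures, and the overlap-count score are all assumptions made for illustration. The point it shows is that a query touches only the postings lists of its own features, so structures sharing no feature with the query are never visited.

```python
from collections import Counter, defaultdict

def build_index(db):
    """Build an inverted file: feature key -> {structure id: occurrence count}.

    `db` maps a structure id to a list of hashable feature keys. Here a key is
    a hypothetical (SSE-pair type, angle bin, distance bin) tuple.
    """
    index = defaultdict(dict)
    for sid, features in db.items():
        for key, count in Counter(features).items():
            index[key][sid] = count
    return index

def search(index, query_features, top_k=5):
    """Rank structures by shared feature occurrences with the query.

    Only the postings lists of the query's own features are read.
    """
    scores = Counter()
    for key, q_count in Counter(query_features).items():
        for sid, count in index.get(key, {}).items():
            scores[sid] += min(q_count, count)
    # Highest score first; ties broken by structure id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

# Toy database with made-up structure ids and feature keys.
toy_db = {
    "A": [("HH", 1, 2), ("HE", 0, 1), ("HH", 1, 2)],
    "B": [("EE", 2, 2), ("HE", 0, 1)],
    "C": [("EE", 3, 0)],
}
toy_index = build_index(toy_db)
hits = search(toy_index, [("HH", 1, 2), ("HH", 1, 2), ("HE", 0, 1)])
print(hits)  # [('A', 3), ('B', 1)]
```

Structure "C" never appears in the result because none of its features occur in the query, which is the source of the speed-up over exhaustive pairwise comparison. A real system would likely weight features and refine the top hits with a more detailed comparison.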


Related resources

Cooperative Query Answering for Approximate Answers with Nearness Measure in Hierarchical Structure Information Systems

Thanit Puthpongsiriporn, Ph.D., University of Pittsburgh. Cooperative query answering for approximate answers has been utilized in various problem domains. Many challenges in manufacturing information retrieval, such as: classifying parts into families in group technology implem...


Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information

With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...


Conceptual Database Retrieval through Multilingual Thesauri

In traditional database management systems, information retrieval is often carried out using keywords contained within fields of each record. Because a term (concept) can be expressed in several ways, a significant number of records are ignored by the free text techniques which use only a posteriori relations between terms. This paper proposes the utilisation of a priori conceptual relations be...


The State of Information Retrieval in the "Namayeh" and "Nama" Databases and an Assessment of the Effectiveness of Controlled Vocabulary in Indexing These Two Databases

Purpose: This study was carried out to determine the level of precision, recall, and searching time for the “Nama” and “Namayeh” databases, and to find out which of the indexing tools (thesaurus or Dewey decimal classification) contributes more to improving information retrieval. Methodology: This study is an analytical survey in which the necessary data was collected by direct observati...


Computer Aided Molecular Modeling Of Membrane Metalloprotease

Molecular modeling is a set of computational techniques for constructing the 3D structure of a protein, especially membrane-bound proteins whose structures cannot be elucidated using experimental techniques. These techniques have been applied in the study of membrane metalloproteases for comparing wild and mutated enzymes, docking inhibitors in the catalytic site and examination of binding pocket...



Journal:
  • Bioinformatics

Volume 20, Issue 7

Pages: -

Publication year: 2004