PREDICT-2ND: a tool for generalized protein local structure prediction

Authors

  • Sol Katzman
  • Christian Barrett
  • Grant Thiltgen
  • Rachel Karchin
  • Kevin Karplus
Abstract

MOTIVATION: Predictions of protein local structure, derived from sequence alignment information alone, provide visualization tools for biologists to evaluate the importance of amino acid residue positions of interest in the absence of X-ray crystal/NMR structures or homology models. They are also useful as inputs to sequence analysis and modeling tools, such as hidden Markov models (HMMs), which can be used to search for homology in databases of known protein structure. In addition, local structure predictions can be used as a component of cost functions in genetic algorithms that predict protein tertiary structure. We have developed a program (predict-2nd) that trains multilayer neural networks and have applied it to numerous local structure alphabets, tuning network parameters such as the number of layers, the number of units in each layer and the window sizes of each layer. We have had the most success with four-layer networks, with gradually increasing window sizes at each layer.

RESULTS: Because the four-layer neural nets occasionally get trapped in poor local optima, our training protocol now uses many different random starts, with short training runs, followed by more training on the best performing networks from the short runs. One recent addition to the program is the option to add a guide sequence to the profile inputs, increasing the number of inputs per position by 20. We find that use of a guide sequence provides a small but consistent improvement in the predictions for several different local-structure alphabets.

AVAILABILITY: Local structure prediction with the methods described here is available for use online at http://www.soe.ucsc.edu/compbio/SAM_T08/T08-query.html. The source code and example networks for PREDICT-2ND are available at http://www.soe.ucsc.edu/~karplus/predict-2nd/. A required C++ library is available at http://www.soe.ucsc.edu/~karplus/ultimate/.
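The effect of a guide sequence on the input size can be illustrated with a small sketch. This is not the PREDICT-2ND code (which is written in C++); it assumes, for illustration only, that the guide sequence is encoded as a 20-element indicator vector per position and that windows are zero-padded at the sequence ends. The function names `one_hot`, `position_inputs` and `windowed_inputs` are hypothetical.

```python
# Hypothetical sketch: names and encoding details are assumptions,
# not the actual PREDICT-2ND implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Return a 20-element indicator vector (all zeros for an unknown residue)."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def position_inputs(profile, guide=None):
    """Build per-position inputs: 20 profile values, plus 20 guide values if given."""
    if guide is not None and len(guide) != len(profile):
        raise ValueError("guide sequence and profile must have the same length")
    rows = []
    for i, column in enumerate(profile):
        row = list(column)
        if guide is not None:
            row += one_hot(guide[i])  # assumed encoding of the guide residue
        rows.append(row)
    return rows


def windowed_inputs(rows, window):
    """Concatenate the rows in a centered window, zero-padding past the ends."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    pad = [0.0] * len(rows[0])
    out = []
    for i in range(len(rows)):
        vec = []
        for j in range(i - half, i + half + 1):
            vec += rows[j] if 0 <= j < len(rows) else pad
        out.append(vec)
    return out


guide = "MKV"
profile = [[0.05] * 20 for _ in guide]  # flat toy profile, one column per residue
plain = position_inputs(profile)
guided = position_inputs(profile, guide)
print(len(plain[0]), len(guided[0]))       # 20 40
print(len(windowed_inputs(guided, 5)[0]))  # 200
```

With a window of 5 positions, the first-layer input grows from 100 to 200 values per window when the guide sequence is added, which is the "20 more inputs per position" described in the abstract.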

Similar resources

Surface Pressure Contour Prediction Using a GRNN Algorithm

A new approach based on a Generalized Regression Neural Network (GRNN) has been proposed to predict the planform surface pressure field on a wing-tail combination in low subsonic flow. Extensive wind tunnel results were used for training the network and verification of the values predicted by this approach. GRNN has been trained by the aforementioned experimental data and subsequently was used ...

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...

Link Prediction using Network Embedding based on Global Similarity

Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...

PROSAT: A Generalized Framework for Protein Sequence Annotation

Motivation: Over the last decade several prediction methods have been developed for determining structural and functional properties of individual protein residues using sequence and sequence-derived information. These protein residue annotation problems are often formulated as either classification or regression problems and solved using a common set of techniques. Methods: We developed a gener...

Estimating Algorithms for Prediction and Spread of a Factor as a Pandemic: A Case Study of Global COVID-19 Prevalence

Background: This paper presents open-source computer simulation programs developed for simulating, tracking, and estimating the COVID-19 outbreak. Methods: The programs consisted of two separate parts: one set of programs built in Simulink with a block diagram display, and another one coded in MATLAB as scripts. The mathematical model used in this package was the SIR, SEIR, and SEIRD models re...

Journal:
  • Bioinformatics

Volume 24, Issue 21

Pages: -

Publication date: 2008