Database searches with multiple oligopeptides containing ambiguous residues.

Authors

  • M H Vodkin
  • R J Novak
  • G L McLaughlin
Abstract

Several techniques in molecular biology frequently yield partial and ambiguous data on genes and gene products. For instance, N-terminal sequence analysis of oligopeptide cleavage products generates this type of sequence data. Typically, data generated from blotted or HPLC-resolved peptides consist of disconnected and unordered oligopeptides derived from N-terminal analysis of fragments resulting from complete or partial trypsin, chymotrypsin or CNBr digestion; such sequences are also “linked” if they were derived from the same isolated polypeptide, e.g., a band identified after sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and blotting. To empirically identify the protein represented by such data, labor-intensive sequencing, with or without cloning, is frequently required. We became interested in defining a strategy to more reliably identify the source protein from existing sequence databases without an investment of additional laboratory experiments.

Several algorithms are readily available to rapidly search the databases for proteins or nucleic acids that are identical to or related to a specified query sequence. One popular program is the basic local alignment search tool (BLAST) (Reference 1; see Availability). However, BLAST can search databases (e.g., SWISS-PROT) with only moderate sensitivity. At the National Institutes of Health (NIH) address (see Availability), BLAST can search the updated, nonredundant protein or nucleic acid databases. A disadvantage of BLAST is that very limited ambiguity is allowed at each position. For amino acids, “X” designates an unknown, “B” designates aspartate or asparagine, “Z” designates glutamate or glutamine and “-” designates a gap of indeterminate length. Table 1 shows an actual example of such data. When the individual oligopeptides listed in Table 1 were used to search the SWISS-PROT database, multiple related and unrelated sequences with similar or identical scores were retrieved.
Even by comparing the individual lists for common, multiple hits, it was not possible to determine a unique candidate protein that was related to all or most of the oligopeptides. Another approach tested was to search with BLAST in pairwise or N-wise combinations of the oligopeptides, either as a continuous string of residues or as a broken string with hyphens designated as a discontinuity. (BLAST at the NIH supports the latter syntax; however, BLAST at some other addresses does not.) The first method created strings of characters that were not originally juxtaposed and thus did not allow the correct identification. The second method, when used in various pairwise combinations, still did not detect homologous proteins in the database.

We therefore utilized an alternative search algorithm and show its utility for identifying a protein in the database when query sequences include several linked oligopeptide fragments with some ambiguous amino acid residues. FindPatterns, or Find (a subset of the GCG package; see Availability), was used for the peptide data set in Table 1. Find has more versatility for managing ambiguous residues and multiple, discontinuous oligopeptides. Each individual residue of the query oligopeptide can be specified as either unknown (X) or up to a 20-fold ambiguity (every amino acid candidate at a position is encoded, as in Table 1). The gap size between the unordered fragments can also be specified with a minimum...
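The idea of searching with several linked fragments that carry per-position ambiguity can be illustrated with a small Python sketch. This is not FindPatterns and does not reproduce its syntax; it is a minimal stand-in that assumes each fragment is supplied as a list of per-position candidate strings, expands the X, B and Z codes described in the abstract, and ranks proteins by how many fragments match anywhere in the sequence, in any order. The function names, the fragments and the protein sequences are all hypothetical, and gap-size constraints between fragments are not modeled.

```python
import re

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Ambiguity codes named in the abstract: X = unknown, B = D or N, Z = E or Q.
AMBIGUITY = {"X": AMINO_ACIDS, "B": "DN", "Z": "EQ"}


def position_to_regex(candidates):
    """Turn one query position (e.g. "K", "X" or "ST") into a regex atom."""
    residues = set()
    for code in candidates.upper():
        residues.update(AMBIGUITY.get(code, code))
    unknown = residues - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unrecognized residue code(s): {sorted(unknown)}")
    if len(residues) == len(AMINO_ACIDS):
        return "."  # fully unknown position: accept any character
    if len(residues) == 1:
        return next(iter(residues))
    return "[" + "".join(sorted(residues)) + "]"


def fragment_to_regex(fragment):
    """Compile a fragment given as a list of per-position candidate strings."""
    return re.compile("".join(position_to_regex(p) for p in fragment))


def rank_proteins(fragments, proteins):
    """Count, per protein, how many fragments match; best candidates first."""
    patterns = [fragment_to_regex(f) for f in fragments]
    hits = {
        name: sum(1 for pattern in patterns if pattern.search(seq))
        for name, seq in proteins.items()
    }
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical query: two linked fragments with ambiguous positions.
    fragments = [
        ["A", "ST", "X", "K"],  # A, then S or T, then unknown, then K
        ["B", "L", "Z"],        # D or N, then L, then E or Q
    ]
    # Hypothetical database entries.
    proteins = {
        "prot_a": "MMATGKPPNLQ",
        "prot_b": "MMATGKPP",
        "prot_c": "GGGGGG",
    }
    for name, count in rank_proteins(fragments, proteins):
        print(name, count)
```

Scoring by the number of fragments matched per protein, rather than by each fragment's own hit list, is what lets the "linked" property of the fragments narrow the candidates.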


Similar resources

The EROP-Moscow oligopeptide database

Natural oligopeptides may regulate nearly all vital processes. To date, the chemical structures of nearly 6000 oligopeptides have been identified from >1000 organisms representing all the biological kingdoms. We have compiled the known physical, chemical and biological properties of these oligopeptides--whether synthesized on ribosomes or by non-ribosomal enzymes--and have constructed an intern...


Order and maximum incorporation of N-acetyl-D-galactosamine into threonine residues of MUC2 core peptide with microsome fraction of human-colon-carcinoma LS174T cells.

Mucin 2 (MUC2) is the major intestinal mucin. O-glycans are attached to MUC2 in a potentially diverse arrangement, which is crucial for their interaction with endogeneous and exogeneous lectins. In the present report, five oligopeptides [PTTTPITTTT(K), ITTTTTVTPT(K), TVTPTPTPTG(K), PTPTGTQTPT(K) and TQTPTTTPIT(K)] corresponding to portions of the MUC2 tandem repeat domain were synthesized, and ...


PhosphoBase, a database of phosphorylation sites: release 2.0

PhosphoBase contains information about phosphorylated residues in proteins and data about peptide phosphorylation by a variety of protein kinases. The data are collected from literature and compiled into a common format. The current release of PhosphoBase (October 1998, version 2.0) comprises 414 phosphoprotein entries covering 1052 phosphorylatable serine, threonine and tyrosine residues. The ...


Spatial requirement for coupling of iodotyrosine residues to form thyroid hormones.

A linear random copolymer of tyrosine and lysine and two synthetic oligopeptides containing two tyrosine residues in addition to lysine residues give thyroid hormone (thyroxine and triodothyronine) residues in good yield upon enzymatic iodination with thyroid peroxidase. These synthetic peptides may serve as simple models for thyroglobulin, the protein in which biosynthesis of the thyroid hormo...


Concentration-Driven Assembly and Sol–Gel Transition of π-Conjugated Oligopeptides

Advances in supramolecular assembly have enabled the design and synthesis of functional materials with well-defined structures across multiple length scales. Biopolymer-synthetic hybrid materials can assemble into supramolecular structures with a broad range of structural and functional diversity through precisely controlled noncovalent interactions between subunits. Despite recent progress, th...



Journal:
  • BioTechniques

Volume 21, Issue 6

Pages: -

Publication date: 1996