Investigating extended regulatory regions of genomic DNA sequences

Authors

  • Vladimir N. Babenko
  • P. S. Kosarev
  • Oleg V. Vishnevsky
  • Victor G. Levitsky
  • V. V. Basin
  • Anatoly S. Frolov
Abstract

MOTIVATION: Despite the growing volume of data on primary nucleotide sequences, the function of regulatory regions remains a major puzzle. Numerous recognition programs that consider a diversity of properties of regulatory regions have been developed. The system proposed here reveals specific contextual, conformational and physico-chemical properties through analysis of extended DNA regions.

RESULTS: The Internet-accessible computer system RegScan, designed to analyse the extended regulatory regions of eukaryotic genes, has been developed. The computer system comprises the following software: (i) classification programs that divide a set of promoters into TATA-containing and TATA-less promoters and into promoters with and without CpG islands; (ii) programs for constructing (a) nucleotide frequency profiles, (b) sequence complexity profiles and (c) profiles of conformational and physico-chemical properties; (iii) a program for constructing sets of degenerate oligonucleotide motifs of a specified length; and (iv) a program that searches for and visualises repeats in nucleotide sequences. The system has allowed us to demonstrate the following characteristic patterns of vertebrate promoter regions: the TATA box region is flanked by regions with an increased G+C content and increased bending stiffness, the TATA box content is asymmetric, and promoter regions are saturated with both direct and inverted repeats.

AVAILABILITY: The computer system RegScan is available via the Internet at http://www.mgs.bionet.nsc.ru/Systems/RegScan and http://www.cbil.upenn.edu/mgs/systems/regscan/.
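The abstract does not describe how RegScan computes its profiles, so the following is only a minimal sketch of two quantities it mentions: a sliding-window G+C content profile and a CpG observed/expected ratio, a commonly used indicator when screening for CpG islands. The function names, the window and step parameters, and the choice of the observed/expected measure are assumptions made for illustration, not RegScan's actual implementation.

```python
def gc_profile(seq, window=20, step=1):
    """Return the G+C fraction of each sliding window along seq.

    Hypothetical helper: window and step are illustrative defaults,
    not values taken from RegScan.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(gc / window)
    return profile


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, where the
    ratio is undefined.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


if __name__ == "__main__":
    toy = "GGCCAATT"  # toy sequence, not real promoter data
    print(gc_profile(toy, window=4, step=2))
    print(cpg_obs_exp("CGCG"))
```

A profile of this kind, computed over sequences aligned on the TATA box, is the sort of data in which a flanking rise in G+C content would show up as elevated values on either side of the box.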

Similar resources

A statistical model for locating regulatory regions in genomic DNA.

In addition to genes, chromosomal DNA contains sequences that serve as signals for turning on and off gene expression. These signals are thought to be distributed as clusters in the regulatory regions of genes. We develop a Bayesian model that views locating regulatory regions in genomic DNA as a change-point problem, with the beginning of regulatory and non-regulatory regions corresponding to ...

A Concept to Use Spatial Knowledge of Genomic Structures to Support the Alignment of Bacterial Genomic DNA Sequences

There exists a strong need for a qualitative spatial knowledge representation that is capable of modeling and of reasoning about knowledge of genomic structures. The reasoning component can help to improve the biological plausibility of existing alignment approaches for bacterial genomic DNA sequences, which at present relies exclusively on similarity comparisons on the string level. Genomic DN...

Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data

Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computati...

A Simple Genome Walking Strategy to Isolate Unknown Genomic Regions Using Long Primer and RAPD Primer

Background: Genome walking is a DNA-cloning methodology that is used to isolate unknown genomic regions adjacent to known sequences. However, the existing genome-walking methods have their own limitations. Objectives: Our aim was to provide a simple and efficient genome-walking technology. Material and Methods: In this paper, we dev...

Duplex destabilization in superhelical DNA is predicted to occur at specific transcriptional regulatory regions.

Analytic methods that accurately calculate the extent of duplex destabilization induced in each base-pair of a DNA molecule by superhelical stresses are used to analyze several genomic DNA sequences. Sites predicted to be susceptible to stress-induced duplex destabilization (SIDD) are found to be closely associated with specific transcriptional regulatory regions. Operators within the promoters...

Journal:
  • Bioinformatics

Volume 15, Issue 7-8

Pages: -

Publication date: 1999