Ranking of Prokaryotic Genomes Based on Maximization of Sortedness of Gene Lengths.

Authors

  • A Bolshoy
  • B Salih
  • I Cohen
  • T Tatarinova
Abstract

How variations in gene length depend on organismal evolution and adaptation is still an open question: some genes become longer than their predecessors while others become shorter, and the sizes of these two groups differ randomly from organism to organism. We propose to rank genomes according to the lengths of their genes, and then to look for associations between genome rank and various properties, such as growth temperature, nucleotide composition, and pathogenicity. This approach reveals evolutionary driving factors. The main purpose of this study is to test the effectiveness and robustness of several ranking methods. The evaluation criterion is the overall sortedness of the data. We demonstrate that all the methods considered give consistent results, and that Bubble Sort and Simulated Annealing achieve the highest sortedness. Bubble Sort is also considerably faster than Simulated Annealing.
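
The idea of ranking genomes by maximizing sortedness can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the sortedness measure, so the sketch assumes it is the fraction of pairwise gene-length comparisons, taken within the same gene family, that agree with the genome order. The function names `sortedness` and `bubble_rank` and the toy `lengths` table are hypothetical, and a missing gene is represented by `None`.

```python
from itertools import combinations

def _votes(row_a, row_b):
    """Count gene families where genome a is shorter / longer than genome b."""
    shorter = longer = 0
    for la, lb in zip(row_a, row_b):
        if la is None or lb is None or la == lb:
            continue  # gene missing in one genome, or a tie: no information
        if la < lb:
            shorter += 1
        else:
            longer += 1
    return shorter, longer

def sortedness(order, lengths):
    """Assumed measure: share of comparisons that agree with `order`."""
    concordant = total = 0
    for a, b in combinations(order, 2):  # a is ranked before b
        shorter, longer = _votes(lengths[a], lengths[b])
        concordant += shorter
        total += shorter + longer
    return concordant / total if total else 0.0

def bubble_rank(lengths):
    """Bubble-sort-style ranking: swap adjacent genomes while that helps.

    A swap is made only when more gene families say the first genome is
    longer than say it is shorter, so every swap strictly raises the
    sortedness and the loop must terminate.
    """
    order = list(lengths)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            shorter, longer = _votes(lengths[order[i]], lengths[order[i + 1]])
            if longer > shorter:
                order[i], order[i + 1] = order[i + 1], order[i]
                improved = True
    return order

# Toy table (invented for illustration): genome -> lengths of three gene families.
lengths = {
    "g1": [300, 900, 450],
    "g2": [100, 700, 500],
    "g3": [200, 800, 400],
}
ranking = bubble_rank(lengths)
print(ranking, round(sortedness(ranking, lengths), 3))
```

On this toy table the adjacent-swap pass raises the assumed sortedness from 3/9 for the input order to 7/9. A simulated-annealing variant would explore non-adjacent moves and occasionally accept worsening swaps, which is consistent with the abstract's remark that it is the slower of the two methods.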

Similar resources

Methods of Combinatorial Optimization to Reveal Factors Affecting Gene Length

In this paper we present a novel method for genome ranking according to gene lengths. The main outcomes described in this paper are the following: the formulation of the genome ranking problem, presentation of relevant approaches to solve it, and the demonstration of preliminary results from prokaryotic genomes ordering. Using a subset of prokaryotic genomes, we attempted to uncover factors aff...

Lengths of Orthologous Prokaryotic Proteins Are Affected by Evolutionary Factors

Proteins of the same functional family (for example, kinases) may have significantly different lengths. It is an open question whether such variation in length is random or it appears as a response to some unknown evolutionary driving factors. The main purpose of this paper is to demonstrate existence of factors affecting prokaryotic gene lengths. We believe that the ranking of genomes accordin...

Gene-Family Extension Measures and Correlations

The existence of multiple copies of genes is a well-known phenomenon. A gene family is a set of sufficiently similar genes, formed by gene duplication. In earlier works conducted on a limited number of completely sequenced and annotated genomes it was found that size of gene family and size of genome are positively correlated. Additionally, it was found that several atypical microbes deviated f...

Fast identification of gene clusters in prokaryotic genomes

The detection of gene clusters that are conserved in several genomes, in terms of gene proximity and gene content, has proved to be an invaluable tool in the comparative analysis of prokaryotic genomes. It has applications, for example, in predicting functional association between groups of genes or putative genome rearrangements. We propose an efficient algorithm for computing gene clusters, ...

A New Reporter Gene Technology: Opportunities and Perspectives

The paper summarizes the current status of reporter gene technology and its basics. Reporter gene technology is widely used to monitor cellular events associated with gene expression and signal transduction. Based upon the splicing of transcriptional control elements to a variety of reporter genes, it “reports” the effects of a cascade of signaling events on gene expression inside cells. ...

Journal:
  • Journal of data mining in genomics & proteomics

Volume 5, Issue 1

Pages: -

Publication date: 2014