A cautionary note for retrocopy identification: DNA-based duplication of intron-containing genes significantly contributes to the origination of single exon genes

Authors

  • Yong E. Zhang
  • Maria D. Vibranovski
  • Benjamin H. Krinsky
  • Manyuan Long
Abstract

MOTIVATION: Retrocopies are important genes in the genomes of almost all higher eukaryotes. However, the annotation of such genes is a non-trivial task. Intronless genes have often been considered to be retroposed copies of intron-containing paralogs. Such categorization relies on the implicit premise that alignable regions of the duplicates should be long enough to cover exon-exon junctions of the intron-containing genes, and thus intron loss events can be inferred. Here, we examined the alternative possibility that intronless genes could be generated by partial DNA-based duplication of intron-containing genes in the fruitfly genome.

RESULTS: By building pairwise protein-, transcript- and genome-level DNA alignments between intronless genes and their corresponding intron-containing paralogs, we found that alignments do not cover exon-exon junctions in 40% of cases and thus no intron loss could be inferred. For these cases, the candidate parental proteins tend to be partially duplicated, and intergenic sequences or neighboring genes are included in the intronless paralog. Moreover, we observed that it is significantly less likely for these paralogs to show inter-chromosomal duplication and testis-dominant transcription, compared to the remaining 60% of cases with evidence of clear intron loss (retrogenes). These lines of analysis reveal that DNA-based duplication contributes significantly to the 40% of cases of single exon gene duplication. Finally, we performed an analogous survey in the human genome and the result is similar, wherein 34% of the cases do not cover exon-exon junctions. Thus, genome annotation for retrogene identification should discard candidates without clear evidence of intron loss.

CONTACT: [email protected]; [email protected]
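The junction-coverage criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the transcript-coordinate convention, and the `min_flank` threshold are all assumptions made for illustration. Given the exon lengths of an intron-containing parental transcript and the interval of that transcript aligned to an intronless paralog, it reports which exon-exon junctions the alignment spans.

```python
from itertools import accumulate


def junction_positions(exon_lengths):
    """Return 0-based transcript coordinates where one exon ends and the next begins."""
    # Cumulative exon ends; the last value is the transcript end, not a junction.
    return list(accumulate(exon_lengths))[:-1]


def covered_junctions(exon_lengths, aln_start, aln_end, min_flank=10):
    """List junctions spanned by the half-open alignment interval [aln_start, aln_end).

    min_flank (hypothetical threshold) is the number of aligned bases required
    on each side of a junction before it counts as covered.
    """
    return [
        j
        for j in junction_positions(exon_lengths)
        if aln_start + min_flank <= j <= aln_end - min_flank
    ]


def classify(exon_lengths, aln_start, aln_end, min_flank=10):
    """Label a candidate by whether intron loss can be assessed at all."""
    hits = covered_junctions(exon_lengths, aln_start, aln_end, min_flank)
    return "junction covered" if hits else "no junction covered"


if __name__ == "__main__":
    exons = [100, 200, 150]  # hypothetical parental transcript with three exons
    print(classify(exons, 50, 250))   # alignment spans the first junction
    print(classify(exons, 110, 290))  # alignment lies within the second exon
```

Under these assumptions, a candidate whose alignment covers no junction gives no evidence of intron loss either way, which corresponds to the class of cases the abstract recommends excluding from retrogene annotation.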




Journal:
  • Bioinformatics

Volume 27, Issue 13

Publication date: 2011