Sampling Properties of Estimators of Nucleotide Diversity at Discovered SNP Sites
Abstract
SNP sites are generally discovered by sequencing regions of the human genome in a limited number of individuals. As a result, SNP sites that are present in the region but carry rare variant nucleotides may go undetected. Consequently, estimates of nucleotide diversity obtained from assays of detected SNP sites are biased. In this research we present a statistical model of the SNP discovery process and use it to evaluate the extent of this bias. The model assumes a symmetric Beta distribution of variant frequencies at SNP sites, with an additional probability that there is no SNP at a given site. Under this model of allele frequency distributions at SNP sites, we show that nucleotide diversity is always underestimated. However, the bias does not appear to exceed 10–15% for the analyzed data. We find that the model is consistent with SNP statistics derived from new SNP data at the ATM, BLM, RQL and WRN gene regions. Applying the theory to these new SNP data, as well as to literature data at the LPL gene region, indicates that despite ascertainment biases, the observed differences in nucleotide diversity across these gene regions are real. This provides interesting evidence of heterogeneity in the rates of nucleotide substitution across the genome.
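The direction of the bias can be illustrated with a minimal sketch. It is not the authors' estimator: the abstract does not state the detection rule, so the code assumes that a SNP is discovered only when a panel of `m` sequenced chromosomes contains both alleles, and that per-site diversity is the heterozygosity 2x(1-x), where x is the variant frequency drawn from a symmetric Beta(a, a) distribution. The names `a`, `m`, `p_no_snp` and all function names are hypothetical.

```python
import math


def log_beta(p, q):
    """Natural log of the Beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def true_diversity(a, p_no_snp):
    """Expected per-site heterozygosity over all sites.

    Assumption: a site has no SNP with probability p_no_snp; otherwise
    x ~ Beta(a, a), for which E[2x(1-x)] = a / (2a + 1).
    """
    return (1.0 - p_no_snp) * a / (2.0 * a + 1.0)


def missed_fraction(a, m):
    """Fraction of diversity lost because some SNPs go undetected.

    Assumption: a SNP is missed when all m panel chromosomes carry the
    same allele, which has probability x**m + (1-x)**m. By symmetry,
    E[2x(1-x)(x^m + (1-x)^m)] = 4 * B(a+m+1, a+1) / B(a, a).
    """
    missed = 4.0 * math.exp(log_beta(a + m + 1, a + 1) - log_beta(a, a))
    return missed / (a / (2.0 * a + 1.0))


def detected_diversity(a, p_no_snp, m):
    """Expected diversity when only discovered SNP sites are counted."""
    return true_diversity(a, p_no_snp) * (1.0 - missed_fraction(a, m))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    for m in (2, 5, 10, 20):
        print(m, missed_fraction(1.0, m))
```

Under these assumptions the estimate from discovered sites can never exceed the true value, and the shortfall shrinks as the discovery panel grows. The probability of no SNP cancels out of the relative bias, so it affects only the absolute level of diversity.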
Related articles
Application of single-nucleotide polymorphism (SNP) as a molecular marker in the study of genetic diversity of aquatic populations
Genetic diversity is one of the important and essential characteristics of any population for its survival. The study of genetic variation in different populations of aquatic organisms is of particular importance in order to protect, stabilize and manage their stocks. Based on studies conducted in recent years, molecular markers have proven that they can be used as indicators of the genetic div...
Designing specific primers for studying single-nucleotide polymorphism (SNP) in genes and determining their function
There is a great deal of information about gene sequences, but their functions are still unknown. To fill the gap between the structure and function of these sequences, many reverse-genetics studies have been carried out. The current experiment studies how to design gene-specific primers that can detect single-nucleotide diversity and its impact on gene function. This research was conducted at International...
Correcting estimators of theta and Tajima's D for ascertainment biases caused by the single-nucleotide polymorphism discovery process.
Most single-nucleotide polymorphism (SNP) data suffer from an ascertainment bias caused by the process of SNP discovery followed by SNP genotyping. The final genotyped data are biased toward an excess of common alleles compared to directly sequenced data, making standard genetic methods of analysis inapplicable to this type of data. We here derive corrected estimators of the fundamental populat...
Evolutionary features of 8K (KDa) silencing suppressor protein of Potato mop-top virus
The cysteine-rich 8K protein of Potato mop-top virus (PMTV) suppresses host RNA silencing. In this study, an evolutionary analysis of 8K sequences of PMTV isolates was carried out on the basis of nucleotide and amino acid sequences. Twenty-one positively selected sites were identified in 8K coding regions. Recombination events were found in the 8K of PMTV isolates with a rate of 1.8. In total, 30 haplotyp...
Nucleotide diversity analysis highlights functionally important genomic regions
We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserv...
Publication date: 2003