Estimation of genotype imputation accuracy using reference populations with varying degrees of relationship and marker density panel
Authors
Abstract:
Genotype imputation from low-density to high-density SNP chips is an important step before applying genomic selection, because denser chips can provide more reliable genomic predictions. In the current research, the accuracy of genotype imputation from low- and moderate-density panels (5K and 50K) to a high-density panel was assessed in purebred and crossbred populations. The simulated populations included two purebred populations (lines A and B) and two crossbred populations (cross and backcross). Three scenarios were assessed for selecting the subset of reference animals used to impute the ungenotyped loci of animals in the validation set: 1) animals with a high relationship to the validation set, 2) randomly selected animals, and 3) animals with high inbreeding. Individuals in the validation set were imputed from 5K and 50K to a marker density of 777K with FImpute software, using the various combinations of reference sets. Imputation accuracy was calculated by two methods: the Pearson correlation coefficient (PCC) and the concordance rate (CR). The results showed that imputation accuracy was higher in the purebred lines A and B than in the cross and backcross populations. When the reference set was selected based on high relationship, imputation accuracy in lines A and B was the highest, and the difference between imputation from 5K and from 50K to 777K was smaller than with the other subset selection methods. In the crossbred population, with imputation from 50K to 777K, imputation accuracy was highest when the reference population was selected randomly (0.98 and 0.97 for PCC and CR, respectively). In the backcross population, imputation accuracy was lowest when the reference set was selected according to high inbreeding, which could result from the lower homozygosity in these populations.
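The two accuracy measures named in the abstract can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts (0, 1, 2) and that both measures are computed over the imputed loci of one animal; the function names and the toy data are hypothetical and are not taken from the study or from FImpute.

```python
from math import sqrt


def concordance_rate(true_genotypes, imputed_genotypes):
    """Proportion of loci where the imputed genotype equals the true one."""
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("genotype lists must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def pearson_correlation(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed allele counts."""
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("genotype lists must be non-empty and of equal length")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if var_t == 0 or var_i == 0:
        # The correlation is undefined when either vector is constant.
        return float("nan")
    return cov / sqrt(var_t * var_i)


# Toy example (assumed data): eight loci, one of them imputed incorrectly.
true_gt = [0, 1, 2, 2, 1, 0, 1, 2]
imputed_gt = [0, 1, 2, 1, 1, 0, 1, 2]

cr = concordance_rate(true_gt, imputed_gt)
pcc = pearson_correlation(true_gt, imputed_gt)
print(f"CR = {cr:.3f}, PCC = {pcc:.3f}")
```

The concordance rate counts only exact matches, while the correlation also reflects how far a wrong call lies from the true allele count, which is one reason the two measures can differ for the same animal.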
similar resources
Estimation of genomic prediction accuracy from reference populations with varying degrees of relationship
Genomic prediction is emerging in a wide range of fields including animal and plant breeding, risk prediction in human precision medicine and forensic. It is desirable to establish a theoretical framework for genomic prediction accuracy when the reference data consists of information sources with varying degrees of relationship to the target individuals. A reference set can contain both close a...
Marker Genotype Imputation in a Low-Marker-Density Panel with a High-Marker-Density Reference Panel: Accuracy Evaluation in Barley Breeding Lines
We evaluated a strategy in which the scores of markers untyped in a low-density experimental panel were imputed on the basis of data from a high-density reference panel, in its application to whole-genome genotyping of barley (Hordeum vulgare L.) breeding lines. Using a barley core set consisting of 98 lines genotyped with 3205 markers (high-density reference panel), we imputed marker scores un...
Genotype imputation accuracy with different reference panels in admixed populations
Genome-wide association studies have successfully identified common variants that are associated with complex diseases. However, the majority of genetic variants contributing to disease susceptibility are yet to be discovered. It is now widely believed that multiple rare variants are likely to be associated with complex diseases. Using custom-made chips or next-generation sequencing to uncover ...
Effect of Reference Population Size and Imputation Methods on the Accuracy of Imputation in Pure and Mixed Populations
Imputation, as a method of converting low-density chips to high-density chips, has been introduced to increase the accuracy of genomic selection in animals. In the current study, to investigate imputation accuracy, three populations of mixed (scenario 1), pure (scenario 2) and mixed + pure (scenario 3) were simulated using QMSim. Two methods of imputation including Beagle and FImpute were used fo...
Genotype Imputation Reference Panel Selection Using Maximal Phylogenetic Diversity
The recent dramatic cost reduction of next-generation sequencing technology enables investigators to assess most variants in the human genome to identify risk variants for complex diseases. However, sequencing large samples remains very expensive. For a study sample with existing genotype data, such as array data from genome-wide association studies, a cost-effective approach is to sequence a s...
Genotype-imputation accuracy across worldwide human populations.
A current approach to mapping complex-disease-susceptibility loci in genome-wide association (GWA) studies involves leveraging the information in a reference database of dense genotype data. By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and tested for disease association. This imputation strategy has ...
Journal title
volume 7 issue 1
pages 45-53
publication date 2019-10-14