mendelFix: a Perl script for checking Mendelian errors in high density SNP data of trio designs
Abstract
Here we present mendelFix, a Perl script for checking Mendelian errors in genome-wide SNP data of trio designs. The program takes 12-recoded PLINK PED and MAP files as input, calculates a series of summary statistics for Mendelian errors, sets offspring genotypes that present Mendelian inconsistencies to missing, and implements a simple procedure to infer missing genotypes from parental information. The program can be easily incorporated into any pipeline for family-based SNP data analysis and is distributed as free software under the GNU General Public License.
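The per-marker logic described in the abstract can be illustrated with a minimal Python sketch. This is not the mendelFix source (which is written in Perl); the function names are hypothetical, genotypes are assumed to be autosomal allele pairs coded `1`/`2` with `0` for missing (the PLINK 12-recoding), and the inference rule shown (fill in the child only when the parents allow exactly one genotype) is an assumption, since the abstract does not spell out the procedure.

```python
from itertools import product

MISSING = ("0", "0")  # PLINK's conventional code for a missing genotype


def is_missing(genotype):
    """A genotype is treated as missing if either allele is coded 0."""
    return "0" in genotype


def possible_offspring(father, mother):
    """Unordered genotypes a child can inherit: one allele from each parent."""
    return {tuple(sorted(pair)) for pair in product(father, mother)}


def is_mendelian_error(father, mother, child):
    """True if the child's genotype cannot arise from the two parents.

    Trios with any missing genotype are not scored (assumption).
    """
    if is_missing(father) or is_missing(mother) or is_missing(child):
        return False
    return tuple(sorted(child)) not in possible_offspring(father, mother)


def blank_error(father, mother, child):
    """Set an inconsistent offspring genotype to missing; keep it otherwise."""
    if is_mendelian_error(father, mother, child):
        return MISSING
    return tuple(child)


def infer_child(father, mother, child):
    """Fill in a missing child genotype only when the parents determine it.

    Assumed rule: both parents are typed and allow a single genotype,
    which happens when both are homozygous.
    """
    if not is_missing(child):
        return tuple(child)
    if is_missing(father) or is_missing(mother):
        return MISSING
    options = possible_offspring(father, mother)
    if len(options) == 1:
        return next(iter(options))
    return MISSING
```

For example, a `1 2` child of two `1 1` parents is flagged and blanked by `blank_error`, while `infer_child` resolves a missing child of `1 1` and `2 2` parents to `1 2` and leaves the child missing when a heterozygous parent makes the genotype ambiguous. Sex chromosomes would need separate handling that this sketch omits.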
Related resources
SNPLINK: multipoint linkage analysis of densely distributed SNP data incorporating automated linkage disequilibrium removal
SUMMARY SNPLINK is a Perl script that performs full genome linkage analysis of high-density single nucleotide polymorphism (SNP) marker sets. The presence of linkage disequilibrium (LD) between closely spaced SNP markers can falsely inflate linkage statistics. SNPLINK removes LD from the marker sets in an automated fashion before carrying out linkage analysis. SNPLINK can compute both parametri...
Error detection in SNP data by considering the likelihood of recombinational history implied by three-site combinations
MOTIVATION Errors in nucleotide sequence and SNP genotyping data are problematic when inferring haplotypes. Previously published methods for error detection in haplotype data make use of pedigree information; however, for many samples, individuals are not related by pedigree. This article describes a method for detecting errors in haplotypes by considering the recombinational history implied by...
Detecting SNPs and estimating allele frequencies in clonal bacterial populations by sequencing pooled DNA
SUMMARY Here, we present a method for estimating the frequencies of SNP alleles present within pooled samples of DNA using high-throughput short-read sequencing. The method was tested on real data from six strains of the highly monomorphic pathogen Salmonella Paratyphi A, sequenced individually and in a pool. A variety of read mapping and quality-weighting procedures were tested to determine th...
The struggle to find reliable results in exome sequencing data: filtering out Mendelian errors
Next-generation sequencing studies generate a large quantity of genetic data in a relatively cost- and time-efficient manner and provide an unprecedented opportunity to identify candidate causative variants that lead to disease phenotypes. A challenge to these studies is the generation of sequencing artifacts by current technologies. To identify and characterize the properties that distinguish f...
An analytic solution to single nucleotide polymorphism error-detection rates in nuclear families: implications for study design.
Recently, there has been increased interest in using Single Nucleotide Polymorphisms (SNPs) as a method for detecting genes for complex traits. SNPs are diallelic markers that have the potential to be inexpensively produced using chip technology. It has been suggested that SNPs will be beneficial in study designs that utilize trio data (father, mother, child). In our previous work, we calculate...