Who's Who? Detecting and Resolving Sample Anomalies in Human DNA Sequencing Studies with Peddy.

Authors

  • Brent S Pedersen
  • Aaron R Quinlan
Abstract

The potential for genetic discovery in human DNA sequencing studies is greatly diminished if DNA samples from a cohort are mislabeled, swapped, or contaminated or if they include unintended individuals. Unfortunately, the potential for such errors is significant since DNA samples are often manipulated by several protocols, labs, or scientists in the process of sequencing. We have developed a software package, peddy, to identify and facilitate the remediation of such errors via interactive visualizations and reports comparing the stated sex, relatedness, and ancestry to what is inferred from the individual genotypes derived from whole-genome (WGS) or whole-exome (WES) sequencing. Peddy predicts a sample's ancestry using a machine learning model trained on individuals of diverse ancestries from the 1000 Genomes Project reference panel. Peddy facilitates both automated and interactive, visual detection of sample swaps, poor sequencing quality, and other indicators of sample problems that, if left undetected, would inhibit discovery.
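
As a rough illustration of the comparison described above (stated versus genotype-inferred sex), the sketch below flags samples whose stated sex disagrees with a simple inference from X-chromosome heterozygosity. It is not peddy's implementation: the names `Sample`, `infer_sex`, and `find_sex_mismatches`, the `0.1` threshold, and the input layout are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical threshold: assumed for illustration only, not taken from peddy.
HET_RATIO_THRESHOLD = 0.1


@dataclass
class Sample:
    sample_id: str
    stated_sex: str      # "male" or "female", as recorded in the pedigree
    x_het_calls: int     # heterozygous genotype calls on non-pseudoautosomal chrX
    x_total_calls: int   # all called genotypes on non-pseudoautosomal chrX


def infer_sex(sample: Sample) -> str:
    """Infer sex from the fraction of heterozygous calls on chrX.

    Males carry a single X, so outside the pseudoautosomal regions they
    are expected to show almost no heterozygous calls.
    """
    if sample.x_total_calls == 0:
        return "unknown"
    het_ratio = sample.x_het_calls / sample.x_total_calls
    return "female" if het_ratio > HET_RATIO_THRESHOLD else "male"


def find_sex_mismatches(samples):
    """Return IDs of samples whose stated sex differs from the inferred sex."""
    mismatches = []
    for s in samples:
        inferred = infer_sex(s)
        if inferred != "unknown" and inferred != s.stated_sex:
            mismatches.append(s.sample_id)
    return mismatches


# Made-up counts for a three-person family; "dad" is stated male but has a
# female-like heterozygosity ratio, which could indicate a sample swap.
cohort = [
    Sample("kid", "male", 3, 1000),
    Sample("mom", "female", 310, 1000),
    Sample("dad", "male", 295, 1000),
]
print(find_sex_mismatches(cohort))
```

The same pattern (compute a genotype-derived estimate, then compare it with the recorded value) would apply to relatedness and ancestry, although those estimates need more than a single ratio.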

Related resources

Strategies and Clinical Applications of Next Generation Sequencing

Abstract: DNA sequencing is one of the most valuable techniques in molecular biology; it is used to determine the sequence of nucleotides in a DNA fragment. High-throughput sequencing, known as next-generation sequencing (NGS), revolutionized genomic research and molecular biology, so that the whole human genome can now be sequenced at low cost in several days. NGS technology is simi...

I-38: Chromosome Instability in The Cleavage Stage Embryo

Recently, we demonstrated chromosome instability (CIN) in human cleavage stage embryogenesis following in vitro fertilization (IVF). CIN does not necessarily undermine normal human development (i.e., when the remaining normal diploid blastomeres develop the embryo proper); however, it can spark a spectrum of conditions, including loss of conception, genetic disease and genetic variation development. To s...

Detecting and Correcting Contamination in Genetic Data by

DNA sample contamination is a serious problem in DNA sequencing studies, and may result in systematic genotype misclassification and false positive associations. While methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination ba...

A Review of DNA Sequencing Techniques (First, Second, and Third Generation)

Introduction: DNA sequencing is the most important technique in molecular biology, by which the order of the nucleotides in a piece of DNA can be identified. There are several different methods for sequencing DNA. DNA sequencing now has great importance in medical diagnostics and other medical fields. Some methods have been invented to speed up and increase the efficiency of the...

Journal:
  • American Journal of Human Genetics

Volume 100, Issue 3

Pages: -

Publication date: 2017