Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data

Authors

  • Michael Forster
  • Silke Szymczak
  • David Ellinghaus
  • Georg Hemmrich
  • Malte Rühlemann
  • Lars Kraemer
  • Sören Mucha
  • Lars Wienbrandt
  • Martin Stanulla
  • Andre Franke
Abstract

Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.
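
The abstract describes detecting virus/host chimeras from paired-end reads. As a rough illustration of that general idea only, the sketch below flags read pairs in which exactly one mate aligns to a virus reference and then counts how many such pairs support each host location. This is not the Vy-PER pipeline or its filter: the `MatePair` record, the function names, the 1,000 bp window, and the example reference names and coordinates are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class MatePair(NamedTuple):
    """Hypothetical, simplified record for one aligned read pair."""
    name: str
    ref1: str  # reference sequence the first mate aligned to
    pos1: int  # alignment position of the first mate
    ref2: str  # reference sequence the second mate aligned to
    pos2: int  # alignment position of the second mate


def chimera_candidates(pairs, virus_refs):
    """Keep pairs where exactly one mate lies on a virus reference."""
    return [p for p in pairs
            if (p.ref1 in virus_refs) != (p.ref2 in virus_refs)]


def cluster_by_host_window(candidates, virus_refs, window=1000):
    """Count candidate pairs per (host reference, window index, virus)."""
    counts = Counter()
    for p in candidates:
        if p.ref1 in virus_refs:
            virus, host, host_pos = p.ref1, p.ref2, p.pos2
        else:
            virus, host, host_pos = p.ref2, p.ref1, p.pos1
        counts[(host, host_pos // window, virus)] += 1
    return counts


# Toy input: made-up read pairs, not data from the paper.
pairs = [
    MatePair("r1", "chr1", 10_500, "HBV", 120),
    MatePair("r2", "chr1", 10_720, "HBV", 300),
    MatePair("r3", "chr2", 5_000, "chr2", 5_300),    # host/host pair, ignored
    MatePair("r4", "phiX174", 40, "chr7", 88_000),   # singleton candidate
]
virus_refs = {"HBV", "phiX174"}

candidates = chimera_candidates(pairs, virus_refs)
clusters = cluster_by_host_window(candidates, virus_refs)
for key, n in sorted(clusters.items()):
    print(key, n)
```

In this toy input, the two chr1/HBV pairs fall into the same window and support each other, while the chr7/phiX174 pair stands alone. A real pipeline would need further filtering to separate singletons of that kind from genuine integrations, which is the problem the abstract addresses.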

Similar resources

A robust approach for blind detection of balanced chromosomal rearrangements with whole-genome low-coverage sequencing.

Balanced chromosomal rearrangement (or balanced chromosome abnormality, BCA) is a common chromosomal structural variation. Next-generation sequencing has been reported to detect BCA-associated breakpoints with the aid of karyotyping. However, the complications associated with this approach and the requirement for cytogenetics information has limited its application. Here, we provide a whole-gen...

I-37: Establishing High Resolution Genomic Profiles of Single Cells Using Microarray and Next-Generation Sequencing Technologies

The nature and pace of genome mutation is largely unknown. Standard methods to investigate DNA-mutation rely on arraying or sequencing DNA from a population of cells, hence the genetic composition of individual cells is lost and de novo mutation in cell(s) is concealed within the bulk signal. We developed methods based on (SNP-) arraying and next-generation sequencing of single-cell whole-genom...

A Unified Framework for Delineation of Ambulatory Holter ECG Events via Analysis of a Multiple-Order Derivative Wavelet-Based Measure

In this study, a new long-duration holter electrocardiogram (ECG) major events detection-delineation algorithm is described which operates based on the false-alarm error bounded segmentation of a decision statistic with simple mathematical origin. To meet this end, first three-lead holter data is pre-processed by implementation of an appropriate bandpass finite-duration impulse response (FIR) f...

SLOPE: a quick and accurate method for locating non-SNP structural variation from targeted next-generation sequence data

MOTIVATION Targeted 'deep' sequencing of specific genes or regions is of great interest in clinical cancer diagnostics where some sequence variants, particularly translocations and indels, have known prognostic or diagnostic significance. In this setting, it is unnecessary to sequence an entire genome, and target capture methods can be applied to limit sequencing to important regions, thereby r...

Molecular Characterization of Transgene Integration by Next-Generation Sequencing in Transgenic Cattle

As the number of transgenic livestock increases, reliable detection and molecular characterization of transgene integration sites and copy number are crucial not only for interpreting the relationship between the integration site and the specific phenotype but also for commercial and economic demands. However, the ability of conventional PCR techniques to detect incomplete and multiple integrat...

Journal title:

Volume 5, issue

Pages -

Publication date: 2015