hg19K: addressing a significant lacuna in hg19-based variant calling
Related resources
hg19K: addressing a significant lacuna in hg19‐based variant calling
BACKGROUND: The hg19 assembly of the human genome is the most heavily annotated and most commonly used reference for making variant calls in individual genomes. The phase 3 report of the 1000 Genomes Project (1000G) established that many positions in the hg19 genome represent minor alleles. Since commonly used variant call methods are developed under the assumption that hg19 ref...
S6 Structural Variant Calling
S6.1 Genome-wide Structural Variant Detection. We used whole-genome shotgun paired-end sequence data generated with both Illumina and Applied Biosystems SOLiD platforms from the genomes of six canid samples (including an additional Basenji sequenced only to low coverage on the Illumina platform, but excluding the Chinese wolf) to estimate the fraction of the genome with segmental duplications. O...
FermiKit: assembly-based variant calling for Illumina resequencing data
FermiKit is a variant calling pipeline for Illumina whole-genome germline data. It de novo assembles short reads and then maps the assembly against a reference genome to call SNPs, short insertions/deletions and structural variations. FermiKit takes about one day to assemble 30-fold human whole-genome data on a modern 16-core server with 85 GB RAM at the peak, and calls variants in h...
Changepoint Analysis for Efficient Variant Calling
We present CAGe, a statistical algorithm which exploits high sequence identity between sampled genomes and a reference assembly to streamline the variant calling process. Using a combination of changepoint detection, classification, and online variant detection, CAGe is able to call simple variants quickly and accurately on the 90-95% of a sampled genome which differs little from the reference,...
Distributed Pipeline for Genomic Variant Calling
Due to recent advances in nucleotide sequencing technology, the cost of genomic sequencing is decreasing at a pace that vastly exceeds Moore’s law. The computational methods needed to process short read data are struggling to keep pace; indeed, current sequencing pipelines take days to execute for even a single human genome. In this work, we describe the Big Genomics Inference Engine (BIGGIE), ...
Journal
Journal title: Molecular Genetics & Genomic Medicine
Year: 2016
ISSN: 2324-9269
DOI: 10.1002/mgg3.251