Phylogenic inference using alignment-free methods for applications in microbial community surveys using 16s rRNA gene

نویسندگان

  • Yifei Zhang
  • Alexander V Alekseyenko
چکیده

The diversity of microbiota is best explored by understanding the phylogenetic structure of the microbial communities. Traditionally, sequence alignment has been used for phylogenetic inference. However, alignment-based approaches come with significant challenges and limitations when massive amounts of data are analyzed. In the recent decade, alignment-free approaches have enabled genome-scale phylogenetic inference. Here we evaluate three alignment-free methods: ACS, CVTree, and Kr for phylogenetic inference with 16s rRNA gene data. We use a taxonomic gold standard to compare the accuracy of alignment-free phylogenetic inference with that of common microbiome-wide phylogenetic inference pipelines based on PyNAST and MUSCLE alignments with FastTree and RAxML. We re-simulate fecal communities from Human Microbiome Project data to evaluate the performance of the methods on datasets with properties of real data. Our comparisons show that alignment-free methods are not inferior to alignment-based methods in giving accurate and robust phylogenic trees. Moreover, consensus ensembles of alignment-free phylogenies are superior to those built from alignment-based methods in their ability to highlight community differences in low power settings. In addition, the overall running times of alignment-based and alignment-free phylogenetic inference are comparable. Taken together our empirical results suggest that alignment-free methods provide a viable approach for microbiome-wide phylogenetic inference.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Genetic variations of avian Pasteurella multocida as demonstrated by 16S-23S rRNA gene sequences comparison

Pasteurella multocida is known as an important heterogenic bacterial agent causes some severe diseases such as fowl cholera in poultry and haemorrhagic septicaemia in cattle and buffalo. A polymerase chain reaction (PCR) assay was developed using primers derived from conserved part of 16S-23S rRNA gene. The PCR amplified a fragment size of 0.7 kb using DNA from nine avian P. multocida  isolates...

متن کامل

Molecular Detection of Novel Genetic Variants Associated to Anaplasma ovis among Dromedary Camels in Iran

To the best of our knowledge, little information is available regarding the presence of Anaplasma species in camels in Iran. This study sought to investigate the presence of Anaplasma species by microscopy and polymerase chain reaction (PCR) assays in 100 healthy dromedaries (Camelus dromedarius) arriving for slaughter. The microscopic examination of Giemsa-stained blood films revealed that Ana...

متن کامل

CopyRighter: a rapid tool for improving the accuracy of microbial community profiles through lineage-specific gene copy number correction

BACKGROUND Culture-independent molecular surveys targeting conserved marker genes, most notably 16S rRNA, to assess microbial diversity remain semi-quantitative due to variations in the number of gene copies between species. RESULTS Based on 2,900 sequenced reference genomes, we show that 16S rRNA gene copy number (GCN) is strongly linked to microbial phylogenetic taxonomy, potentially under-...

متن کامل

Microbial Communities Can Be Described by Metabolic Structure: A General Framework and Application to a Seasonally Variable, Depth-Stratified Microbial Community from the Coastal West Antarctic Peninsula

Taxonomic marker gene studies, such as the 16S rRNA gene, have been used to successfully explore microbial diversity in a variety of marine, terrestrial, and host environments. For some of these environments long term sampling programs are beginning to build a historical record of microbial community structure. Although these 16S rRNA gene datasets do not intrinsically provide information on mi...

متن کامل

Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers

The recent introduction of massively parallel pyrosequencers allows rapid, inexpensive analysis of microbial community composition using 16S ribosomal RNA (rRNA) sequences. However, a major challenge is to design a workflow so that taxonomic information can be accurately and rapidly assigned to each read, so that the composition of each community can be linked back to likely ecological roles pl...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 12  شماره 

صفحات  -

تاریخ انتشار 2017