Decision Tree Algorithm–Generated Single-Nucleotide Polymorphism Barcodes of rbcL Genes for 38 Brassicaceae Species Tagging
Authors
Abstract
DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence longer than 1000 base pairs, which creates a computational burden. Although DNA barcodes were originally envisioned as straightforward species tags, their use for identification is rarely emphasized at present. Single-nucleotide polymorphism (SNP) association studies suggest that SNPs may be an ideal target for feature selection to discriminate between species. We hypothesize that SNP-based barcodes may be more effective than full-length DNA barcode sequences for species discrimination. To test this hypothesis, we evaluated a ribulose diphosphate carboxylase (rbcL) SNP barcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. During decision tree construction, these SNPs were used to set up decision rules that assign the sequences into 2 groups, level by level. After algorithm processing, 37 nodes and 31 loci were required to discriminate the 38 species. Finally, sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating the 38 Brassicaceae species, based on the decision tree-selected SNP pattern of the RSB method. Taken together, this study provides the rationale that the SNP aspect of the rbcL DNA barcode is a useful and effective sequence for tagging 38 Brassicaceae species.
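The level-by-level splitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four toy sequences, the species names, the function names, and the "most balanced split" rule for choosing a locus are all assumptions made for illustration (the abstract does not state which splitting criterion was used).

```python
from collections import Counter

def find_snp_columns(aligned):
    """Return indices of alignment columns where more than one base occurs."""
    length = len(next(iter(aligned.values())))
    return [i for i in range(length)
            if len({seq[i] for seq in aligned.values()}) > 1]

def build_tree(names, aligned, snp_cols):
    """Recursively split species into 2 groups using one SNP locus per node.

    Assumption: the locus giving the most balanced two-way split is chosen;
    the published method may use a different criterion.
    """
    if len(names) == 1:
        return names[0]  # leaf: a single species
    best = None
    for col in snp_cols:
        counts = Counter(aligned[n][col] for n in names)
        if len(counts) < 2:
            continue  # this locus does not vary within the current group
        base, size = counts.most_common(1)[0]
        imbalance = abs(2 * size - len(names))
        if best is None or imbalance < best[0]:
            best = (imbalance, col, base)
    if best is None:
        raise ValueError("sequences are identical at every SNP locus")
    _, col, base = best
    match = [n for n in names if aligned[n][col] == base]
    other = [n for n in names if aligned[n][col] != base]
    return {"locus": col, "base": base,
            "match": build_tree(match, aligned, snp_cols),
            "other": build_tree(other, aligned, snp_cols)}

def count_internal_nodes(tree):
    """Count decision nodes (leaves are plain species-name strings)."""
    if isinstance(tree, str):
        return 0
    return (1 + count_internal_nodes(tree["match"])
              + count_internal_nodes(tree["other"]))

def used_loci(tree):
    """Collect the set of SNP loci that appear in any decision node."""
    if isinstance(tree, str):
        return set()
    return {tree["locus"]} | used_loci(tree["match"]) | used_loci(tree["other"])

# Hypothetical toy alignment (not real rbcL data).
aligned = {
    "sp_A": "ACGTAC",
    "sp_B": "ACGTTC",
    "sp_C": "ATGTAC",
    "sp_D": "ATGTTG",
}

snp_cols = find_snp_columns(aligned)
tree = build_tree(sorted(aligned), aligned, snp_cols)
loci = sorted(used_loci(tree))
# The SNP barcode of a species is its bases at the tree-selected loci only.
barcodes = {name: "".join(seq[i] for i in loci) for name, seq in aligned.items()}

print("SNP columns:", snp_cols)
print("decision nodes:", count_internal_nodes(tree))
print("barcodes:", barcodes)
```

Because every decision node splits a group in two, a tree that separates n species always has n - 1 decision nodes, which is consistent with the 37 nodes reported for 38 species. In this toy case the tree needs only two of the three SNP columns, whereas the abstract reports that all 31 discovered loci were required.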
Similar resources
Single nucleotide polymorphism barcoding of cytochrome c oxidase I sequences for discriminating 17 species of Columbidae by decision tree algorithm
DNA barcodes are widely used in taxonomy, systematics, species identification, food safety, and forensic science. Most of the conventional DNA barcode sequences contain the whole information of a given barcoding gene. Most of the sequence information does not vary and is uninformative for a given group of taxa within a monophylum. We suggest here a method that reduces the amount of noninformati...
SNP Typing for Germplasm Identification of Amomum villosum Lour. Based on DNA Barcoding Markers
Amomum villosum Lour., produced from Yangchun, Guangdong Province, China, is a Daodi medicinal material of Amomi Fructus in traditional Chinese medicine. This herb germplasm should be accurately identified and collected to ensure its quality and safety in medication. In the present study, single nucleotide polymorphism typing method was evaluated on the basis of DNA barcoding markers to identif...
Applying DNA barcodes for identification of economically important species in Brassicaceae.
Brassicaceae is a large plant family of special interest; it includes many economically important crops, herbs, and ornamentals, as well as model organisms. The taxonomy of the Brassicaceae has long been controversial because of the poorly delimited generic boundaries and artificially circumscribed tribes. Despite great effort to delimitate species and reconstruct the phylogeny of Brassicaceae,...
High-Throughput Discovery of Chloroplast and Mitochondrial DNA Polymorphisms in Brassicaceae Species by ORG-EcoTILLING
Background: Information on polymorphic DNA in organelle genomes is essential for evolutionary and ecological studies. However, it is challenging to perform high-throughput investigations of chloroplast and mitochondrial DNA polymorphisms. In recent years, EcoTILLING stands out as one of the most universal, low-cost, and high-throughput reverse genetic methods, and the identification of natural g...
Efficient Identification of the Forest Tree Species in Aceraceae Using DNA Barcodes
Aceraceae is a large forest tree family that comprises many economically and ecologically important species. However, because interspecific and/or intraspecific morphological variations result from frequent interspecific hybridization and introgression, it is challenging for non-taxonomists to accurately recognize and identify the tree species in Aceraceae based on a traditional approach. DNA b...