Improved Bootstrap Confidence Limits in Large-Scale Phylogenies, with an Example from Neo-Astragalus (Leguminosae)
Abstract
Phylogenetic analyses of large data sets pose special challenges, including the apparent tendency for the bootstrap support for a clade to decline with increased taxon sampling of that clade. We document this decline in data sets with increasing numbers of taxa in Astragalus, the most species-rich angiosperm genus. Support for one subclade, Neo-Astragalus, declined monotonically with increased sampling of taxa inside Neo-Astragalus, irrespective of whether parsimony or neighbor-joining methods were used or of which particular heuristic search algorithm was used (although more stringent algorithms tended to yield higher support). Three possible explanations for this decline were examined, including (1) mistaken assignment of the most recent common ancestor of the taxon sample (and its bootstrap support) with the most recent common ancestor of the clade from which it was sampled; (2) computational limitations of heuristic search strategies; and (3) statistical bias in bootstrap proportions, especially that from random homoplasy distributed among taxa. The best explanation appears to be (3), although computational shortcomings (2) may explain some of the problem. The bootstrap proportion, as currently used in phylogenetic analysis, does not accurately capture the classical notion of confidence assessments on the null hypothesis of nonmonophyly, especially in large data sets. More accurate assessments of confidence as type I error levels (relying on iterated bootstrap methods) remove most of the monotonic decline in confidence with increasing numbers of taxa.
[Bootstrap; phylogeny reconstruction; species richness; taxon sampling.]
Astragalus L. is a vast assemblage of >2,500 species and 250+ sections (Lock and Simpson, 1991; Mabberly, 1997) in the angiosperm family Leguminosae (Fabaceae).
Distributed mainly in cool arid regions of the Northern Hemisphere and South America, Astragalus is especially diverse in southwest Asia (≈1,000–1,500 species), the Sino-Himalayan region (500 species), western North America (≈400–450 species), and along the Andes in South America (100–150 species). It is also diverse in Mediterranean climates along the west coasts of North and South America and in Europe. Many Astragalus species are narrow endemics, often preferentially distributed in marginal habitats or associated with specialized substrates. However, many temperate herbaceous angiosperm genera have similar ecological and biogeographic characteristics without displaying the species-richness of Astragalus. Astragalus, therefore, provides an opportunity for studying evolutionary processes on a nearly unique scale. At the same time, it represents a challenge to the prevailing taxon sampling strategy in phylogenetics, which, almost of necessity, relies on sampling a modest number of taxa (Kim, 1998).
Author for correspondence. E-mail: mjsanderson@ucdavis.edu
An important clue to phylogenetic relationships in Astragalus is a close correlation between chromosome number and geographic distribution. Of the ≈2,000 Old World species, all but 22 have euploid numbers based on n = 8 (among those assayed). Of the 500 New World species, all but 13 have numbers in an aneuploid series, with n = 11–15. Previous molecular phylogenetic analyses based on fairly small taxon samples supported the monophyly of the almost exclusively New World aneuploid species. This group, referred to as "Neo-Astragalus", is nested well within Old World euploid taxa. Wojciechowski et al. (1993) sequenced nuclear rDNA internal transcribed spacer (ITS) regions for 14 aneuploid and 12 euploid Astragalus and found bootstrap support for Neo-Astragalus at the 88% level.
Corroboration was found in independent chloroplast restriction fragment length polymorphism data sets (Sanderson and Doyle, 1993; Liston and Wheeler, 1994). Recently we completed a much more intensive molecular phylogenetic study in Astragalus, increasing the taxon sampling by fivefold (Wojciechowski et al., 1999). Once again, Neo-Astragalus is a clade, but the support, as measured by bootstrap proportions (BP; Felsenstein, 1985), declined to 64–73%, depending on the search strategy.
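To make the bootstrap proportion (BP) concrete, the following is a minimal Python sketch of the general idea: alignment columns are resampled with replacement, each replicate is analyzed, and the BP is the fraction of replicates in which the clade of interest is recovered. It is an illustration under stated assumptions, not the search strategy used in the study. The names bootstrap_proportion, supports_clade, and ab_clade_wins are hypothetical, and the tree-building step is replaced by a toy four-taxon parsimony rule that counts informative sites.

```python
import random

def bootstrap_proportion(alignment, supports_clade, n_replicates=1000, seed=0):
    """Fraction of bootstrap replicates in which supports_clade() is True.

    alignment: list of equal-length strings (rows = taxa, columns = sites).
    supports_clade: hypothetical callable standing in for "build a tree from
    this replicate and check whether the clade is monophyletic".
    """
    rng = random.Random(seed)
    n_sites = len(alignment[0])
    hits = 0
    for _ in range(n_replicates):
        # Resample site columns with replacement, keeping the original length.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        replicate = ["".join(seq[c] for c in cols) for seq in alignment]
        if supports_clade(replicate):
            hits += 1
    return hits / n_replicates

def ab_clade_wins(aln):
    """Toy four-taxon parsimony rule (an assumption for illustration only).

    The grouping (A,B) is taken as supported when more sites favor the split
    AB|CD than favor either alternative split, AC|BD or AD|BC.
    """
    a, b, c, d = aln
    ab = ac = ad = 0
    for w, x, y, z in zip(a, b, c, d):
        if w == x and y == z and w != y:
            ab += 1
        elif w == y and x == z and w != x:
            ac += 1
        elif w == z and x == y and w != x:
            ad += 1
    return ab > ac and ab > ad

if __name__ == "__main__":
    # Made-up toy alignment for four taxa A, B, C, D (not real data).
    toy = ["AACCGT", "AACTGT", "GGCCAT", "GGTCAT"]
    print(bootstrap_proportion(toy, ab_clade_wins, n_replicates=200))
```

In a real analysis the predicate would wrap a full tree search (parsimony or neighbor-joining) on each replicate, which is where the heuristic-search limitations discussed above come into play.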
Similar resources
Combinability of phylogenies and bootstrap confidence envelopes.
Recently, Lutzoni (1997) used a test described by Rodrigo et al. (1993; hereinafter referred to as the RKB3 test) to compare phylogenies constructed by using ribosomal RNA gene sequences (rDNA) and intergenic transcribed spacer sequences (ITS) from a group of related basidiomycete species. Lutzoni discussed the relationships amongst the species, but he also commented extensively on the RKB3 te...
Phylogenetic systematics of the tribe Millettieae (Leguminosae) based on chloroplast trnK/matK sequences and its implications for evolutionary patterns in Papilionoideae.
Phylogenetic relationships in the tribe Millettieae and allies in the subfamily Papilionoideae (Leguminosae) were reconstructed from chloroplast trnK/matK sequences. Sixty-two accessions representing 57 traditionally recognized genera of Papilionoideae were sampled, including 27 samples from Millettieae. Phylogenies were constructed using maximum parsimony and are well resolved and supported by...
IMPROVED BAT ALGORITHM FOR OPTIMUM DESIGN OF LARGE-SCALE TRUSS STRUCTURES
Determining the optimum design of large-scale structures is a difficult task. The great number of design variables, the largeness of the search space, and the need to control a great number of design constraints are major obstacles to performing optimum design of large-scale truss structures in a reasonable time. Meta-heuristic algorithms are known as one of the useful tools to d...
Confidence Limits on Phylogenies: an Approach Using the Bootstrap.
The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved ...