Comprehensive comparison of in silico MS/MS fragmentation tools of the CASMI contest: database boosting is needed to achieve 93% accuracy
Abstract
In mass spectrometry-based untargeted metabolomics, rarely more than 30% of the compounds are identified. Without the true identity of these molecules it is impossible to draw conclusions about the biological mechanisms, pathway relationships and provenance of compounds. The only way at present to address this discrepancy is to use in silico fragmentation software to identify unknown compounds by comparing and ranking theoretical MS/MS fragmentations from target structures to experimental tandem mass spectra (MS/MS). We compared the performance of four publicly available in silico fragmentation algorithms (MetFragCL, CFM-ID, MAGMa+ and MS-FINDER) that participated in the 2016 CASMI challenge. We found that optimizing the use of metadata, weighting factors and the manner of combining different tools eventually defined the ultimate outcomes of each method. We comprehensively analysed how outcomes of different tools could be combined and reached a final success rate of 93% for the training data, and 87% for the challenge data, using a combination of MAGMa+, CFM-ID and compound importance information along with MS/MS matching. Matching MS/MS spectra against the MS/MS libraries without using any in silico tool yielded 60% correct hits, showing that the use of in silico methods is still important.
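The idea of combining tool outputs with weighting factors can be illustrated with a minimal sketch. It is not the authors' pipeline: the tool names (`tool_a`, `tool_b`), the `database_presence` term standing in for compound importance information, the min-max scaling, and all weights and scores are hypothetical values chosen only to show how a weighted consensus ranking could be computed.

```python
from typing import Dict, List, Tuple


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale one tool's candidate scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates tie, so give each the same full score.
        return {cand: 1.0 for cand in scores}
    return {cand: (s - lo) / (hi - lo) for cand, s in scores.items()}


def combine(
    tool_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return candidates sorted by the weighted sum of normalized scores."""
    combined: Dict[str, float] = {}
    for tool, scores in tool_scores.items():
        weight = weights.get(tool, 0.0)  # tools without a weight are ignored
        for cand, s in normalize(scores).items():
            combined[cand] = combined.get(cand, 0.0) + weight * s
    # Highest combined score first; ties broken by candidate name.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for three candidate structures (illustrative only).
tool_scores = {
    "tool_a": {"cand1": 0.2, "cand2": 0.9, "cand3": 0.5},
    "tool_b": {"cand1": 10.0, "cand2": 4.0, "cand3": 8.0},
    "database_presence": {"cand1": 1.0, "cand2": 0.0, "cand3": 0.0},
}
# Hypothetical weighting factors; in practice these would be tuned on training data.
weights = {"tool_a": 0.4, "tool_b": 0.3, "database_presence": 0.3}

ranking = combine(tool_scores, weights)
for cand, score in ranking:
    print(f"{cand}\t{score:.3f}")
```

In this toy example `cand2` is the top hit for `tool_a` alone, but the consensus ranks `cand1` first once the second tool and the database term are included, which is the kind of reordering that weighting and metadata can produce.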
Related articles
The Critical Assessment of Small Molecule Identification (CASMI): Challenges and Solutions
The Critical Assessment of Small Molecule Identification, or CASMI, contest was founded in 2012 to provide scientists with a common open dataset to evaluate their identification methods. In this article, the challenges and solutions for the inaugural CASMI 2012 are presented. The contest was split into four categories corresponding with tasks to determine molecular formula and molecular structu...
Critical Assessment of Small Molecule Identification 2016: automated methods
Background: The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest (www.casmi-contest.org) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for f...
CASMI—The Small Molecule Identification Process from a Birmingham Perspective
The Critical Assessment of Small Molecule Identification (CASMI) contest was developed to provide a systematic comparative evaluation of strategies applied for the annotation and identification of small molecules. The authors participated in eleven challenges in both category 1 (to deduce a molecular formula) and category 2 (to deduce a molecular structure) related to high resolution LC-MS data...
Metabolite Identification through Machine Learning — Tackling CASMI Challenge Using FingerID
Metabolite identification is a major bottleneck in metabolomics due to the number and diversity of the molecules. To alleviate this bottleneck, computational methods and tools that reliably filter the set of candidates are needed for further analysis by human experts. Recent efforts in assembling large public mass spectral databases such as MassBank have opened the door for developing a new gen...
iMet: A Network-Based Computational Tool To Assist in the Annotation of Metabolites from Tandem Mass Spectra.
Structural annotation of metabolites relies mainly on tandem mass spectrometry (MS/MS) analysis. However, approximately 90% of the known metabolites reported in metabolomic databases do not have annotated spectral data from standards. This situation has fostered the development of computational tools that predict fragmentation patterns in silico and compare these to experimental MS/MS spectra. ...