Metabolite identification through multiple kernel learning on fragmentation trees
Abstract
Motivation: Metabolite identification from tandem mass spectrometric data is a key task in metabolomics. Various computational methods have been proposed for the identification of metabolites from tandem mass spectra. Fragmentation tree methods explore the space of possible ways in which the metabolite can fragment, and base the metabolite identification on scoring of these fragmentation trees. Machine learning methods have been used to map mass spectra to molecular fingerprints; predicted fingerprints, in turn, can be used to score candidate molecular structures.
Results: Here, we combine fragmentation tree computations with kernel-based machine learning to predict molecular fingerprints and identify molecular structures. We introduce a family of kernels capturing the similarity of fragmentation trees, and combine these kernels using recently proposed multiple kernel learning approaches. Experiments on two large reference datasets show that the new methods significantly improve molecular fingerprint prediction accuracy. These improvements result in better metabolite identification, doubling the number of metabolites ranked at the top position of the candidate list.
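The idea of combining several kernels into one can be illustrated with a small sketch. This is a minimal example under stated assumptions, not the authors' implementation: it uses a simple centered kernel alignment heuristic as one possible multiple kernel learning scheme, the function names (`center_kernel`, `alignment`, `combine_kernels`) are hypothetical, and linear kernels on random vectors stand in for real fragmentation tree kernels. The fingerprint matrix `Y` is synthetic.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space: K_c = H K H, H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    """Centered alignment: <K1c, K2c>_F / (||K1c||_F * ||K2c||_F)."""
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    denom = np.linalg.norm(K1c) * np.linalg.norm(K2c)
    return float(np.sum(K1c * K2c) / denom) if denom > 0 else 0.0

def combine_kernels(kernels, Y):
    """Weight each kernel by its alignment with the target kernel Y Y^T.

    Assumption: a heuristic weighting chosen for illustration only.
    """
    target = Y @ Y.T
    weights = np.array([max(alignment(K, target), 0.0) for K in kernels])
    if weights.sum() == 0:
        # Fall back to a uniform combination if no kernel aligns with the target.
        weights = np.ones(len(kernels))
    weights = weights / weights.sum()
    combined = sum(w * K for w, K in zip(weights, kernels))
    return combined, weights

# Synthetic stand-ins (assumptions): 40 "spectra", 5 fingerprint bits in {-1, +1}.
rng = np.random.default_rng(0)
X_info = rng.normal(size=(40, 5))    # features related to the fingerprints
X_noise = rng.normal(size=(40, 5))   # features unrelated to the fingerprints
Y = np.sign(X_info)                  # toy fingerprint matrix

K_info = X_info @ X_info.T           # placeholder for an informative tree kernel
K_noise = X_noise @ X_noise.T        # placeholder for an uninformative kernel

combined, weights = combine_kernels([K_info, K_noise], Y)
print("kernel weights:", weights)
```

In this toy setup the kernel built from fingerprint-related features receives the larger weight. The combined matrix could then be passed to any kernel method that accepts a precomputed kernel to predict each fingerprint bit.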
Similar resources
Fragmentation trees for the structural characterisation of metabolites
Metabolite identification plays a crucial role in the interpretation of metabolomics research results. Due to its sensitivity and widespread implementation, a favourite analytical method used in metabolomics is electrospray mass spectrometry. In this paper, we demonstrate our results in attempting to incorporate the potentials of multistage mass spectrometry into the metabolite identification r...
The pipelined metabolite identification based on MS fragmentation
Structural characterization and identification of components of complex biological mixtures constitutes one of the central aspects of metabolomics. Metabolite identification is a challenging but essential task in studies of biological samples. Mass spectrometry, because of its high sensitivity and specificity, is widely and successfully used in analysis of biological samples. Identification of ...
Systematic metabolite identification using HPLC-MSn fragmentation trees and LC-MS-SPE-NMR
Using fragmentation trees and mass spectral trees for identifying unknown compounds in metabolomics.
Identification of unknown metabolites is the bottleneck in advancing metabolomics, leaving interpretation of metabolomics results ambiguous. The chemical diversity of metabolism is vast, making structure identification arduous and time consuming. Currently, comprehensive analysis of mass spectra in metabolomics is limited to library matching, but tandem mass spectral libraries are small compare...
CFM-ID: a web server for annotation, spectrum prediction and metabolite identification from tandem mass spectra
CFM-ID is a web server supporting three tasks associated with the interpretation of tandem mass spectra (MS/MS) for the purpose of automated metabolite identification: annotation of the peaks in a spectrum for a known chemical structure; prediction of spectra for a given chemical structure and putative metabolite identification--a predicted ranking of possible candidate structures for a target ...