Metabolites 2013, 3(2), 484-505; doi:10.3390/metabo3020484

Metabolite Identification through Machine Learning— Tackling CASMI Challenge Using FingerID

Shen, H.; Zamboni, N.; Heinonen, M.; Rousu, J.
1 Helsinki Institute for Information Technology HIIT; Department of Information and Computer Science, Aalto University, Konemiehentie 2, FI-02150 Espoo, Finland
2 Institute of Molecular Systems Biology, ETH Zürich, Wolfgang-Pauli Street 16, 8093 Zürich, Switzerland
3 IBISC, Université d'Evry-Val d'Essonne, Bâtiment IBGBI, 23 Bd de France, 91037 cedex Evry, France
* Author to whom correspondence should be addressed.
Received: 1 April 2013 / Revised: 24 May 2013 / Accepted: 30 May 2013 / Published: 6 June 2013

Metabolite identification is a major bottleneck in metabolomics due to the number and diversity of the molecules involved. To alleviate this bottleneck, computational methods and tools are needed that reliably filter the set of candidates before further analysis by human experts. Recent efforts in assembling large public mass spectral databases such as MassBank have opened the door to a new class of metabolite identification methods that rely on machine learning as the primary vehicle for identification. In this paper, we describe the machine learning approach used in FingerID, its application to the CASMI challenges, and some results that were not part of our challenge submission. In short, FingerID learns to predict molecular fingerprints from a large collection of MS/MS spectra, and uses the predicted fingerprints to retrieve and rank candidate molecules from a given large molecular database. Furthermore, we introduce a web server for FingerID, which was applied for the first time to the CASMI challenges. The challenge results show that the new machine learning framework produces competitive results on those challenge molecules that were found within the relatively restricted KEGG compound database. Additional experiments on the PubChem database confirm the feasibility of the approach even on a much larger database, although room for improvement remains.
Keywords: metabolite identification; molecular fingerprints; machine learning; FingerID
This is an open access article distributed under the Creative Commons Attribution License (CC BY) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
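
The retrieval step summarized in the abstract (a predicted fingerprint is used to rank candidate molecules from a database) can be illustrated with a minimal sketch. The fingerprints, the candidate names, and the scoring rule below (fraction of agreeing bits) are assumptions chosen for illustration; they are not taken from the paper and are not necessarily how FingerID scores candidates. The prediction step itself, which learns fingerprints from MS/MS spectra, is not shown.

```python
from typing import Dict, List, Tuple

# A molecular fingerprint is modelled here as a fixed-length tuple of 0/1 bits.
Fingerprint = Tuple[int, ...]


def match_score(predicted: Fingerprint, candidate: Fingerprint) -> float:
    """Fraction of bit positions where the two fingerprints agree.

    Assumption: this simple agreement score stands in for whatever
    scoring function the real system uses.
    """
    if len(predicted) != len(candidate):
        raise ValueError("fingerprints must have equal length")
    if not predicted:
        raise ValueError("fingerprints must not be empty")
    agree = sum(1 for p, c in zip(predicted, candidate) if p == c)
    return agree / len(predicted)


def rank_candidates(
    predicted: Fingerprint, candidates: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs, best score first; ties broken by name."""
    scored = [(name, match_score(predicted, fp)) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical predicted fingerprint and a toy candidate database.
predicted = (1, 0, 1, 1, 0, 0)
candidates = {
    "candidate_A": (1, 0, 1, 1, 0, 0),
    "candidate_B": (1, 1, 1, 0, 0, 0),
    "candidate_C": (0, 1, 0, 0, 1, 1),
}

ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setting the candidate whose fingerprint agrees with the prediction on every bit is ranked first. A real system would also need to account for how reliably each fingerprint bit can be predicted, which this sketch ignores.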

MDPI and ACS Style

Shen, H.; Zamboni, N.; Heinonen, M.; Rousu, J. Metabolite Identification through Machine Learning— Tackling CASMI Challenge Using FingerID. Metabolites 2013, 3, 484-505.

Metabolites, EISSN 2218-1989. Published by MDPI AG, Basel, Switzerland.