Metabolites

Editorial

2 pages, 161 KiB

Open Access Editorial

The Intersection of Metabolomics and Data Science

by Seongho Kim

Metabolites 2023, 13(8), 915; https://doi.org/10.3390/metabo13080915 - 4 Aug 2023

Viewed by 692

Abstract

Metabolomics generates a vast amount of data and heavily relies on data science for biological interpretation [...]

(This article belongs to the Special Issue Data Science in Metabolomics)

Research

14 pages, 2889 KiB

Open Access Article

Comparative Analysis of Binary Similarity Measures for Compound Identification in Mass Spectrometry-Based Metabolomics

by Seongho Kim, Ikuko Kato and Xiang Zhang

Metabolites 2022, 12(8), 694; https://doi.org/10.3390/metabo12080694 - 26 Jul 2022

Cited by 3 | Viewed by 1767

Abstract

Compound identification is a critical step in untargeted metabolomics. Its most important procedure is to calculate the similarity between experimental mass spectra and either predicted mass spectra or mass spectra in a mass spectral library. Unlike continuous similarity measures, binary similarity measures have not been assessed for their performance in compound identification, even though the well-known Jaccard similarity measure has been widely used without proper evaluation. The objective of this study is thus to evaluate the performance of binary similarity measures for compound identification in untargeted metabolomics. Fifteen binary similarity measures, including the well-known Jaccard, Dice, Sokal–Sneath, Cosine, and Simpson measures, were selected to assess their performance in compound identification using both electron ionization (EI) and electrospray ionization (ESI) mass spectra. Our theoretical evaluations show that the accuracy of compound identification is exactly the same among the Jaccard, Dice, 3W-Jaccard, Sokal–Sneath, and Kulczynski measures, between the Cosine and Hellinger measures, and between the McConnaughey and Driver–Kroeber measures; these results were confirmed in practice using mass spectral libraries. From the mass spectrum-based evaluation, we observed that the best-performing similarity measures were the McConnaughey and Driver–Kroeber measures for EI mass spectra and the Cosine and Hellinger measures for ESI mass spectra. The most robust similarity measure was the Fager–McGowan measure, which was the second-best-performing similarity measure for both EI and ESI mass spectra.
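As a rough illustration of what a binary similarity measure computes, the sketch below treats each spectrum as the set of m/z values at which a peak is present and ranks library candidates against a query. The function names, the toy peak lists, and the library entries are hypothetical and are not taken from the article. The sketch also shows one way to see why Jaccard and Dice can give identical identification results: Dice is a strictly increasing function of Jaccard, D = 2J / (1 + J), so both measures order the candidates in the same way.

```python
import math


def overlap_counts(query, reference):
    """Return (a, b, c): peaks shared, only in query, only in reference."""
    a = len(query & reference)
    b = len(query - reference)
    c = len(reference - query)
    return a, b, c


def jaccard(query, reference):
    a, b, c = overlap_counts(query, reference)
    total = a + b + c
    return a / total if total else 0.0


def dice(query, reference):
    a, b, c = overlap_counts(query, reference)
    total = 2 * a + b + c
    return 2 * a / total if total else 0.0


def cosine(query, reference):
    # Binary cosine (also known as the Ochiai coefficient).
    a, b, c = overlap_counts(query, reference)
    denom = math.sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0


def rank(query, library, measure):
    """Return library names sorted from most to least similar to the query."""
    return sorted(library, key=lambda name: measure(query, library[name]),
                  reverse=True)


# Hypothetical toy data: sets of nominal m/z values with a peak present.
query = {41, 43, 57, 71, 85}
library = {
    "A": {41, 43, 57, 71, 85, 99},
    "B": {41, 43, 57},
    "C": {43, 57, 100, 120},
}

jaccard_ranking = rank(query, library, jaccard)
dice_ranking = rank(query, library, dice)
```

Here `jaccard_ranking` and `dice_ranking` come out identical, which follows from the monotone relationship between the two measures. The binary cosine is included only for comparison; nothing in this sketch implies that it must agree with the other two on arbitrary data.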

(This article belongs to the Special Issue Data Science in Metabolomics)

15 pages, 1564 KiB

Open Access Article

Quantitative Comparison of Statistical Methods for Analyzing Human Metabolomics Data

by Mir Henglin, Brian L. Claggett, Joseph Antonelli, Mona Alotaibi, Gino Alberto Magalang, Jeramie D. Watrous, Kim A. Lagerborg, Gavin Ovsak, Gabriel Musso, Olga V. Demler, Ramachandran S. Vasan, Martin G. Larson, Mohit Jain and Susan Cheng

Metabolites 2022, 12(6), 519; https://doi.org/10.3390/metabo12060519 - 4 Jun 2022

Cited by 8 | Viewed by 2105

Abstract

Emerging technologies now allow for mass spectrometry-based profiling of thousands of small molecule metabolites (‘metabolomics’) in an increasing number of biosamples. While offering great promise for insight into the pathogenesis of human disease, standard approaches have not yet been established for statistically analyzing increasingly complex, high-dimensional human metabolomics data in relation to clinical phenotypes, including disease outcomes. To determine optimal approaches for analysis, we formally compare traditional and newer statistical learning methods across a range of metabolomics dataset types. In simulated and experimental metabolomics data derived from large population-based human cohorts, we observe that with an increasing number of study subjects, univariate compared to multivariate methods result in an apparently higher false discovery rate as represented by substantial correlation between metabolites directly associated with the outcome and metabolites not associated with the outcome. Although the higher frequency of such associations would not be considered false in the strict statistical sense, it may be considered biologically less informative. In scenarios wherein the number of assayed metabolites increases, as in measures of nontargeted versus targeted metabolomics, multivariate methods performed especially favorably across a range of statistical operating characteristics. In nontargeted metabolomics datasets that included thousands of metabolite measures, sparse multivariate models demonstrated greater selectivity and lower potential for spurious relationships. When the number of metabolites was similar to or exceeded the number of study subjects, as is common with nontargeted metabolomics analysis of relatively small cohorts, sparse multivariate models exhibited the most robust statistical power with more consistent results. These findings have important implications for metabolomics analysis in human disease.

(This article belongs to the Special Issue Data Science in Metabolomics)

23 pages, 2734 KiB

Open Access Article

Binary Simplification as an Effective Tool in Metabolomics Data Analysis

by Francisco Traquete, João Luz, Carlos Cordeiro, Marta Sousa Silva and António E. N. Ferreira

Metabolites 2021, 11(11), 788; https://doi.org/10.3390/metabo11110788 - 18 Nov 2021

Cited by 7 | Viewed by 2212

Abstract

Metabolomics aims to perform a comprehensive identification and quantification of the small molecules present in a biological system. Due to metabolite diversity in concentration, structure, and chemical characteristics, the use of high-resolution methodologies, such as mass spectrometry (MS) or nuclear magnetic resonance (NMR), is required. In metabolomics data analysis, suitable data pre-processing and pre-treatment procedures are fundamental, with subsequent steps aiming to highlight the significant biological variation between samples over background noise. Traditional data analysis focuses primarily on the comparison of the features’ intensity values. However, intensity data are highly variable between experimental batches, instruments, and pre-processing methods or parameters. The aim of this work was to develop a new pre-treatment method for MS-based metabolomics data, in the context of sample profiling and discrimination, that considers only the occurrence of spectral features, encoding feature presence as 1 and absence as 0. This “Binary Simplification” encoding (BinSim) was used to transform several benchmark datasets before the application of clustering and classification methods. The performance of these methods after the BinSim pre-treatment was consistently as good as, and often better than, their performance after different combinations of traditional, intensity-based pre-treatments. Binary Simplification is, therefore, a viable pre-treatment procedure that effectively simplifies metabolomics data-analysis pipelines.
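The presence/absence encoding described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name `binary_simplification`, the use of `None` for a feature that was not detected, and the choice to treat a zero intensity as absence are all assumptions made for the example.

```python
def binary_simplification(intensity_table):
    """Encode a samples-by-features intensity table as presence (1) / absence (0).

    Assumption for this sketch: a feature counts as present only when an
    intensity was recorded (not None) and is greater than zero.
    """
    return [
        [1 if (value is not None and value > 0) else 0 for value in row]
        for row in intensity_table
    ]


# Hypothetical toy table: two samples (rows) by three features (columns).
batch_1 = [
    [1200.0, None, 35.5],
    [0.0, 80.1, 7.0],
]
# The same samples as they might appear on an instrument with a tenfold
# different response: intensities change, occurrences do not.
batch_2 = [
    [None if v is None else v * 10 for v in row] for row in batch_1
]

encoded_1 = binary_simplification(batch_1)
encoded_2 = binary_simplification(batch_2)
```

Both batches yield the same encoded table, which illustrates why an occurrence-only representation is insensitive to intensity differences between batches or instruments. Clustering or classification would then operate on `encoded_1` in place of the intensity values.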

(This article belongs to the Special Issue Data Science in Metabolomics)

12 pages, 3395 KiB

Open Access Article

MStractor: R Workflow Package for Enhancing Metabolomics Data Pre-Processing and Visualization

by Luca Nicolotti, Jeremy Hack, Markus Herderich and Natoiya Lloyd

Metabolites 2021, 11(8), 492; https://doi.org/10.3390/metabo11080492 - 29 Jul 2021

Cited by 2 | Viewed by 2926

Abstract

Untargeted metabolomics experiments for characterizing complex biological samples, conducted with chromatography/mass spectrometry technology, generate large datasets containing very complex and highly variable information. Many data-processing options are available; however, both commercial and open-source solutions have limitations, such as vendor platform exclusivity and/or the need for familiarity with diverse programming languages. Data processing of untargeted metabolite data is a particular problem for laboratories that specialize in non-routine mass spectrometry analysis of diverse sample types across humans, animals, plants, fungi, and microorganisms. Here, we present MStractor, an R workflow package developed to streamline and enhance pre-processing of metabolomics mass spectrometry data and visualization. MStractor combines functions for molecular feature extraction with user-friendly dedicated GUIs for chromatographic and mass spectrometry (MS) parameter input, graphical quality-control outputs, and descriptive statistics. MStractor performance was evaluated through a detailed comparison with XCMS Online. The MStractor package is freely available on GitHub at the MetabolomicsSA repository.

(This article belongs to the Special Issue Data Science in Metabolomics)

16 pages, 3407 KiB

Open Access Article

The mwtab Python Library for RESTful Access and Enhanced Quality Control, Deposition, and Curation of the Metabolomics Workbench Data Repository

by Christian D. Powell and Hunter N.B. Moseley

Metabolites 2021, 11(3), 163; https://doi.org/10.3390/metabo11030163 - 12 Mar 2021

Cited by 8 | Viewed by 2781

Abstract

The Metabolomics Workbench (MW) is a public scientific data repository consisting of experimental data and metadata from metabolomics studies collected with mass spectrometry (MS) and nuclear magnetic resonance (NMR) analyses. MW has been constantly evolving: updating its ‘mwTab’ text file format, adding a JavaScript Object Notation (JSON) file format, implementing a REpresentational State Transfer (REST) interface, and nearly quadrupling the number of datasets hosted on the repository within the last three years. To keep up with the quickly evolving state of the MW repository, the ‘mwtab’ Python library and package have been continuously updated to mirror the changes in the ‘mwTab’ and JSONized formats. They now contain many enhancements, including methods for interacting with the MW REST interface, enhanced format validation features, and advanced features for parsing and searching for specific metabolite data and metadata. We used the enhanced format validation features to evaluate all available datasets in MW to facilitate improved curation and FAIRness of the repository. The ‘mwtab’ Python package is now officially released as version 1.0.1 and is freely available on GitHub and the Python Package Index (PyPI) under a Clear Berkeley Software Distribution (BSD) license with documentation available on ReadTheDocs.

(This article belongs to the Special Issue Data Science in Metabolomics)

18 pages, 4808 KiB

Open Access Feature Paper Article

Development of a Microfluidic Platform for Trace Lipid Analysis

by Andrew Davic and Michael Cascio

Metabolites 2021, 11(3), 130; https://doi.org/10.3390/metabo11030130 - 24 Feb 2021

Cited by 3 | Viewed by 1518

Abstract

The inherent trace quantity of primary fatty acid amides found in biological systems presents challenges for analysis and quantitation, requiring a highly sensitive detection system. The use of microfluidics provides a green sample preparation and analysis technique through small-volume fluidic flow through micron-sized channels embedded in a polydimethylsiloxane (PDMS) device. Microfluidics provides the potential of having a micro total analysis system where chromatographic separation, fluorescent tagging reactions, and detection are accomplished with no added sample handling. This study describes the development and the optimization of a microfluidic laser-induced fluorescence (LIF) analysis and detection system that can be used for the detection of ultra-trace levels of fluorescently tagged primary fatty acid amines. A PDMS microfluidic device was designed and fabricated to incorporate droplet-based flow. Droplet microfluidics have enabled on-chip fluorescent tagging reactions to be performed quickly and efficiently, with no additional sample handling. An optimized LIF optical detection system provided fluorescently tagged primary fatty acid amine detection at sub-fmol levels (436 amol). The use of this LIF detection provides unparalleled sensitivity, with detection limits several orders of magnitude lower than currently employed LC-MS techniques, and might be easily adapted for use as a complementary quantification platform for parallel MS-based omics studies.

(This article belongs to the Special Issue Data Science in Metabolomics)

19 pages, 1281 KiB

Open Access Article

Comprehensive Comparative Analysis of Local False Discovery Rate Control Methods

by Shin June Kim, Youngjae Oh and Jaesik Jeong

Metabolites 2021, 11(1), 53; https://doi.org/10.3390/metabo11010053 - 14 Jan 2021

Cited by 1 | Viewed by 2011

Abstract

Owing to advances in technology, data are becoming larger in scale and more complex, and analyzing such data requires more advanced techniques. For omics data from two different groups, the goal is often to find significant biomarkers that differ between the groups while controlling an error rate such as the false discovery rate (FDR). Over the last few decades, many methods that control the local false discovery rate have been developed, ranging from one-dimensional to k-dimensional FDR procedures. For this comparison study, we select three of them, each with unique and significant properties: in chronological order, Efron’s approach, Ploner’s approach, and Kim’s approach. The first is a one-dimensional approach, while the other two are two-dimensional. Furthermore, we consider two variants of Ploner’s approach. We compare the performance of these methods on both simulated and real data.
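For readers unfamiliar with the quantity being controlled, the one-dimensional local false discovery rate is usually defined through a two-group model: a test statistic z comes from a null density f0 with probability pi0 and from an alternative density f1 otherwise, and fdr(z) = pi0 * f0(z) / f(z), where f is the mixture density. The sketch below evaluates that definition directly. It is illustrative only: it assumes the mixture is known (a standard normal null, a normal alternative, and a fixed pi0), whereas the methods compared in the article estimate these quantities from the data. The function names, parameter values, and the 0.2 cutoff are assumptions made for the example.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def local_fdr(z, pi0=0.9, alt_mean=3.0, alt_sd=1.0):
    """Local FDR under a known two-group normal mixture (assumed parameters)."""
    null_part = pi0 * normal_pdf(z)                        # pi0 * f0(z)
    alt_part = (1 - pi0) * normal_pdf(z, alt_mean, alt_sd)  # (1 - pi0) * f1(z)
    return null_part / (null_part + alt_part)               # pi0 * f0(z) / f(z)


def select_features(z_scores, cutoff=0.2):
    """Return indices of features whose local FDR is at or below the cutoff."""
    return [i for i, z in enumerate(z_scores) if local_fdr(z) <= cutoff]


# Hypothetical z-scores for five features.
z_scores = [0.1, 1.2, 2.0, 3.5, 4.0]
selected = select_features(z_scores)
```

With these assumed parameters, statistics near zero receive a local FDR close to 1 and large positive statistics receive a value close to 0, so only the last two features fall under the cutoff. Two-dimensional procedures replace the single statistic z with a pair of statistics per feature, but the ratio-of-densities idea is the same.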

(This article belongs to the Special Issue Data Science in Metabolomics)

Review

32 pages, 902 KiB

Open Access Review

Mathematical Models for FDG Kinetics in Cancer: A Review

by Sara Sommariva, Giacomo Caviglia, Gianmario Sambuceti and Michele Piana

Metabolites 2021, 11(8), 519; https://doi.org/10.3390/metabo11080519 - 6 Aug 2021

Cited by 2 | Viewed by 2151

Abstract

Compartmental analysis is the mathematical framework for modelling tracer kinetics in dynamic positron emission tomography. This paper reviews how compartmental models are constructed and numerically optimized. Particular attention is given to identifiability and sensitivity issues and to the impact of complex physiological conditions on the mathematical properties of the models.
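To make the idea of a compartmental model concrete, the sketch below simulates the standard two-compartment description of FDG, in which free tracer in tissue (C_f) exchanges with blood and is phosphorylated into a metabolized pool (C_p): dC_f/dt = k1*C_i - (k2 + k3)*C_f + k4*C_p and dC_p/dt = k3*C_f - k4*C_p, where C_i is the input (blood) concentration. This is a minimal illustration, not code from the review: the function name, the constant input function, the rate constants, and the forward Euler integrator are all assumptions chosen for simplicity.

```python
def simulate_two_compartment(input_fn, k1, k2, k3, k4, t_end, dt=0.01):
    """Integrate the two-compartment tracer model with forward Euler.

    input_fn(t) gives the input concentration C_i at time t.
    Returns (times, free, metabolized) as lists of equal length.
    """
    c_free, c_met = 0.0, 0.0
    times, free, metabolized = [0.0], [c_free], [c_met]
    n_steps = int(round(t_end / dt))
    for step in range(n_steps):
        t = step * dt
        # Right-hand sides of the two ordinary differential equations.
        d_free = k1 * input_fn(t) - (k2 + k3) * c_free + k4 * c_met
        d_met = k3 * c_free - k4 * c_met
        c_free += dt * d_free
        c_met += dt * d_met
        times.append((step + 1) * dt)
        free.append(c_free)
        metabolized.append(c_met)
    return times, free, metabolized


# Hypothetical setup: constant unit input and irreversible trapping (k4 = 0).
times, free, metabolized = simulate_two_compartment(
    input_fn=lambda t: 1.0, k1=0.1, k2=0.2, k3=0.05, k4=0.0, t_end=60.0
)
```

With a constant input and k4 = 0, the free pool settles at k1 / (k2 + k3) times the input concentration while the metabolized pool keeps accumulating, which is the behaviour the equations predict. Fitting such a model to measured data means adjusting k1 to k4 so that the simulated curves match the observed ones; whether those parameters can be recovered uniquely is the identifiability question mentioned above.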

(This article belongs to the Special Issue Data Science in Metabolomics)

17 pages, 1166 KiB

Open Access Review

Amino Acid Metabolism in Apicomplexan Parasites

by Aarti Krishnan and Dominique Soldati-Favre

Metabolites 2021, 11(2), 61; https://doi.org/10.3390/metabo11020061 - 20 Jan 2021

Cited by 26 | Viewed by 3696

Abstract

Obligate intracellular pathogens have coevolved with their host, leading to clever strategies to access nutrients, to combat the host’s immune response, and to establish a safe niche for intracellular replication. The host, on the other hand, has also developed ways to restrict the replication of invaders by limiting access to nutrients required for pathogen survival. In this review, we describe the recent advancements in both computational methods and high-throughput -omics techniques that have been used to study and interrogate metabolic functions in the context of intracellular parasitism. Specifically, we cover the current knowledge on the presence of amino acid biosynthesis and uptake within the Apicomplexa phylum, focusing on the human-infecting pathogens Toxoplasma gondii and Plasmodium falciparum. Given the complex multi-host lifecycle of these pathogens, we hypothesize that amino acids are made, rather than acquired, depending on the host niche. We summarize the stage specificities of enzymes revealed through transcriptomics data, the relevance of amino acids for parasite pathogenesis in vivo, and the role of their transporters. Targeting one or more of these pathways may lead to a deeper understanding of the specific contributions of biosynthesis versus acquisition of amino acids and to the design of better intervention strategies against the apicomplexan parasites.

(This article belongs to the Special Issue Data Science in Metabolomics)
