**MolNetEnhancer: Enhanced Molecular Networks by Integrating Metabolome Mining and Annotation Tools**

**Madeleine Ernst 1,2,\*, Kyo Bin Kang 1,3, Andrés Mauricio Caraballo-Rodríguez 1, Louis-Felix Nothias 1, Joe Wandy 4, Christopher Chen 1, Mingxun Wang 1, Simon Rogers 5, Marnix H. Medema 6, Pieter C. Dorrestein 1,7,8 and Justin J.J. van der Hooft 1,6,\***


Received: 29 May 2019; Accepted: 11 July 2019; Published: 16 July 2019

**Abstract:** Metabolomics has started to embrace computational approaches for chemical interpretation of large data sets. Yet, metabolite annotation remains a key challenge. Recently, molecular networking and MS2LDA emerged as molecular mining tools that find molecular families and substructures in mass spectrometry fragmentation data. Moreover, in silico annotation tools obtain and rank candidate molecules for fragmentation spectra. Ideally, all structural information obtained and inferred from these computational tools could be combined to increase the resulting chemical insight one can obtain from a data set. However, integration is currently hampered as each tool has its own output format and efficient matching of data across these tools is lacking. Here, we introduce MolNetEnhancer, a workflow that combines the outputs from molecular networking, MS2LDA, in silico annotation tools (such as Network Annotation Propagation or DEREPLICATOR), and the automated chemical classification through ClassyFire to provide a more comprehensive chemical overview of metabolomics data whilst at the same time illuminating structural details for each fragmentation spectrum. We present examples from four plant and bacterial case studies and show how MolNetEnhancer enables the chemical annotation, visualization, and discovery of the subtle substructural diversity within molecular families. We conclude that MolNetEnhancer is a useful tool that greatly assists the metabolomics researcher in deciphering the metabolome through combination of multiple independent in silico pipelines.

**Keywords:** chemical classification; in silico workflows; metabolite annotation; metabolite identification; metabolome mining; molecular families; networking; substructures

#### **1. Introduction**

Metabolomics has matured into a research field generating increasing amounts of metabolome profiles of complex metabolite mixtures aiming to provide biochemical insights. Mass spectrometry

has become the workhorse of metabolomics and typical untargeted experiments currently result in qualitative and semiquantitative information on several thousands of molecular ions across tens to hundreds of samples. Technical advances in the last decade have allowed researchers to fragment increasing amounts of mass peaks that result in mass fragmentation spectra (MS/MS or MS2). Metabolite annotation and identification tools have benefited from these advances as now more MS2 spectra per sample can be queried in reference libraries in order to find candidate structures or submitted to in silico tools that propose a putative structure [1–9].

Despite these tremendous advances, a key challenge remaining for metabolomics researchers is to biochemically interpret large-scale untargeted metabolomics studies due to the complexity of the metabolomes represented by mass fragmentation spectra to which actual chemical structures need to be assigned, and for which reference spectra are not available. In biological samples, many metabolites share molecular substructures and form structurally related molecular families (MFs) of various chemical classes, which has inspired metabolome mining tools exploiting these biochemical relationships. Based on the assumption that structurally similar molecules (analogs) generate similar mass spectrometry fragmentation spectra, one can group analogs by comparing their fragmentation spectra resulting in the construction of molecular families. To do this on a larger scale, computational tools have been developed such as molecular networking (MN) [7]. However, to actually annotate structural information additional sources are usually needed such as library matches, candidate structures from libraries or chemical class annotations.

Indeed, since the molecular networking approach was proposed in 2012 [10], numerous complementary metabolome mining workflows as well as annotation and classification tools have been introduced including SIRIUS [3], CSI:FingerID [4], MetFusion [11], MetFamily [12], and many others of which some also use molecular networks as basis [1,2,7,8,13–24] and their combined use for natural product discovery was very recently reviewed [25]. Where tandem mass spectral molecular networking efficiently can group molecular features in molecular families [10], MS2LDA can discover substructures, not only based on common fragment peaks but also common neutral losses, which can aid in further annotation of subfamilies and shared modifications [14]. These metabolome mining tools typically take MS/MS spectra as input, such as the open formats Mascott Generic Format (MGF), the mzML, or mzXML format, and generate tables where a fragmented mass feature is linked to other fragmented mass features or substructure patterns. Reference fragmentation spectra in public repositories are still very few. Thus, on average only 2–5% percent of MS2 spectra acquired in a typical LC–MS/MS experiment can be matched to known molecules [26]. Complementary to library matching, in silico tools such as Network Annotation Propagation (NAP) [8], DEREPLICATOR [1], VarQuest [2], or SIRIUS+CSI:FingerID [4] predict fragmentation spectra in silico from known structures and allow for effective searching in chemical databases for candidate structures. These metabolome annotation tools also take MS/MS spectra as input and typically use precursor masses to find candidate structures in compound databases followed by a ranking of those structures based on the similarity of the predicted and experimental MS/MS data. The output is typically a table with candidate structures found for each mass feature and associated score. These tools typically differ in the compound databases they use to query for candidate structures, or the processing of mass spectrometry data. For example, SIRIUS+CSI:FingerID first builds annotated fragmentation trees before searching molecular structures in large compound databases. DEREPLICATOR and VarQuest are annotation tools that match structures from a large database of Peptidic Natural Products to MS/MS spectra, whereby DEREPLICATOR looks for exact matches and VarQuest also allows for one modified amino acid. It is important to realize that each tool has its own set of parameters that will affect the number of annotated features.

The outputted structural information for each mass feature can be mapped on a molecular network, for example, to show for which mass features library matches or in silico predicted structural matches are available. The recently introduced Network Annotation Propagation (NAP) also exploits the network topology to rerank candidate structure lists based on neighboring matches within molecular

families [8]. Furthermore, when using multiple annotation tools, the structural information they provide may support each other increasing confidence in the annotation.

To assess whether molecular families are of particular interest for your research question, knowing their chemical class may provide sufficient information. The recently proposed ClassyFire tool [16] takes molecular descriptors as SMILES or InchiKeys as input and outputs hierarchical chemical ontology terms. Thus, the candidate structures outputted for each mass feature by the metabolome annotation tools mentioned above can now be automatically chemically classified. When that is done at larger scale for an entire molecular family, one can combine those chemical class terms and assess whether particular terms are enriched.

Taken together, all these recent developments enable the discovery of relations between millions of spectra and the listing of candidate structures from various spectral libraries or alternatively from compound libraries using in silico approaches.

Whilst each of those tools produce useful structural information, their combined application has been hampered by the use of different file formats, platforms, and the challenge to match molecular features across the outputs of these tools. We postulate that whilst each tool provides complementary insights, their combined use allows an increased level of biochemical interpretation, i.e., the sum becomes greater than the individual parts. Furthermore, it would be practically advantageous to combine all these results in one place. We have previously described the integration of Mass2Motifs and chemical classifications with molecular networks to assess the chemical diversity within a subset of species of the plant genus *Euphorbia* [27] and the plant family Rhamnaceae [28]. However, in those studies, integration was achieved using custom in-house scripts in R, hampering adoption by the community. Moreover, the results of the peptide annotation tools DEREPLICATOR and VarQuest were not included in those custom scripts.

Here, we introduce MolNetEnhancer a software package available in Python and R that unites the output of many of the above-mentioned metabolome mining and annotation tools (GNPS molecular networking, MS2LDA substructure discovery, and in silico annotation tools) independent of what dataset it processes, thus making the algorithm accessible in an easy-to-use format to the community (Figure 1). MolNetEnhancer discovers molecular families (MFs), subfamilies, and subtle structural differences between family members. The workflow enhances the currently available molecular networking methods based on either MS-Cluster [29] (classical) or MZmine2 [30] (also called "feature-based molecular networking") and results in annotated molecular networks that can be explored in Cytoscape [31]. We applied MolNetEnhancer to publicly available mass spectrometry fragmentation data ranging from marine-sediment and nematode-related bacteria, to *Euphorbia* and Rhamnaceae plants. Illustrated by four case studies, we demonstrate how our integrative workflow discovers dozens of MFs in large-scale metabolomics studies of these plant and bacterial extracts. Moreover, discovered MFs can be divided into subfamilies using the mapped MS2LDA results. Structural annotation of Mass2Motifs is facilitated by having chemical and structural annotations at hand, for example by recognizing substructures in peptidic molecules. We conclude that our workflow provides chemical refinement of metabolomics results beyond spectral matches through large-scale MF and substructure discovery and annotation by integrating outputs of various tools in one place allowing for enhanced visualization. This also guides the metabolomics researcher in prioritizing MFs to explore and in structurally annotating molecules.

**Figure 1.** Schematic overview of the MolNetEnhancer workflow. Starting with mass spectrometry data in the mzML format obtained from complex metabolic mixtures the user creates (**1**) mass spectral molecular networks in GNPS, (**2**) performs in silico structure annotation (e.g., through NAP, DEREPLICATOR or SIRIUS+CSI:FingerID), and (**3**) performs unsupervised substructure discovery through MS2LDA. Steps 1–3 are performed prior to the MolNetEnhancer workflow within the respective platforms. MolNetEnhancer is then used in (**4**) to map information layers obtained from all three platforms independently on top of each other resulting in network-wide chemical class information and more detailed substructure information within molecular families (as exemplified for the organic acid conjugates in the enlarged part of the triterpenoid molecular family on the right).

#### **2. Materials and Methods**

MolNetEnhancer is a software package available in Python and R that unites the output of several metabolome mining and annotation tools, including mass spectral molecular networking through GNPS, unsupervised substructure discovery through MS2LDA and in silico structure annotation, for example through NAP, DEREPLICATOR, or SIRIUS+CSI:FingerID (Figure 1). Before using the MolNetEnhancer workflow, the user will run each metabolome mining tool separately:


For documentation of steps 1–3 the user is referred to the original publications and guidelines for each tool [1,2,7,8,14]. Section 8 contains links to tutorials of the analysis tools used in this study. Functions implemented in the MolNetEnhancer workflow can then be used to combine the outputs created in step 1–3 such that


MolNetEnhancer is freely available on GitHub at https://github.com/madeleineernst/ pyMolNetEnhancer and https://github.com/madeleineernst/RMolNetEnhancer. Interactive Jupyter example notebooks and a step by step tutorial guide the user to build enhanced mass spectral molecular networks, which are outputted in the graphml format for visualization in Cytoscape.

Currently, two distinct methods from raw data to MNs exist. One method takes all MS2 spectra found in the input files and uses MS-Cluster to prepare a set of representative "consensus" MS2 spectra for molecular networking, and the other method uses MZmine2 for data preprocessing, which performs molecular feature detection at the MS1 level and associates each MS1 feature with its respective MS2 spectra to send off to GNPS Molecular Networking. The here proposed MolNetEnhancer workflow can enrich both these molecular networking methods with Mass2Motif presence and chemical class annotations.

Substructural information retrieved through MS2LDA is integrated in two ways within the mass spectral molecular networks. Shared substructures or motifs between two molecular features are visualized as multiple edges connecting the nodes. Furthermore, motifs found within a molecular feature can be visualized as pie charts, where the relative abundance of each motif represents the overlap score, a score measuring how much of the motif is present in the spectrum [32]. Furthermore, for each molecular family, the x most shared motifs are shown, where x is defined by the user. An example of such a molecular family with motifs mapped is shown in Figure 6 in the results section.

To retrieve the most abundant chemical classes per molecular family, all chemical structures obtained through GNPS library matching, and in silico chemical structural annotation are submitted to automated chemical classification and taxonomy structure using ClassyFire [16]. This retrieves chemical classes for each of the putative structures submitted organized in five hierarchical levels of a chemical taxonomy (kingdom, superclass, class, subclass, and direct parent). For each level of the chemical ontology, a score is calculated, which represents the most abundant chemical class found for the structural matches within the molecular family. It is important to note that a high score does not represent a higher confidence in the true identity of the chemical structures found within the molecular family, but indicates more consistency as more structural matches obtained for this molecular family fall within the same chemical class. Figure 2 exemplifies how the score is calculated. Given a molecular family consisting of six molecular features (nodes), the percentage of nodes classified as cinnamaldehydes, coumarins and derivatives, flavonoids and macrolactames at the chemical class level respectively is calculated. Each molecular feature can have multiple structural matches with multiple (e.g., node 2) or identical (e.g., node 3) chemical classes. A majority of the structural matches obtained in the network shown in Figure 2 were classified as flavonoids (2.25 out of six nodes), thus the molecular family is classified as flavonoids with a chemical classification score at the class level of 0.375 (2.25/6). For single nodes (molecular features which show no spectral similarity with any other molecular features found in the dataset) the chemical classes are retrieved analogously, however, it should be noted that single nodes often result in a very high score, as only one structural match is retrieved, corresponding to a score of 1 (1 node out of 1).

**Figure 2.** Schematic overview of how the chemical classification score is calculated and visualized within a molecular family. (**a**) Schematic overview of hypothetical structural annotations within a molecular family consisting of 6 nodes. Out of the 6 nodes, chemical structural information could be retrieved for 4, where each node can consist of structural annotations to multiple different (e.g., node 2) or identical (e.g., node 3) chemical classes. The total number of nodes per chemical class retrieved is calculated and the most abundant chemical class is assigned to the molecular family, resulting in (**b**). Schematic overview of the molecular family shown in (**a**), classified as 'flavonoids' at the chemical class level by MolNetEnhancer, with a score of 0.375, translating to the majority of the putative structural annotations within this molecular family (2.25) belong to the flavonoid structural class.

#### **3. Results**

#### *3.1. MolNetEnhancer Workflow*

MolNetEnhancer requires inputs from independent metabolome mining tools including mass spectral molecular networking through GNPS, unsupervised substructure discovery through MS2LDA and in silico structure annotation, for example through NAP, DEREPLICATOR or SIRIUS+CSI:FingerID (Figure 1). Provided with these inputs, MolNetEnhancer consists of two independent steps. During the first step, molecular substructures detectable by co-occurring fragment ions or neutral losses, so called Mass2Motifs, are mapped onto a Molecular Network. Each node in the network represents a molecular feature, whereas Mass2Motifs represent substructural features. Most fragmented mass peaks (precursor ions) represent molecular ions, although fragmented mass peaks may also represent adducts of one and the same molecule, in source fragments or doubly-charged peaks [33]. For simplicity, we will refer to any fragmented mass peak as molecular feature throughout the manuscript. Mass2Motifs contained within each molecular feature can be visualized as pie charts on the nodes. Alternatively, Mass2Motifs shared across multiple molecular features can be visualized as multiple lines (edges) connecting the nodes. In a second step, most abundant chemical classes per molecular family based on candidate structures from in silico annotation tools as well as GNPS library matches can be mapped through chemical classification using ClassyFire [16]. A chemical classification score is calculated representing what percentage of nodes within a molecular family are attributed to a given chemical class (see Section 2 and Figure 2 therein). In Sections 3.2–3.5 we show how MolNetEnhancer can accelerate and enrich chemical information retrieval in 4 case studies, comprising two plant and two bacterial publicly accessible datasets. The MolNetEnhancer workflow results in one graphml network file that contains all the structural information obtained from the individual tools. Such a file can be easily imported into network visualization tools such as Cytoscape [31], an environment where additional metadata on the molecular features can be added. In addition, all structural information is also available as tab delimited text files.

#### *3.2. Case Study 1: Annotation of Euphorbia Specialized Metabolites Using MolNetEnhancer*

With more than 2000 species worldwide, the plant genus *Euphorbia* is among the most species-rich and diverse flowering plants on earth [34,35]. Besides exhibiting an extreme diversity in its growth forms and habitat types, the genus has also attracted interest within natural products drug discovery [36,37]. *Euphorbia* species are chemically highly diverse, particularly within macro- and polycyclic diterpenoids, biosynthetically derived from a head-to-tail cyclization of the tetraprenyl pyrophosphate precursor, which have been found to exhibit a range of biological activities with pharmaceutical interest, such as antitumor, antimicrobial or immunomodulatory activity [36]. Ingenol mebutate for example, a diterpenoid originally isolated from *Euphorbia peplus* L. is marketed for the topical treatment of actinic keratosis, a precancerous skin condition [38], however production through plant extraction or chemical synthesis is inefficient and expensive [39,40].

A key interest is therefore to find species within the genus producing higher quantities of ingenol mebutate or other close diterpenoid analogs exhibiting biological activities with pharmaceutical interest. We have previously assessed chemical diversity within a representative subset of species of the plant genus *Euphorbia* [27]. A major challenge is the rapid identification of known and unknown *Euphorbia* diterpenoid structures. Using MolNetEnhancer, we were able to significantly accelerate manual annotation of diterpenoids and retrieve chemical structural information, even for molecular families with no structural matches in the GNPS spectral libraries.

An example of how MolNetEnhancer increases chemical structural information throughout two molecular families is highlighted in Figure 3. Using GNPS spectral library matching, chemical structural information for only one molecular feature was obtained, and manual propagation of the annotation throughout molecular family (i) was limited given that the annotated ion exhibited one neighbor only. No structural information could be retrieved for family (ii), where no chemical structural information was retrieved through GNPS library matching (Figure 3a).

**Figure 3.** MolNetEnhancer increases chemical structural information obtained for *Euphorbia* specialized metabolites. (**a**) Mass spectral molecular network showing two molecular families of *Euphorbia* specialized metabolites. Using GNPS library matching only one molecular feature could be putatively annotated. Manual annotation propagation is limited for family (i) and impossiblefor family (ii). (**b**) Using MolNetEnhancer, substructural Mass2Motifs can be visualized within the network; both molecular family (i) and (ii) contain Mass2Motifs related to a *Euphorbia* diterpene spectral fingerprint (DSF) and molecular family (ii) contains Mass2Motifs related to a nicotinoyl side chain. Mass2Motifs are mapped on the nodes as pie charts with an area proportional to their overlap score, a score measuring how much of the Mass2Motif is present in the spectrum, whereas dotted lines connecting the nodes represent features with a MS2 spectral similarity of a cosine score over 0.6 (**c**) Most chemical structures retrieved for molecular family (i) and (ii) are diterpenoids of the jatrophane, tigliane or ingenane type, which both can result in a DSF with *m*/*z* 313, 295, or 285. Substructures with mass fragments characteristic of these *Euphorbia* DSFs were also found within the Mass2Motifs. Node colors represent most abundant chemical classes, colored lines connecting the nodes represent shared Mass2Motifs, and dotted lines connecting the nodes represent features with a MS2 spectral similarity of a cosine score over 0.6 (**d**) *Euphorbia* diterpenoid skeletons of the jatrophane, deoxy tigliane, or ingenane ester type are found within all *Euphorbia* subgeneric clades, whereas nicotinoyl sidechain modifications are unique to subgenus *Esula*. Node colors represent summed peak area per *Euphorbia* subgeneric clade, colored lines connecting the nodes represent shared Mass2Motifs, and dotted lines connecting the nodes represent features with a MS2 spectral similarity of a cosine score over 0.6.

Using MolNetEnhancer however, we were able to highlight substructural Mass2Motifs within both molecular families (Figure 3b). Substructural Mass2Motifs, putatively annotated as a *Euphorbia* diterpenoid backbone skeleton with mass peaks at *m*/*z* 313, 295, and 285 were found both in molecular families (i) and (ii) (Figure 3b). Manual annotation of these Mass2Motifs was possible by comparing mass fragments of the library spectrum to mass fragments contained in the Mass2Motifs. A mirror plot comparing the GNPS reference spectrum to the unknown spectrum found in our samples is shown in Supplementary Figure S1. The exact *Euphorbia* backbone skeleton type could not be identified unambiguously, as many *Euphorbia* diterpenoid skeletons are isomeric and their respective MS2 spectra are identical or very similar. A *Euphorbia* backbone skeleton with masses at *m*/*z* 313, 295, 285 can either result from a jatrophane, deoxy tigliane, or ingenane ester like skeleton [41,42]. Furthermore, we were able to see that molecular family (ii) contains substructural Mass2Motifs related to a nicotinoyl side chain. Manual annotation of these Mass2Motifs was possible by comparing chemical structures retrieved through NAP in silico structure annotation with mass fragments found in the Mass2Motifs. Motifs 432 and 180 were both found to contain mass peaks at *m*/*z* 106 and 124, possibly resulting from a nicotinoyl side chain and a hydroxylation (Figure 3b). Chemical structures retrieved through in silico annotation or library matching can aid the manual annotation of Mass2Motifs and vice versa annotated Mass2Motifs can aid the propagation of chemical structural information throughout the network. Additionally, chemical structural hypotheses can be reinforced by taking into consideration both substructural information as well as chemical class information obtained through in silico annotation and library matching. Most chemical structures retrieved for molecular family (i) and (ii) were diterpenoids of the jatrophane, tigliane or ingenane type and substructures related to these *Euphorbia* diterpenoid backbone skeletons were also found within the Mass2Motifs (Figure 3c).

In conclusion, using MolNetEnhancer we were able to significantly increase chemical structural annotations obtained from retrieving chemical structural information of one molecular feature through GNPS library matching (Figure 3a), to retrieving chemical structural information at an annotation level 3 (putatively characterized compound classes) according to the Metabolomics Standard Initiative's reporting standards [43] of two molecular families comprising 73 molecular features (Figure 3b–d). Finally, this information allowed us to conclude that within the investigated subset of molecular families *Euphorbia* diterpenoid skeletons of the jatrophane, deoxy tigliane, or ingenane ester type are found within all *Euphorbia* subgeneric clades, whereas nicotinoyl sidechain modifications are unique to subgenus *Esula* (Figure 3d).

#### *3.3. Case Study 2: Annotation of Rhamnaceae Specialized Metabolites*

Another case where we demonstrate the efficiency of MolNetEnhancer for enhancing the chemical annotation of metabolomics data is our previous study on the plant family Rhamnaceae [28]. Rhamnaceae is a cosmopolitan family including about 900 species, and Rhamnaceae species are known for their exceptional morphological and genetic diversity, which are thought to be caused by the wide geographic distribution and different habitats [44]. We applied an MS2-based untargeted metabolomics approach to get insights on the metabolomic diversity of this highly-diversified family, and MolNetEnhancer was used as a key to provide fundamental annotations for MS2 spectra.

As shown in Figure 4a, MolNetEnhancer provided the putative chemical classification of molecular families within the Rhamnaceae molecular network. After combining this chemical class annotations with taxonomic information of each molecular feature, the normalized distribution pattern of different classes of metabolites were analyzed. This revealed that the taxonomic clade Rhamnoid exhibits more diversified flavonoids, carbohydrates, and anthraquinones, while the Ziziphoid clade produces various triterpenoids and triterpenoid glycosides [28].

MolNetEnhancer allowed us to visualize and discover the subtle substructural diversity within the molecular families. In the molecular family of triterpenoid esters, for example, substructural differences of phenolic moieties such as protocatchuate, vanillate, and coumarate were easily recognized by analyzing the distribution of Mass2Motifs 28, 117, 120, and 191 (Figure 4b). Two flavonoid aglycone substructures, kaempferol and quercetin, were also distinguished by analyzing the distribution of Mass2Motifs 86, 130, and 149 in the molecular family of flavone 3-*O*-glycosides (Figure 4c). Mass2Motif 130 contained mass peaks at *m*/*z* 284, 255, and 227, while Mass2Motifs 86 and 149 covered mass peaks at *m*/*z* 300, 271, and 255. These fragment ions are well-known as characteristic fragments of kaempferol 3-*O*-glycosides and quercetin 3-*O*-glycosides [45–47], so these Mass2Motifs could be easily annotated. This case study shows how MolNetEnhancer facilitates the interpretation process and our knowledge on MS2 fragmentation, previously mainly applied manually by experts.

**Figure 4.** MolNetEnhancer increases chemical structural information obtained for Rhamnaceae specialized metabolites. (**a**) Structural annotation for molecular families was suggested based on consensus-based classification of NAP in silico structure annotation. (**b**) Subtle chemical differences of phenolic acid moieties can be visualized within the molecular family of triterpenoid esters based on Mass2Motifs. (**c**) Molecular family annotated as flavonoid glycosides reveals two subfamilies by Mass2Motif mapping: the pink Mass2Motif is related to the kaempferol core structure, whereas the orange and brown Mass2Motifs are related to the quercetin core structure—two related yet distinct flavonoid structures.

#### *3.4. Case Study 3: Large Chemical Diversity Uncovered by Annotating Specialized Metabolites in Marine Sediment Streptomyces and Salinispora Bacterial Extracts*

The MolNetEnhancer workflow was also applied to bacterial data sets to gain more detailed insights into their chemical richness. Crüsemann and coworkers created a molecular network of extracts of the marine sediment bacteria *Salinispora* and *Streptomyces* that formed the basis for this case study [48]. Figure 5 displays the molecular network colored by the most prevalent chemical class annotations. Whilst we can observe that the bacteria also produce a structurally diverse arsenal of molecules, its composition is clearly different from that of the Rhamnaceae plants in Figure 4a. The most prevalent chemical class annotations are "Carboxylic acid and derivatives" and "Prenol lipids" with the first containing peptide-related molecules and the latter containing terpenoid molecules. Both these classes of molecules are known to be produced by *Salinispora* and *Streptomyces* bacteria. The chemical classification scores (see Section 2) for the ClassyFire class and kingdom terms are presented in Supplementary materials Figure S2. These scores aid in assessing chemical novelty and also provide information on the consistency of the chemical class annotations of the structural candidates.

**Figure 5.** Marine sediment *Salinispora*/*Streptomyces* molecular network colored by 15 selected chemical class terms as indicated in the legend. In total, 50 different class terms were annotated in the network using MolNetEnhancer, indicating that the metabolic output of the *Salinispora*/*Streptomyces* strains is chemically very diverse. We can observe that the larger molecular families are mostly annotated with prenol lipids (blue) and carboxylic acids and derivatives (red). Furthermore, for a couple of MFs no chemical class annotations were obtained as no candidate structures were retrieved through any of the annotation tools.

From the 5930 network nodes, we discovered 300 Mass2Motifs using MS2LDA. From those, we could annotate 40 with structural information at various levels of structural details gained from spectral matching with the GNPS libraries or from the in silico annotation tools NAP, DEREPLICATOR, and VarQuest. For example, we could annotate an amino sugar-related Mass2Motif with fragment ions related to two known N,N-dimethyl amino sugars present in known specialized molecules from the bacteria studied [48]: dimethylamino-β-d-xylo-hexopyranoside (rosamicin) and N,N-dimethyl-pyrrolosamine (lomaiviticin) which have overlapping fragment ions and are therefore characterized by the same Mass2Motif. With a frequency of more than 70 throughout the entire molecular network (using probability and overlap score thresholds of 0.1 and 0.3, respectively, for the molecular feature—Mass2Motif connections), the amino sugar Mass2Motif can be used as a handle to identify known and potential novel natural products throughout network. Indeed, the Mass2Motif was found in all members of the Rosamicin MF (Figure 6a) and the Lomaiviticin MF (Supplementary materials Figure S3a). Moreover, the same amino sugar-related Mass2Motif was also found in all members of two yet unknown MFs (Figure 6b, Supplementary materials Figure S3b). In addition, the Mass2Motif was also found in a number of singletons not connected to any MF, often in combination with Mass2Motif 66 as well like we see for the rosamicin-related MF. Mass2Motif 66 represents the presence of an *m*/*z* 116 fragment which is likely also generated by the dimethylated amino sugar; in fact it may point to the dimethylamino-β-D-xylo-hexopyranoside moiety or something very similar as this fragment is absent in spectra from the lomaiviticin MF which contains the different dimethylated amino sugar N,N-dimethyl-pyrrolosamine. In most singletons, no other Mass2Motifs were discovered that could provide clues on the complete structures of these molecules; however, given the presence of the amino sugar moiety they are likely natural products and not core metabolites or contaminants—something that we could not confidently state without using the MolNetEnhancer workflow.

**Figure 6.** Molecular families from marine sediment bacteria with color coded Mass2Motif substructure information mapped on them, with (**a**) rosamicin-related molecular family found through GNPS library hits where all members contain an amino sugar-related motif as colored in blue in its depicted structure—substructures or motifs found within each molecular feature are mapped on the nodes as pie charts, where the relative abundance of each motif represents the overlap score, a score measuring how much of the motif is present in the spectrum. Furthermore, motifs shared between two nodes are visualized as colored continuous lines (edges) connecting the nodes whereas dashed lines (edges) represent a cosine score of over 0.6, (**b**) Yet unknown molecular family that shares an amino sugar-related motif connecting this MF to (**a**) by sharing a substructure, (**c**) tryptophan-related molecular family sharing the Tryptophan Mass2Motif, and (**d**) actinomycin-related molecular family—found through GNPS library hits and further validated with help of DEREPLICATOR results—sharing an Actinomycin related motif across most of its members. The actinomycin D (Daptomycin) structure is depicted with the Mass2Motif substructure highlighted in color: the peptide lactone ring present twice in the molecule. In all MFs, nodes are colored based on Mass2Motif overlap scores and the edges show if cosine score-connected nodes share similar Mass2Motifs. It can be seen that in all families multiple motifs are shared across some of its members.

Another MF displayed in Figure 6c did not return any GNPS library hits; however, all its members shared Mass2Motif 154. Due to its indicative fragment ions, we could annotate this Mass2Motif as tryptophan-related, indicating that all these molecules contain a tryptophan core structure. Based on their shared Mass2Motif, the masses of the molecular features, and their fragmentation patterns, with the help of MolNetEnhancer we could now tentatively annotate this MF as tryptophan-related containing molecules such as small peptides or N-acyltryptophans. Figure 6d shows the peptidic MF of actinomycin-related molecules. The annotation of this MF was guided by DEREPLICATOR and VarQuest annotations as well as the Mass2Motif that 10 of its members shared. We could annotate this Mass2Motif as the peptide lactone ring (depsipeptide moiety) present twice in actinomycins using reference data from literature [49]. The unique combination of four actinomycin-related mass fragments was only present in the 10 MF members, thereby reinforcing the DEREPLICATOR and VarQuest annotations.

Furthermore, mapping the Mass2Motifs on the molecular network means that we can more easily track neutral loss-based motifs such as the loss of an acetyloxy group that was only found in *Streptomyces* MFs. Moreover, inspection of the MFs without annotated chemical classes revealed that they contained some Mass2Motifs with relatively low frequency throughout the data set—something that could point to a unique substructure or scaffold possibly from a unique biosynthesis enzymatic function. For example, Mass2Motif 35 has a frequency of 43 and was present in all four members of the MF in Supplementary materials Figure S3c. It is a mass-fragment-based Mass2Motif and with masses of 142, 100, and 58 Da it could be related to a polyamine-like structural feature. Finally, the MF in Supplementary materials Figure S3d shares the two still unknown loss-based Mass2Motifs 250 and 261 that have frequencies of 26 and 50, respectively. These are examples of Mass2Motifs representing potential novel chemistry that can now be easily tracked in the molecular network.

#### *3.5. Case Study 4: Annotating Peptidic Motifs in Peptide-Rich Xenorhabdus*/*Photorhabdus Extracts*

*Xenorhabdus* and *Photorhabdus* are Gammaproteobacteria that live in symbiotic association with soil-dwelling nematodes of the genus *Steinernema* [50,51]. Eventually as a consequence thereof, they spend a large amount of their resources to the production of specialized metabolites, in particular nonribosomal peptides and polyketides. Tobias and coworkers recently published metabolomics data of 25 *Xenorhabdus* and five *Photorhabdus* strains to explore metabolic diversity amongst these strains [50]. Here, we applied MolNetEnhancer on this publicly available molecular networking data to further probe the chemical diversity previously found. The 6228 network nodes were analyzed with MS2LDA to discover 300 Mass2Motifs. Furthermore, we also submitted the *Xenorhabdus*/*Photorhabdus* molecular networking data to NAP, DEREPLICATOR, and VarQuest to run the MF chemical class annotation pipeline. By far the majority of the 46 annotated motifs were peptide, amino acid, or likely to be peptidic-related which fits with the ClassyFire predicted peptide-related MFs present in the *Xenorhabdus*/*Photorhabdus* extracts with "Carboxylic acids and derivatives" and "Peptidomimetics" as most frequently occurring annotations (see Figure 7, with corresponding chemical classification scores in Supplementary materials Figure S4). We could also annotate an indole-related Mass2Motif which can be part of peptides/amino acids. An exception is the ethylphenyl-related Mass2Motif that was found in 478 molecules (out of 6228 nodes, corresponding to 7.7%) of the *Xenorhabdus*/*Photorhabdus* extracts. This can be explained by the reported production of phenylethylamides, dialkylresorcinoles, and cyclohexadions derivatives by the studied strains [52].

Annotations included Mass2Motifs that form peptidic substructures related to well-known *Xenorhabdus* peptidic families such as the commonly found bioactive rhabdopeptides and the related xenortides [52,53]. We could annotate two rhabdopeptide-related motifs with frequencies of 231 and 186 (3.7% and 3.0% of nodes, respectively). Compared to the structurally less diverse xentrivalpeptides [54] which the Mass2Motif had a frequency of 28, corresponding to 0.45% of the nodes, we can conclude that rhabdopeptide-related molecules are widespread in the *Xenorhabdus*/*Photorhabdus* extracts. The PAX peptides constitute another well-known *Xenorhabdus*/*Photorhabdus* lysine-rich peptide class [55]. The corresponding MF consisted of 13 members; indeed, they shared a Mass2Motif related to lysine (lys) and lys–lys fragments. Similarly, a leucine-leucine Mass2Motif was found in molecules annotated as xenobovid. This motif occurred in 110/6228 (1.8%) nodes pointing to several peptidic families that contain this amino acid motif—in contrast to the lys–lys amino acid motif that is very wide-spread in *Xenorhabdus*/*Photorhabdus* molecules, being present in 1500 (24%) nodes. In total, using the MolNetEnhancer workflow we could annotate 32 peptidic motifs of which we could link 11 to peptides known to be produced by *Xenorhabdus*/*Photorhabdus* strains whilst the other 21 Mass2Motifs represent substructures not yet elucidated. The peptidic nature of these Mass2Motifs was assessed by recognition of typical fragment ion patterns as seen for known peptides as well as doubly charged precursor ions that are often a sign of peptides in these extracts.

**Figure 7.** Nematode symbionts *Photorhabdus*/*Xenorhabdus* network colored by 10 selected chemical class terms as indicated in the legend. In total, 49 different class terms were annotated in the network using MolNetEnhancer. We can observe that the larger molecular families as well as many smaller molecular families are mostly annotated with peptidomimetics (purple) and carboxylic acids and derivatives (red). This is consistent with earlier findings that these nematode symbionts produce a wide array of peptidic products.

With the help of the integrative display of DEREPLICATOR and VarQuest annotation results, we could also annotate two xenoamicin-related peptidic MFs (Figure 8a,b). Xenoamicins are known to be produced by *Xenorhabdus* and eight variants have been described in detail with variants A and B present in peptidic databases [56]. Xenoamicin is a cyclic peptide consisting of a peptidic ring and peptidic tail (see Figure 8d). Interestingly, in one of the annotated MFs, not one but two Mass2Motifs were shared between most of its members (see Figure 8a). With help of DEREPLICATOR-predicted annotations of the fragment ions, we could annotate the Mass2Motif shared by almost the entire MF as being related to the xenoamicin A peptidic ring, whereas the other more abundant Mass2Motif was related to the xenoamicin peptidic tail (Figure 8c, and Supplementary materials Figure S5a,b). These Mass2Motifs are quite specific as we observed that 9 and 6 mass fragments, respectively, were consistently present in more than 75% of the molecular features to which the ring and tail Mass2Motifs were linked. A third Mass2Motif could be putatively annotated as xenoamicin B peptidic ring-related as its masses are +14 Da as compared to the ring A motif and xenoamicin B differs from A with an isobutyl replacing an isopropyl group. Based on the Mass2Motif presence/absence analysis in the

larger MF of 32 members, we observe that 4 have links (overlap score > 0.3) to both ring A and tail motifs, 10 just have the ring A motif, three have only links to the peptidic tail motif, two share both ring A and putative ring B together with the tail Mass2Motif, and two share the putative ring B with the tail Mass2Motif (Figure 8a). Thus, this indicates how MolNetEnhancer increases the resolution in molecular networks by highlighting structural differences in between MF members.

**Figure 8.** Xenoamicin-related molecular families annotated by MolNetEnhancer with (**a**) MF of 32 nodes of which 23 were annotated with at least one xenoamicin modified structure (xenoamicin A or B) by either VarQuest or DEREPLICATOR with VarQuest using 0.005 Da fragment binning assigning most xenoamicin structures (FDRs mostly < 2.5). This MF also contains nodes sharing all Mass2Motifs related to xenoamicin structures with two ring and tail-related Mass2Motifs. Mass2Motif 265 contains mass fragments related to xenoamicin A, whereas masses in Mass2Motif 51 are shifted with 14 Da pointing towards xenoamicin B. The MF consists of singly charged molecular features. (**b**) Related MF of which 20 out of 22 nodes were annotated with xenoamicin modified structures (FDRs mostly < 2.5). This MF only shares the Mass2Motif annotated as xenoamicin tail-related and consists of doubly-charged precursor ions. (**c**) Xenoamicin A spectrum in the ms2lda.org environment with (top) ring-related Mass2Motif highlighted and (bottom) tail-related Mass2Motif highlighted with the corresponding blue and red colors as in (**a**) and (**b**). (**d**) VarQuest annotation of xenoamicin modified peptide where a ring proline indicated in brown is likely methylated. All light blue peaks in the mass spectrum were annotated by VarQuest. The red part in the xenoamicin structure corresponds to the selected fragment of *m*/*z* 537.348, which includes the tail part, whereas the light blue amino acid is annotated to be modified with a mass shift of 14.013 Da that likely corresponds to a methylation. Indeed, the Mass2Motif related to the xenoamicin tail is found in this fragmentation spectrum, whereas the ring Mass2Motif is absent.

We could also find additional MFs and singletons in which the xenoamicin ring or tail Mass2Motif was present, pointing to related peptidic molecules not linked through the modified cosine score. Further inspection with help of VarQuest annotations strengthened these annotations as VarQuest annotated modified amino acids in both rings (Figure 8, Supplementary materials Figure S5e,f) and the tail region (Supplementary materials Figure S5c,d) of xenoamicin many of which, to our knowledge, have not been reported yet, such as the one highlighted in Figure 8d where the ring-proline is likely methylated (the ring A motif is not linked to this molecular feature). In fact, xenoamicin A was annotated as variant from xenoamicin B (Supplementary materials Figure S5f) where the modified amino acid (demethylation) corresponds to previous literature findings [56], further increasing our trust in these *in silico* approaches. The smaller MF of 22 nodes consisted of doubly-charged precursor ions where no ring-related Mass2Motifs were assigned. Some members like xenoamicin A appeared in both MFs as singly and doubly charged precursor ions; the differences in motif distributions between the two MFs indicates that the initial charge has an impact on the fragmentation pathways and thus the acquired spectra given that we know the ring A is part of xenoamicin A.

Altogether, this example highlights how the MolNetEnhancer approach facilitates fragmentation based metabolomics analysis workflows by increasing the "structural resolution", the discovery of more xenoamicin variants than previously described, and highlighting previously unseen connections between MFs and molecules. Furthermore, the integrative approach enabled straightforward annotation of Mass2Motifs found in the xenoamicin MF by using the VarQuest fragment ion annotations as guide for Mass2Motif feature annotation. Both Mass2Motif and VarQuest results strengthened each other since when predicted amino acid changes occurred in the peptidic ring, the corresponding ring-related Mass2Motif was absent, and vice versa—made possible by combining the outputs of several *in silico* tools together.

#### **4. Discussion**

Although significant advances have been made in molecular mining workflows, chemical annotation as well as classification tools [1–4,7,8,10,14–16], chemical structural annotation remains the major and most challenging bottleneck in mass spectrometry-based metabolomics as most of our biological interpretations rely on annotated structures [8,26,57]. MolNetEnhancer is a workflow that combines chemical structural information retrieved from different *in silico* tools, thus increasing structural information retrieved and enhancing biological interpretation. Here, we have chosen a representative number of *in silico* tools covering mining, annotation, and chemical annotation to provide the user with different chemical insights. Although we used DEREPLICATOR and NAP to exemplify *in silico* annotation tools here, MolNetEnhancer is platform independent, meaning that chemical structures retrieved from any *in silico* annotation platform could be used given the molecular feature identities correspond across all molecular mining and annotation tools.

Particularly in natural products research, the rapid annotation of known (i.e., dereplication) as well as unknown specialized metabolites from complex metabolic mixtures hinders interpretation in an ecological, agricultural or pharmaceutical context. Many specialized metabolites from natural sources are used as pharmaceuticals [58], in agriculture [59], or nutrition [60]; however, their discovery is inherently slow due to the above-mentioned limitations. To highlight how MolNetEnhancer can accelerate chemical structural annotation in complex metabolic mixtures from natural sources, we exemplified its use on four plant and bacterial datasets.

In the plant genus *Euphorbia*, we were able to retrieve chemical structural information of previously described pharmaceutically highly valuable diterpenoid skeletons corresponding to an annotation level 3 according to the Metabolomics Standard Initiative's reporting standards [43]. The use of different tools combined in one data format with MolNetEnhancer allowed both for the retrieval of complementary information as well as the reinforcement of putative annotations, in cases where two independent tools pointed to the same chemical structural conclusion. Used separately, none of the tools were able to retrieve as much chemical structural information as when combined in MolNetEnhancer. Likewise, MolNetEnhancer allowed for the annotation of triterpenoids chemistries with several distinct phenolic acid modifications (e.g., vanillate, protocatechuate) in the plant family Rhamnaceae. In *Salinispora* and *Streptomyces* bacterial extracts, MolNetEnhancer aided the annotation of a previously unreported tryptophan-based MF, and a xenoamycin-related MF in the Gammaproteobacteria of the genus *Xenorhabdus* and *Photorhabdus* could be studied in more detail than in previous studies.

It is of utmost importance to note that results retrieved from MolNetEnhancer summarize results retrieved from third-party software and manual inspection and validation of all structural hypotheses remain essential. However, MolNetEnhancer significantly aids the manual inspection and validation process conducted by the expert, by making substructural as well as chemical class information readily available and visible within one data resource. As exemplified in the case studies, MolNetEnhancer can for example help in prioritizing molecular families within a molecular network, which consists of many hundreds to thousands of molecular features, be it by highlighting different chemical classes of interest or molecular families, for which only very few structural hypotheses could be retrieved, potentially highlighting novel chemistry.

Limitations introduced through data acquisition on different mass spectrometric instrument types do also apply to MolNetEnhancer. Acquiring data on different instruments can cause different MS2 fragmentation patterns, thus in some cases leading to different structural hypotheses through library matching or *in silico* structure prediction [61]. Also, the presence of low quality and/or chimeric MS2 spectra is a challenge for mass spectrometry annotation tools as the one described here, and methods that are capable of filtering-out these spectra before proceeding with *in silico* annotation tools will improve our confidence in *in silico* spectral annotation [62].

These limitations highlight the importance of good practices during data acquisition and processing to minimize the time spent analyzing mass spectrometry artefacts and improving the confidence in any downstream annotations. Here, the use of feature-based molecular networking could also help to focus the analysis on those molecular features that are very likely molecular ions [63]—and it has the added benefit that MS1 differential abundance information from LC–MS peak picking is available on the molecular features as well.

Apart from limitations caused by experimental conditions, analysis bias can be introduced for structural predictions based on chemical structures available in public databases, which are still limited especially for particular compound classes. This is particularly true for the chemical class annotations provided through ClassyFire, which rely on collecting correct or structurally closely related candidate structures from compound databases. The chemical annotation score was implemented to guide the researcher in assessing how consistent the chemical annotations are and for how many molecular features at least one candidate structure is found. The peptidic annotations by DEREPLICATOR and VarQuest come with scores, *p*-values, and false discovery rates to assess confidence in the annotations. Using MolNetEnhancer, it is now also possible to explore the consistency in peptidic annotations within MFs, along with their associated Mass2Motifs, which also assist in improving confidence in the annotations, as we have shown for the xenoamicin MFs in the nematode symbiont bacteria where the majority of the MFs were annotated with xenoamicin variants.

One limitation of the use of MS2LDA on the bacterial datasets is that most noncyclic peptidic molecular families do not share any motifs as typically analogues differ by modifications such as methylation or hydroxylation causing a shift in *m*/*z* in most of their mass fragment peaks. Incorporation of amino acid-related mass differences as features for MS2LDA could be a route to also discover Mass2Motifs for noncyclic peptides. As it is, cyclic peptides do often contain one or more Mass2Motifs and peptides containing positively charged amino acids such as lysine and leucine have this structural information represented by Mass2Motifs. Furthermore, many Mass2Motifs are currently still unannotated, which hampers fast structural analysis. To partially solve this bottleneck, MotifDB (www.ms2lda.org/motifdb) was recently introduced [64] and the here annotated Mass2Motif sets from the four case studies are made available through MotifDB for matching against Mass2Motifs found in other MS2LDA experiments. Furthermore, this will allow to use a combination of "supervised" (annotated) Mass2Motifs and "unsupervised" (free) Mass2Motifs in future MS2LDA experiments on data of related samples thereby accelerating structural annotation since part of the motifs already discovered do not need to be reannotated.

Despite the limitations discussed above, MolNetEnhancer assists in metabolite annotations by its combined analysis of chemical class annotations, structural annotations, and Mass2Motif annotations. If these annotations support each other, as for example for the actinomycin MF in the marine sediment bacteria, there is more confidence that these in silico annotations will indeed be correct. It is noteworthy that the modularity of MolNetEnhancer allows for complementary sources of structural information to be added on in future. We showed that MolNetEnhancer is a practical tool to annotate the chemical space of complex metabolic mixtures using a panel of complementary in silico annotation tools for mass spectrometry based metabolomics experiments. Although we have highlighted the use of MolNetEnhancer using two plant and bacterial datasets, MolNetEnhancer is sample type-independent and may be used for any mass spectrometry-based metabolomics experiment, where chemical structural annotation and interpretation is of interest. Future work will focus on making the complete MolNetEnhancer workflow available within the GNPS platform in order to further increase its user friendliness. Currently, the chemical classification workflow is available to run within the GNPS framework directly outputting an annotated network (see URL in code availability Section 7). Furthermore, the integration of other existing and future metabolome mining and annotation tools in the output of MolNetEnhancer is also planned to extend on the initial set of in silico tools that it currently can combine.

#### **5. Conclusions**

MolNetEnhancer is a powerful tool to accelerate chemical structural annotation within complex metabolic mixtures through the combined use of mass spectral molecular networking, substructure discovery, in silico annotation as well as chemical classifications provided by ClassyFire. The MolNetEnhancer workflow is presented both as an open source Python module and R package, allowing easy access and usability by the community as well as the possibility for customization and further development by integration into future collaborative modular tools and by integration of other existing or future metabolome mining and annotation tools. Whilst its use was showcased using natural product examples, we expect that MolNetEnhancer will also enhance biological and chemical interpretations in other scientific fields such as clinical and environmental metabolomics.

#### **6. Data Availability**

Publicly available mass spectrometry fragmentation data sets from four studies were used for this study. Details on how samples and data were collected can be found in the original studies [27,28,48,50]. Here, we list links to the different analyses that were done on each of the studies. Through these links, all used settings and parameters can be retrieved.

Data from case studies1&2 illustrating MolNetEnhancer applied to feature-based molecular networking are publicly accessible through the links listed below.

Case study 1: *Euphorbia* study—combined analysis of 43 *Euphorbia* plant extracts


Case study 2: Rhamnaceae study—combined analysis of 71 Rhamnaceae plant extracts


GNPS example study used in Jupyter notebook to show MolNetEnhancer based on feature-based molecular networking—subset of American Gut Project:


Data from case studies 3 & 4 illustrating MolNetEnhancer applied to classical molecular networking are publicly accessible through the links listed below.

Case study 3: Marine-sediment bacteria study—combined analysis of 120 *Salinospora* and 26 *Streptomyces* bacterial strain extracts


Case study 4: Nematode symbionts study—combined analysis of 25 *Xenorhabdus* and 5 *Photorhabdus* bacterial strain extracts


• GNPS DEREPLICATOR 0.05 job: http://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=


GNPS example study used in Jupyter notebook to show MolNetEnhancer based on classical molecular networking—drug metabolism in set of sputum samples:


#### **7. Code Availability**

The MolNetEnhancer package in R including Jupyter notebooks with an exemplary analysis workflow for mapping Mass2Motifs and chemical class annotations onto classical and feature-based molecular networks is publicly accessible at https://github.com/madeleineernst/RMolNetEnhancer and the MolNetEnhancer package in Python including Jupyter notebooks with an exemplary analysis workflow for mapping Mass2Motifs and chemical class annotations onto classical and feature-based molecular networks is publicly accessible at https://github.com/madeleineernst/pyMolNetEnhancer. A beta version of the MolNetEnhancer workflow is also available from within GNPS: https://gnps.ucsd. edu/ProteoSAFe/index.jsp?params=%7B%22workflow%22:%22MOLNETENHANCER%22%7D. This currently outputs the chemical class annotated molecular network by user provided task ids to the individual jobs run within GNPS.

#### **8. Tutorials**

Tutorials to get familiar with individual tools from which the output is combined with MolNetEnhancer can be found here.

GNPS molecular networking: https://ccms-ucsd.github.io/GNPSDocumentation/networking DEREPLICATOR/VarQuest: https://ccms-ucsd.github.io/GNPSDocumentation/dereplicator Network annotation propagation: https://ccms-ucsd.github.io/GNPSDocumentation/nap

ClassyFire: http://classyfire.wishartlab.com MS2LDA: https://ccms-ucsd.github.io/GNPSDocumentation/ms2lda/ http://ms2lda.org/user\_guide MolNetEnhancer workflow tutorials in both R and Python can be found here: https://github.com/madeleineernst/pyMolNetEnhancer https://github.com/madeleineernst/RMolNetEnhancer

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2218-1989/9/7/144/s1, Figure S1: Mirror plot comparing molecular feature with *m*/*z* 614.30 and RT 373.17 (black) to GNPS reference spectrum of a jatrophane diterpenoid (green), Figure S2: (a) Marine sediment *Salinispora*/*Streptomyces* molecular network colored by chemical classification scores for annotated chemical class terms and (b) same molecular network colored by chemical classification scores for annotated chemical kingdom terms. Light gray means no database matches were found. The higher the class score, the more consistent the chemical annotations are. The kingdom scores represent the database coverage of nodes across a molecular family with scores closer to zero representing families with fewer nodes that have at least one database hit. Whilst most MFs do have database matches for all or most nodes, the consistency in chemical class annotations is—apart from some exceptions—less (indicated by the more orange/pink colors in the left panel). This indicates that for many MF family members the right molecular structures might not yet be present in the structural databases used, Figure S3: Molecular families from marine sediment bacteria with color coded Mass2Motif substructure information mapped on them, with (a) lomaiviticin-related molecular family where all members contain an amino sugar related motif, (b) yet unknown molecular family that shares an amino sugar related motif, (c) yet unknown molecular family sharing an unknown fragment-based motif occurring 0.7% in the marine sediment data set, and (d) yet unknown molecular family sharing unknown loss-based motifs occurring 0.4% (Mass2Motif 250) and 0.8% (Mass2Motif 261) in the marine sediment data. In all MFs, nodes are colored based on motif overlap scores and the edges present similar colors to show if cosine score-connected nodes share similar Mass2Motifs. It can be seen that in most families multiple motifs are shared across some of its members, Figure S4: (a) Nematode symbionts *Photorhabdus*/*Xenorhabdus* network colored by chemical classification scores for annotated chemical class terms, and (b) same molecular network colored by chemical classification scores for annotated chemical kingdom terms. Light gray means no database matches were found. The higher the class score, the more consistent the chemical annotations are. The kingdom scores represent the database coverage of nodes across a molecular family with scores closer to zero representing families with fewer nodes that have at least a database hit. We observe database coverages of close to 1 for most molecular families; however, some molecular families have a lower coverage with a few nodes that return candidate structures. Furthermore, we observe that the chemical class annotation is not always consistent indicating that manual inspection and validation of those hits remains essential, Figure S5: Xenoamicin Mass2Motif mass feature frequency plots for (a) Mass2Motif related to xenoamicin peptidic ring and (b) xenoamicin peptidic tail. It can be observed that many mass fragments are present in at least 75% of the associated molecular features (9 and 6 for ring and tail Mass2Motif, respectively) with a few mass fragments present in nearly all associated molecular features. (c,d) Examples of annotated xenoamicin A modified structures in which only the ring Mass2Motif was found. Indeed, we observe that VarQuest annotates a modified amino acid (addition and loss of) in the tail region of xenoamicin A indicated in orange. (e,f) Examples of annotated xenoamicin B modified structures in which only the ring Mass2Motif was found. Indeed, we observe that VarQuest annotates a modified amino acid (double water addition, loss of methyl) in the ring region of xenoamicin B indicated in orange. The structures of xenoamicin A and B differ in one methyl group on the amino acid highlighted in orange in (f) where B has an isobutyl group and A an isopropyl group. In fact, the structure of xenoamicin A is correctly annotated by VarQuest to this fragmented doubly charged ion.

**Author Contributions:** Conceptualization, M.E., S.R. and J.J.J.v.d.H.; methodology, J.J.J.v.d.H. and M.E.; software M.E., C.C., M.W., J.W. and S.R.; validation, J.J.J.v.d.H., K.B.K., A.M.C.-R. and L.-F.N.; formal analysis, J.J.J.v.d.H., M.E. and K.B.K.; supervision, J.J.J.v.d.H.; writing—original draft preparation, J.J.J.v.d.H. and M.E.; writing—review and editing, J.J.J.v.d.H., M.E., K.B.K., A.M.C.-R., L.-F.N., J.W., C.C., M.W., S.R., M.H.M. and P.C.D.; visualization, M.E., J.J.J.v.d.H. and K.B.K.; funding acquisition, J.J.J.v.d.H., P.C.D. and M.H.M.

**Funding:** J.J.J.v.d.H. was funded by an ASDI eScience grant, ASDI.2017.030, from the Netherlands eScience Center—NLeSC. K.B.K. was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MIST) (No. 2019R1F1A1058068). A.M.C.R. and P.C.D. were supported by USA National Science Foundation grant IOS-1656481. S.R. and J.W. were supported by EPSRC EP/R018634/1. SR was supported by BBSRC BB/R022054/1.

**Acknowledgments:** The authors thank all research groups that made their metabolomics data publicly available so it could be reused in the current study. Yannick Djoumbou Feunang (University of Alberta, Canada) is thanked for his support with the use of ClassyFire and Ricardo da Silva (UCSD, USA) for scientific discussions and feedback on the methodology and workflow.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **In Silico Optimization of Mass Spectrometry Fragmentation Strategies in Metabolomics**

**Joe Wandy 1,†, Vinny Davies 2,†, Justin J. J. van der Hooft 3, Stefan Weidt 1, Rónán Daly <sup>1</sup> and Simon Rogers 2,\***


Received: 28 August 2019; Accepted: 2 October 2019; Published: 9 October 2019

**Abstract:** Liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) is widely used in identifying small molecules in untargeted metabolomics. Various strategies exist to acquire MS/MS fragmentation spectra; however, the development of new acquisition strategies is hampered by the lack of simulators that let researchers prototype, compare, and optimize strategies before validations on real machines. We introduce Virtual Metabolomics Mass Spectrometer (ViMMS), a metabolomics LC-MS/MS simulator framework that allows for scan-level control of the MS2 acquisition process in silico. ViMMS can generate new LC-MS/MS data based on empirical data or virtually re-run a previous LC-MS/MS analysis using pre-existing data to allow the testing of different fragmentation strategies. To demonstrate its utility, we show how ViMMS can be used to optimize *N* for Top-N data-dependent acquisition (DDA) acquisition, giving results comparable to modifying *N* on the mass spectrometer. We expect that ViMMS will save method development time by allowing for offline evaluation of novel fragmentation strategies and optimization of the fragmentation strategy for a particular experiment.

**Keywords:** liquid chromatography–mass spectrometry (LC/MS); fragmentation (MS/MS); data-dependent acquisition (DDA); simulator; in silico

#### **1. Introduction**

Liquid chromatography (LC) tandem mass spectrometry (MS/MS) is commonly used to identify small molecules in untargeted metabolomics. In this setup, chemicals elute through the liquid chromatographic column at different retention times (RTs) before entering the mass spectrometer and potentially undergoing fragmentation. Fragmentation produces distinct patterns of fragment peaks at different mass-to-charge ratios (*m*/*z*s) that can be used to annotate chemical structures [1,2]. The choice of fragmentation strategy, which determines how precursor ions are selected for further fragmentation in tandem mass spectrometry, is an important factor affecting the coverage and quality of MS/MS spectra available for subsequent analysis. Many strategies exist to perform fragmentation, including data-independent acquisition (DIA) and data-dependent acquisition (DDA), and new and improved fragmentation strategies are constantly being introduced [3,4]. However, evaluating and comparing different strategies is challenging since the chemicals present in the samples in untargeted metabolomics studies are generally unknown, making it hard to judge whether a certain strategy leads to optimal MS/MS coverage. Currently, this is usually done by trying different fragmentation settings on the instrument followed by manual inspection for the samples of interest.

An appealing alternative way to evaluate fragmentation strategies is using a simulator, which can replicate the underlying LC-MS/MS processes and allow researchers to prototype and compare strategies before validation on the actual MS instrument. Although some mass spectrometry simulators exist they are typically focused on proteomics and do not include simulation of the MS2 acquisition strategy within a chromatographic run [5–10]. Additionally, existing simulators do not allow for real-time control of scan events (such as programmatically determining which *m*/*z* ranges to scan at a particular retention time), a crucial function for developing novel fragmentation strategies that can be controlled through libraries available with modern mass spectrometers, e.g., using the Instrument Application Programming Interface (API) available for Thermo Tribrid instruments [11] that has begun to generate interest within the mass spectrometry community (e.g., [12]).

In this work, we introduce Virtual Metabolomics Mass Spectrometer (ViMMS) a modular LC-MS/MS simulator for metabolomics that allows for real-time scan-level control of the MS2 acquisition process in silico. ViMMS works by creating a set of chemical objects, each with its own chromatogram, RT and intensity, fragmentation spectra and propensity to generate particular adducts. These can be created from a list of known metabolites (for example from the Human Metabolome Database, HMDB [13]) or from chromatographic peaks extracted in experimental .mzML files. A selection of controllers that implement different fragmentation strategies are available, including standard Top-*N* strategies but also MS1-only simulation as they also form a part of LC-MS/MS experiments. Using the appropriate controllers, users can benchmark and test different strategies and obtain simulated results in mzML format (the entire simulator state can also be saved for inspection later).

The idea of ViMMS is to offer the functionality of simulating MS1 and MS2 generation processes, but also to be modular enough that additional features are easily integrated in the framework. The development of a simulator such as ViMMS makes it possible to optimize MS/MS acquisition in silico without having to use valuable instrument time. Please note that our proposed tool is not a new acquisition strategy itself, but rather a framework that makes the development, testing, and benchmarking of such acquisition strategies easier. All code and examples are available at https://github.com/sdrogers/vimms and demonstrate how our simulator framework can be used in an interactive setting via Jupyter Notebooks. We demonstrate the utility of ViMMS with two examples: first we perform an experiment to vary *N* (the number of precursor peaks selected for fragmentation in standard Top-*N* DDA fragmentation strategy) in silico as well as the dynamic exclusion window (DEW) that is used to exclude ions from fragmentation for a certain time and evaluate how changing those parameter settings affects fragmentation coverage and the quality of MS1 peak picking. We then validate these results by comparing the ViMMS output against experimental data when both *N* and dynamic exclusion window parameters are varied. Secondly, we use the simulator to reproduce key results from a novel fragmentation strategy, data-set-dependent acquisition (DsDA) [4], demonstrating how ViMMS can be used to compare fragmentation strategies before implementation in an actual MS instrument.

#### **2. Materials and Methods**

#### *2.1. LC-MS/MS Materials and Methods*

#### 2.1.1. Samples

Beer and urine samples (labelled *multi-beer* and *multi-urine*) from a previously published study [14] are used in our experiments. Here we briefly summarize the sample preparation and analytical platform for the *multi-beer* and *multi-urine* in [14]. 19 different beers were collected from bottles over a period of 5 months and frozen immediately after sampling. 22 urine samples were obtained from a clinical trial of an anonymized cohort of elderly hypertensive patients who were administered several drugs, including antihypertensives. 5 μL of beer/urine was extracted in 200 μL of chloroform/methanol/water (1:3:1) at 4 ◦C, vortexed for 5 min at 4 ◦C and centrifuged for 3 min (13,000 g) at 4 ◦C. The resulting supernatant was stored at −80 ◦C until analysis, and a pooled aliquot of the 22 selected urine samples and 19 beer samples were prepared prior to LC-MS/MS runs.

On top of the existing *multi-beer* and *multi-urine* samples, we also introduce newly generated beer data in this study. One beer extract (labelled *BeerQCB*) was selected for repeated and reproducible sampling across this experiment. An English premium bitter (Black Sheep Ale, 4.4 %) was purchased from a local supermarket. Beer metabolites were extracted by addition of chloroform and methanol to the ratio of 1:1:3 (*v*/*v*/*v*), as previously described, except for the total volume being scaled up to 100 mL. The solution was thoroughly mixed using a vortex mixer, before protein and other precipitates removed by centrifuging at 14,000 rpm at 4 ◦C for 10 min. The supernatant was removed, and aliquots stored at −80 ◦C until needed.

#### 2.1.2. Liquid Chromatography

All samples underwent liquid chromatography separation under the following experimental conditions: a Thermo Scientific UltiMate 3000 RSLC liquid chromatography system was used for HILIC separation with a SeQuant ZIC-pHILIC column using a gradient elution with (A) 20 mM ammonium carbonate and (B) acetontrile. 10 μL of each sample was injected onto the column with initial conditions of 80% (B). A linear gradient from 80% to 20% (B) over 15 min, a wash of 5% (B) for 2 min, before re-equilibration at 80% (B) for 7 min (QE) or 9 min (Fusion). A constant flow rate of 300 μL/min was used. The column oven was maintained at a constant temperature of 25 ◦C (QE) or 40 ◦C (Fusion). Blank runs, quality control samples and three standard mixes were prepared according to the standard procedures at Glasgow Polyomics [14,15].

#### 2.1.3. Mass Spectrometer Acquisition

The *multi-beer* and *multi-urine* datasets were acquired using a Q-Exactive orbitrap mass spectrometer for LC-MS/MS. All full-scan spectra were acquired in positive ion mode only, with a fixed resolution of 70,000, with mass range 70–1050 *m*/*z*. Ions were isolated with 1.0 *m*/*z* width and fragmented with stepped HCD collision energy of 25.2, 60, 94.8% for both positive and negative ion modes. Fragmentation spectrum were acquired with the orbitrap mass analyzer with resolution of 17,500. Top 10 ions with an intensity threshold ≥1.3E5 were selected for fragmentation and then added to a dynamic exclusion window for 15 s. For more mass spectrometer acquisition details, we refer to [14].

The new validation dataset (*BeerQCB*) to simulate fragmentation performance in Section 3.4 was generated using an orbitrap fusion tribrid-series mass spectrometer. All full-scan spectra were acquired in positive ion mode only, with a fixed resolution of 120,000, with mass range 70–1000 *m*/*z*. To investigate the instrument performance with differing Top-*N* and dynamic exclusion windows, filters such as intensity threshold and monoisotopic peak determination were not used. This allowed for a consistent number tandem MS scans to be acquired under varying Top-*N* parameters and dynamic exclusion window (DEW), for *N* = (1, 2, 3, 4, 5, 10, 15, 20, 35, 50) and *DEW* = (15, 30, 60, 120). Ions were isolated with 0.7 *m*/*z* width and fragmented with fixed HCD collision energy of 25%. Fragmentation spectrum were acquired with the orbitrap mass analyzer with resolution of 7500.

#### 2.1.4. Data Transformation

Raw files from acquisition were converted into mzML format using MSconvert (Proteowizard). In the evaluation of the Top-*N* controller in Sections 3.2 and 3.3, two of the *multi-beer* samples (labelled *multi-beer-1* and *multi-beer-2*) and two of the *multi-urine* samples (labelled *multi-urine-1* and *multi-urine-2*) are used. In Section 3.5 to evaluate DsDA, all *multi-beer* and *multi-urine* samples are used. Section 3.4 uses only the *BeerQCB* samples.

#### *2.2. Computational Methods*

#### 2.2.1. Overall Framework

The overall schematic for ViMMS can be found in Figure 1. ViMMS works by first creating *chemical* objects which represent the possible metabolites in a sample (Section 2.2.2). These objects contain information which defines how each chemical appears when scanned by the virtual mass spectrometer (yellow box in Figure 1). To get the information to fill the chemical objects, we create a database of spectral features from experimental data from which we can sample (Section 2.2.3). Regions of interest (ROIs) representing groups of mass traces that could potentially form chromatographic peaks are also extracted from experimental files and assigned to chemicals (Section 2.2.4). Unlike other simulators, e.g., [5–9], chemical objects can also be associated with fragment spectra that could themselves be extracted from spectral databases or generated using *in silico* fragment prediction methods (Section 2.2.5). *In silico* scan simulation in ViMMS (yellow box in Figure 1) proceeds as follows. A virtual mass spectrometer takes the list of chemical objects as input and generates MS1 and MS2 scans at appropriate RTs (Section 2.2.6). Scan parameters are determined by a controller that implements a particular fragmentation strategy. (Section 2.3). The proposed framework is designed to be completely modular such that a variety of situations and different fragmentation strategies can be tested. Finally, using the psims library [16] simulated results can be written as mzML files for further analysis in other tools. The entire state of the simulator over time can also be saved for inspections using the built-in pickle function in Python.

**Figure 1.** An overall schematic of ViMMS. (**A**) Synthetic Sample Workflow: Chemical objects in ViMMS can be created by sampling for compound formulae, mz, RT, and intensity values from the spectral feature database. (**B**) Existing Sample Workflow: alternatively, chemical objects can be created by extracting regions of interest from a single mzML file and converting them to chemical objects. (**C**) Yellow box: chemical objects are processed in the virtual mass spectrometer during in silico scan simulations. A controller performs parameter updates on the mass spectrometer depending on the fragmentation strategy implemented in the controller. Simulated results can be written as mzML files.

#### 2.2.2. Chemical Objects

Chemical objects in ViMMS can be created in two ways—by sampling chemical formulae from a relevant database and then associating them with chromatographic peaks, or by re-running an existing analysis. In the first method, formulae are first sampled from a metabolite database such as HMDB [13] (Synthetic Sample Workflow in Figure 1). Each chemical is given a starting RT (the first RT at which they will appear when scanned) and a maximum intensity value by sampling from the spectral feature database in Section 2.2.3. Based on the spectral information they contain, the chemical objects are able to generate MS1 peaks for the relevant adducts and isotopes, with the intensity of the chemical object split between the various adduct and isotope combinations and the *m*/*z* values being calculated based on the chemical's assigned formula. Distributions over adduct intensities can be specified by the user. Finally, an ROI with a similar maximum intensity to the chemical is chosen (Section 2.2.4), and fragment peaks assigned (Section 2.2.5). It is also possible to generate multiple related samples of chemicals, whether biological or technical replicates. To do this we introduce independent Gaussian noise to the maximum intensity values and allow chemicals to be excluded from samples with a certain dropout probability.

When real data is available and the user wishes to re-run the same data under different fragmentation strategies, chemicals can also be extracted from an existing mzML file (Existing Sample Workflow in Figure 1). Here ROIs in the file are extracted and converted to chemical objects (we make the simplifying assumption that each ROI corresponds to a single unknown chemical). Unknown chemicals created in this manner will generate a single trace in the output (as opposed to multiple traces where multiple adducts and isotopes are generated.

#### 2.2.3. Spectral Feature Database

To generate data, we create a database of spectral features extracted from actual experimental data. This database is used to sample the features associated with a chemical including the *m*/*z*, RT, and maximum intensity values of observed MS1 and MS2 peaks, as well the number of fragment peaks found for typical scans. During simulation, the database is also used by the controller to sample for the duration of each scan (Section 2.2.6). To construct this database, users provide their data in mzML format. pymzML [17] is then used to load the input mzML files, extract the necessary features and construct the database which is stored as Python pickled format. In the case of Synthetic Sample Workflow in Figure 1, the database also stores information on the small molecules extracted from an external metabolite database such as HMDB.

#### 2.2.4. ROI Extraction and Normalization

ROIs are extracted using our Python [18] re-implementation of the ROI extraction procedure of XCMS's CentWave algorithm [19] originally available in the R programming language [20], although ROIs could have easily been extracted with alternative software such as MZmine [21]. First, spectra in an mzML file are loaded using pymzML. Then the ROI extraction algorithm loops over all scans, extracting the raw traces (recorded in centroid mode) from observed spectra. This results in a list of peak features of (*m*/*z*, RT, intensity) values. Features are first filtered to remove any that have an intensity below some user-defined threshold. The current *m*/*z* value is matched to find existing ROIs that it could fall into within a mass tolerance window, defined as the window above and below the mean *m*/*z* of the ROI. If no match exists, then this feature forms its own ROI and gets added to the list of existing ROIs. ROIs that are not added to are closed and put aside. ROIs that contain fewer data points than a user-defined threshold parameter are discarded. Finally, ROIs are normalized so their *m*/*z* values are centered around 0, RT values start at 0, and intensity values are scaled to have a maximum of 1, such that they can be assigned to chemicals.

#### 2.2.5. MS2 Scan Generation

The MS2 scan generation process in ViMMS is modular and allows for different methods to be selected for generating and associating MS2 fragments to chemical objects. In our current implementation, two baseline methods are provided. The first is to assign *m*/*z* and intensity values to fragment peaks by randomly sampling from the spectral feature database in Section 2.2.3. This works for experiments where we can make the simplifying assumptions that fragment peaks are completely independent across scans. To reflect a more realistic scenario where groups of fragment peaks may co-occur in multiple fragmentation spectra [22], we provide a second method of assigning MS2 peaks in a fragmentation scan by following a truncated Chinese Restaurant Process (CRP) [23]. This allows for a fragment peak to have a greater likelihood to be selected again if it has been selected before in

previous scans. The truncated CRP process follows the standard process of a CRP, but prevents the same MS2 peak being assigned to the same fragmentation spectra more than once. The modular nature of ViMMS means that it would be straightforward to incorporate MS2 prediction methods such as CFM-ID [24,25] or NEIMS [26].

#### 2.2.6. Scan Time

For accurate simulation of duty cycles, we sample scan durations of MS1 and MS2 scans from the spectral feature database in Section 2.2.3. Based on the MS level of the previous scan, as well as that of the scan about to be undertaken, the time for the scan about to take place is drawn from the times of those scans in the database which represent the relevant scan transition. The only time that this is not the case is when the DsDA controller is used (Section 2.3.3) as we have a fixed timing schedule. Scan times sampled in this manner will almost always not correspond to values observed in the original files from which the ROIs were extracted. This causes some difficulty with determining the intensity and *m*/*z* values of the chemicals that would be observed at this time, as they will not have previously been observed. To overcome this, we use a simple interpolation scheme (the trapezium rule) between the two nearest scans, which gives us estimates of the intensity and *m*/*z* values that would be expected for any chemical object at the previously unobserved RT.

#### *2.3. Controllers*

ViMMS is designed to be flexible, and to achieve this aim, we separate the simulation of mass spectrometer (generating spectra from chemicals) and the fragmentation strategy (determining which precursors to fragment) in the framework. Generating spectra from chemicals is implemented inside a virtual mass spectrometer, while different fragmentation strategies are implemented as controllers. To simulate a scan, the virtual MS iterates through chemical objects that each generate MS1 or MS2 peaks depending on the current RT and the MS level requested by the controlled. The virtual MS is also responsible for broadcasting events, such as when a new scan is generated or when acquisition is started or has been finished. Controllers can subscribe to these events and act upon them, for example by directing the virtual MS to perform different scans according to the current fragmentation strategy (yellow box in Figure 1). It is relatively straightforward to implement various controllers that perform different fragmentation strategies. Each controller is designed such that it is separate from the virtual mass spectrometer, allowing controllers to interact with either the virtual MS or with an actual MS instrument through an application programming interface as a future work.

#### 2.3.1. MS1 Controller

The MS1 controller is designed to replicate the process of generating MS1 full scans by a mass spectrometer. Given a start and end RT range, the MS1 controller steps through time and generates scans from chemicals. A scan therefore consists of *m*/*z* and intensity pairs for those chemicals that are currently eluting. The timings of the scans are determined based on experimental data by sampling from the spectral feature database, as described in Section 2.2.6. Scan results can be exported as an mzML file and viewed in standard programs such as TOPPView [27].

#### 2.3.2. Top-*N* DDA Controller

The Top-*N* controller performs standard DDA acquisition. In each duty cycle, the controller first performs an MS1 scan to establish the most intense precursor ions, followed by up to *N* fragmentation scans depending on the number of precursor ions selected for further fragmentation. To generate fragmentation scans, the Top-*N* precursor ions (in descending order of intensities) in the initial MS1 scan are isolated and fragmented. A dynamic exclusion window (DEW) is used to prevent precursor ions that have recently been analyzed from being fragmented again. In the controller, we also provide a threshold on the minimum MS1 intensity for a precursor ion to be selected for fragmentation.

#### 2.3.3. DsDA Controller

The DsDA [4] controller attempts to optimize fragmentation strategy over several similar samples. DsDA keeps track of which precursor ions have been fragmented in previous samples, and prioritizes those that have high MS1 intensity and have either not been fragmented, or have been fragmented producing low quality MS/MS spectra.

Implementing the full DsDA analysis pipeline in ViMMS requires the following process. First the DsDA controller, written in Python, calls the Top-*N* controller to perform an initial DDA analysis (for the first sample) using a fixed timing schedule. Once the initial DDA analysis is complete, the resulting mzML file is analyzed using the original DsDA scan prioritization algorithm written in R (available from https://github.com/cbroeckl/DsDA). This involves picking peaks and comparing the picked peaks to what has previously been fragmented. This information is used to determine at what *m*/*z* and RT locations new fragmentation scans should be performed. The prioritization algorithm attempts to get the highest quality MS/MS spectra for as many different precursor ions as possible. To avoid missing novel precursor ions that may not have appeared before, DsDA also includes an option called 'MaxDepth' which increases the probability of sampling rare features that the prioritization algorithm was originally designed to devalue. The resulting schedule is used for the analysis of the next sample using the Python-based DsDA controller, a process that is automatically repeated until all the samples have been analyzed.

#### **3. Results**

#### *3.1. MS1 Simulations*

To demonstrate the ability of ViMMS to simulate MS1 scans generated by chemicals from a metabolite database, we create a sample consisting of 6500 chemicals from HMDB and use the 19 full-scan experimental beer data from the *multi-beer* dataset to generate the spectral feature database (Synthetic Sample Workflow in Figure 1). The MS1 controller (Section 2.3.1) in ViMMS is used to perform a full-scan MS1 simulation. Simulation results are exported as an .mzML file and loaded into Jupyter Notebook for further analysis (all example notebooks can be found in our code repository).

Figure 2 shows examples of snapshots of full-scan chromatograms in TOPPView [27] for the actual experimental *multi-beer-1* sample (Figure 2A) and a simulated sample created in ViMMS (Figure 2B). The resulting spectra show similar characteristics to each other in terms of the shapes of the peaks and how they are observed in a full-scan samples. Individually the peaks appear at the different locations and with different profiles as a result of the simulation process, with the aim here not to directly copy the real beer sample, but create a sample with similar overall properties. A further demonstration of the similarity of the samples can be seen in boxplots of the XCMS picked peaks characteristics (RT, *m*/*z*, log intensity) shown in Figure 1 of the supplementary materials. A user could also produce similar results with alternative peak picking algorithms such as MZmine.

#### *3.2. Top-N Simulations*

We now show an example of using the Top-*N* controller, available from ViMMS (described in Section 2.3.2). This controller accepts as input a list of chemicals objects and performs MS2 fragmentation simulation by isolating precursor (MS1) ions and producing scans containing product (MS2) peaks. To check that our Top-*N* simulation processes reflect reality, we conduct an experiment where existing chromatographic peaks from the *multi-beer-1* fragmentation file are loaded into the simulator (Existing Sample Workflow in Figure 1). Top-10 DDA fragmentation is performed using the Top-*N* controller and the resulting output compared to the original input file. The aim here is to assess how much our simulated file differs from the actual fragmentation file given the same input ROIs and similar fragmentation parameters (*N*=10, DEW=15 s). A visual snapshot of resulting spectra in TOPPView can be found in Figure 2 of the supplementary materials and a comparison of when and where the fragmentation events occurred can be seen in Figure S3 of the Supplementary Materials.

**Figure 2.** Real and simulated example outputs. (**a**) A region from the *beer-multibeers-1* LC/MS data. (**b**) A region from an LC/MS datafile generated by randomly generating peaks (mz, RT, intensity, chromatographic shape) from a database of peaks extracted from all *multi-beer* data.

Figure 3a shows the number of MS1 and MS2 scans completed over time for the true and simulated scenarios. The total number of scans is very similar in both cases, as can also be seen in Table S1 in the Supplementary Materials. Figure 3b shows that the situations in which the simulator and actual data do not match typically involve low intensity precursors. Investigating the differences between the simulation and the real data in detail, we observe what seems to be unpredictable behavior from the mass spectrometer. For example, in some cases it fragments 9 instead of 10 ions (even when other ions are present above the minimum intensity that should not be excluded due to a previous fragmentation event), and on some occasions it fragments ions despite them being below the minimum intensity threshold. These differences might be due to our handling of the data in centroid mode (and the real MS controller operating in profile mode), and there will also be a small difference due to our randomly sampled scan times. Overall, however, we are confident that the behavior of the simulator is close enough to reality and that our Top-*N* controller captures the most important fragmentation events and can be used for further experiments in subsequent sections.

(**a**) Cumulative number of scans (**b**) Intensities of matched precursors

**Figure 3.** Figures showing (**a**) the cumulative number of MS1 and MS2 scans over time for real and simulated data, and (**b**) matched precursors from the actual *multi-beer-1* data to the simulated data. Most precursors that could be matched (blue) have higher intensities than those that cannot be matched (red).

#### *3.3. Varying N in Top-N Simulations*

Choosing *N* in DDA is a critical part of method development. Increasing *N* ought to give better MS/MS coverage as more ions are fragmented. However, increasing *N* too far will result in many ions being fragmented below their minimum intensity threshold (even if they were above the minimum during the initial MS1 scan). In addition, larger *N* reduces the frequency of MS1 scans, which will have a detrimental effect on MS1 peak picking. ViMMS allows us to objectively investigate this trade-off, providing a strong evidence base for method development.

Consider a typical scenario where within an experimental batch, only Top-*N* DDA is performed and no full-scan data are available (an alternative scenario where both full-scan and Top-*N* data are acquired is also considered in Section S3 of the Supplementary Material). In this case, it is standard to use only peaks picked from the MS1 scans (which we call *MS1 features*) in the DDA fragmentation files for further analysis. As already mentioned, increasing *N* could result in greater fragmentation coverage since more precursor ions are fragmented but also potentially fewer MS1 features from the fragmentation file due to fewer MS1 data points available for peak picking. Evaluating the best Top-*N* parameter that results in an optimal trade-off between fragmentation coverage and peak picking performance can be challenging on real data, but it is possible in a simulated environment such as ViMMS.

To perform this simulated experiment, first an existing full-scan file is loaded into ViMMS. The Top-*N* DDA controller (Section 2.3.2) can be run with a variety of different *N*s and the results evaluated. Based on these results we can choose the best *N* for future experiments on similar samples for that mass spectrometer. Given actual experimental full-scan MS1 files, the effect of varying *N* to simulated fragmentation coverage and peak picking quality can be evaluated with respect to the ground truth MS1 features found in both the full-scan and fragmentation files. For evaluation, the following definition of positive and negative instances (illustrated in Figure 4) is proposed:

**Figure 4.** Definitions of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN) for performance evaluation of Top-*N* DDA fragmentation strategy. The blue circle in the Venn diagram refers to all MS1 features that are fragmented above the minimum MS1 intensity threshold, the green circle refers to all MS1 features found by XCMS' CentWave from the full-scan file, while the red circle refers to all MS1 features found by CentWave from the fragmentation file.


**False Negatives (FN)**: MS1 features not from ground truth (not found in fragmentation file but found in full-scan file) that are not fragmented or fragmented below the minimum intensity threshold.

**True Negatives (TN)**: MS1 features not from ground truth (found in fragmentation file but not found in full-scan file) that are not fragmented or fragmented below the minimum intensity threshold.

It is worth noting that this evaluation strategy uses picked peaks as a ground truth. Peak picking is a process known to not be entirely accurate, although we believe that this represents a meaningful evaluation metric given the widespread use of peaking picking in metabolic analyses.

In our experiment, four existing Top-10 DDA files from the *multi-beer* and *multi-urine* samples are loaded into ViMMS using the Existing Sample Workflow in Figure 1. For each sample (labelled *multi-beer-1*, *multi-beer-2*, *multi-urine-1* and *multi-urine-2* respectively), DDA fragmentation is simulated using the Top-*N* controller in ViMMS. The parameter *N* for Top-*N* is varied in the range *N* = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, ... , 100) in the simulator, while other parameters are fixed following Section 3.2. In this experiment we also fix the dynamic exclusion window (DEW) to 15 s and the minimum MS1 intensity to fragment to 1.75 × 105 based on the actual parameters that were used to generate the data. Our results are evaluated in terms of precision, recall, numbers of peaks picked and *F*<sup>1</sup> score (*Precision* = *TP*/(*TP* + *FP*), *Recall* = *TP*/(*TP* + *FN*), *F*<sup>1</sup> = (2 ∗ *Precision* ∗ *Recall*)/(*Precision* + *Recall*)). To obtain the ground truth for evaluation, we performed peak picking using XCMS' CentWave on both the full-scan and simulated fragmentation files using the parameters in [14].

Using the simulator, we observe that increasing *N* produces an initial increase followed by a decrease in precision (Figure 5a), suggesting that with greater *N*, more peaks in the ground truth are being fragmented but this benefit is rapidly cancelled out by a fast increase in the number of false positives. Similarly, recall increases with *N* initially but decreases (Figure 5b), suggesting that with greater *N*, more precursor ions from ground truth MS1 features are fragmented—up to the point when all possible precursor ions above the minimum intensity threshold of 1.75 × 105 are selected. We can explain this trade-off between precision and recall due to the fact that as fragmentation coverage increases (with greater *N*), fewer ground truth peaks are detected from the fragmentation files (Figure 5c). The quality of MS1 chromatographic peak shapes in the fragmentation file becomes poorer since more duty cycle time is spent performing MS2 than MS1 scans, reducing the number of good-quality MS1 features that can be found by XCMS from the fragmentation files. Assessing the *F*<sup>1</sup> score (Figure 5d), which is the harmonic average of precision and recall and is representative of overall fragmentation performance, we see that the best *F*<sup>1</sup> score can be found at *N* = 10. This is the same as the actual value of *N* used to generate the data (*N* = 10) obtained by expert judgement. The results here demonstrate how a simulated environment such as ViMMS can be used to quantify the trade-off between fragmentation coverage and peak picking performance.

**Figure 5.** *Cont*.

**Figure 5.** Figures showing (**a**) precision, (**b**) recall, (**c**) the number of peaks picked, and (**d**) F1 score for peak picking performance as *N* changes in Top-*N* DDA experiments in ViMMS based on the classification specifications given in Figure 4.

#### *3.4. Varying Multiple Parameters in Top-N Simulations*

To validate the use of ViMMS for Top-*N* method development, we now show how ViMMS compares to data generated at a wide range of *N* and *DEW* times. In the previous Section 3.3 DEW is fixed to 15 s for all values of *N*; however our hypothesis is that the best fragmentation performance can be obtained by optimizing both parameters simultaneously. Here we evaluate the ability of ViMMS to suggest the parameter combinations that provide the best fragmentation performance and compare the results to actual experimental data.

To validate simulated results, we generated a large real dataset in which the same sample, *BeerQCB* (introduced in Section 2.1) was fragmented using all combinations of *N* = (1, 2, 3, 4, 5, 10, 15, 20, 35, 50) and *DEW* = (15, 30, 60, 120). The minimum MS1 intensity threshold to fragment was completely disabled for this experiment to allow a consistent number of MS scans to be acquired under the different scenarios (see Section 2.1.3). To generate simulated data in ViMMS, we extracted ROIs from a full-scan MS1 analysis of the *BeerQCB* sample using the Existing Sample Workflow in Figure 1. These ROIs were used as input to the Top-*N* controller using the same ranges of parameters for *N* and *DEW* as the real data. For evaluation, peak picking using XCMS was performed on the full-scan and fragmentation mzML files, and fragmentation performance was computed on both real and simulated data following Section 3.3.

Inspecting parameter combinations in the heatmaps of Figure 6 we see a high level of agreement between the performance obtained from the simulated data, and that obtained from the real measurements. Optimal performance is observed in both cases for *N* = 20 and *DEW* = 30*s* although regions of high performance for both real and simulated results can be found at *N* = (10, 15, 20) and *DEW* = (15, 30, 60). Ranges of *N* that are either too large or too small demonstrate decreased performance in Figure 6a,b. Please note that the difference in optimum value in this experiment when compared with the previous one is explainable due to the use of a different MS platform (Q-Exactive orbitrap versus Fusion Tribid orbitrap).

Overall our findings demonstrate that ViMMS can be used to optimize Top-*N* acquisition methods in silico before actually running the experiment on a real MS instrument—something of great benefit to the community. Additional results are given in Section 4 of the supplementary materials.

**Figure 6.** Fragmentation performance in terms of F1 score for (**a**) an actual *BeerQCB* sample, (**b**) simulated results from ViMMS.

#### *3.5. DsDA Simulations*

Finally, we show how ViMMS can be used to benchmark fragmentation strategies that work on multiple samples, such as DsDA [4] (Section 2.3.3). To benchmark DsDA using ViMMS, we generate synthetic data where samples are almost identical using the Synthetic Sample Workflow in Figure 1. To do this, 6500 chemical objects are generated by sampling formulae from HMDB (the *multi-beer* data is used to construct the spectral feature database). 20 samples are created from these chemical objects by adding independent Gaussian noise (with standard deviation set to 10,000) to the maximum intensities of the chemicals in the original sample. These 20 samples will have peaks in the same RT and *m*/*z* locations but with a slight variation in how intense they are. We compare the results from DsDA, DsDA MaxDepth and Top-4 DDA fragmentation strategies (*N* = 4 was chosen as that is the default option for DsDA). Following the original DsDA study, performance is evaluated in terms of how many of the aligned peaks found by XCMS are successfully fragmented above a minimum intensity of 1.75 × <sup>10</sup>5. Our experiment shows that DsDA and DsDA MaxDepth clearly outperform Top-4 DDA strategy in terms of how many chemicals they successfully fragment (Figure 7a). This is consistent with the results from the original DsDA study.

**Figure 7.** Top 4, DsDA and DsDA MaxDepth performance for in terms of the number of chemicals fragmented for (**a**) the simulated samples, (**b**) the *multi-beer* samples and (**c**) the *multi-urine* samples.

As a further investigation, we also compare the methods on the *multi-beer* and *multi-urine* data using the Existing Samples Workflow in Figure 1. ROIs are extracted from the full-scan mzML files of the two datasets and converted into chemical objects allowing us to virtually re-run the data under the DsDA fragmentation strategy using real chromatographic peaks. The result in Figure 7b,c shows that unlike previous results on synthetic data, here Top-4 DDA fragmentation strategy clearly gives the best performance in fragmenting the most peaks picked by XCMS, and no difference can be observed between DsDA and DsDA MaxDepth. Since DsDA prioritizes precursor peaks to fragment in a run based on previously seen runs, we explain the results here by the fact that the beer and urine samples are not similar enough for the DsDA strategy to be effective.

To confirm this, we return to our synthetic data and investigate the performance of the different methods as increasing numbers of chemicals are randomly removed from each sample. We consider scenarios where we randomly remove 0%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50% of chemicals from each samples, meaning that on average samples will become less similar. In these samples, a given chemical object will appear in any two samples with a probability of 1, 0.90, 0.81, 0.72, 0.64, 0.56, 0.49, 0.42, 0.36, 0.30 and 0.25, respectively. In all cases, we generate 5 samples to run through the DsDA analysis. Figure 8 shows the number of chemicals fragmented above a minimum intensity of 1.75E5 after all five samples are processed by both the DsDA and Top-4 DDA fragmentation strategies in the different scenarios. The results show that DsDA performs well when the samples are similar, but as the samples becomes less similar the performance drops and DsDA is comfortably outperformed by the Top-4 DDA fragmentation strategy. Hence, as samples become more different, a Top-4 strategy should be preferred, but where samples are very similar (e.g., technical replicates), DsDA is likely to be more efficient.

Such experiments would be very challenging to do in reality. This example demonstrates how ViMMS can provide insight into the scenario in which a certain fragmentation strategy will be successful.

**Figure 8.** DsDA and Top 4 DDA performance in terms of the number of chemicals fragmented over multiple simulated datasets with varying dropout. In each scenario a percentage of chemicals are dropped from the sample (0%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%), meaning that on average samples will become less similar. In these samples, a given chemical object will appear in any two samples with a probability of 1, 0.90, 0.81, 0.72, 0.64, 0.56, 0.49, 0.42, 0.36, 0.30 and 0.25, respectively.

#### **4. Discussion and Conclusions**

In this paper, we introduce ViMMS, the first simulator specifically targeted at mass spectrometry fragmentation-based metabolomics that is modular, easily extensible, and can be used for the development, testing, and benchmarking of different fragmentation strategies. Processing MS2 data (particularly identifying spectra) is generally considered to be more challenging in metabolomics than in proteomics [28]. An in silico simulator such as ViMMS, which can be used to generate realistic-looking full-scan and fragmentation spectra based on either existing data or by sampling from a database of known metabolites, can be used to alleviate this problem. In this work, our experiments show how our proposed simulator can be used to help optimize acquisition methods in silico through two examples: Top-*N* DDA fragmentation, and DsDA. It is also important to note that our simulator could be used to create datasets on which novel data processing methods could be benchmarked.

The results from our experiments show that the spectral data generated from ViMMS have a strong resemblance to data produced from MS instruments. Our experiments with the Top-*N* and DsDA controllers in Sections 3.2–3.5 demonstrate that despite some minor differences in output, the proposed simulator framework can be useful in investigating, understanding and comparing the characteristics of different fragmentation strategies. Furthermore, we provide insights in when best to use Top-*N* and DsDA fragmentation methods; something that is not that easily and cheaply done using experimental data.

When developing acquisition methods, selecting the *N* that provides the highest fragmentation performance and number of detected peaks can be challenging, particularly in the typical scenario where the full-scan data is assumed to be absent and peak picking quality from fragmentation files is therefore important for subsequent analysis. We demonstrated how ViMMS can be used to suggest *N* for use for similar future samples on the MS instrument. Our results show how ViMMS can be used to explore parameter combinations for a particular fragmentation strategies in silico for existing data, virtually re-run existing data under an alternative strategy and benchmark existing fragmentation strategies (like DsDA) with minimal modifications under the proposed framework. This is a capability not available from other simulators [5–10]. On top of fragmentation data, ViMMS can also be used to benchmark and perform comparative evaluation of different LC-MS data processing algorithm, such as peak picking and retention time alignment [29] in a more controlled manner. In each of these cases, ViMMS also has the potential to help develop new methods by allowing them to be evaluated in a scenario where the ground truth is known, and little machine time needed.

The modular nature of ViMMS means that as future work, we can extend it with different and improved noise models and test noise reduction approaches, additional improvement to MS1/MS2 spectral data generations through incorporating fragmentation spectra prediction methods such as CFM-ID [24,25] or NEIMS [26], as well as retention time predictions from chemical structures [15]. Expanding the capabilities in ViMMS by us or others (all code is open source) in the future will further enhance its utility. The target users of ViMMS are currently algorithmic and LC-MS/MS method developers. ViMMS is available as a Python package that can be accessed from Python scripts and interactive environments such as Jupyter Notebook from where users can point to their own spectral files or compound lists to start using ViMMS on their own data. However, for end-users who are not comfortable with scripting, we aim to build an easy-to-use graphical user interface on top of ViMMS. Finally, we plan to use the proposed framework to develop and evaluate novel model-based fragmentation strategies that produces the highest coverage of MS1 and MS2 fragmentation in real time.

**Supplementary Materials:** A variety of additional text, tables and figures are available online at http://www. mdpi.com/2218-1989/9/10/219/s1.

**Author Contributions:** Conceptualization, S.R.; methodology, J.W., V.D., R.D. and S.R.; software, J.W. and V.D.; validation, J.W., V.D., J.J.J.v.d.H., S.W., R.D. and S.R.; data curation, J.J.J.v.d.H. and S.W.; writing–original draft preparation, J.W. and V.D.; writing–review and editing, J.W., V.D., J.J.J.v.d.H., S.W., R.D. and S.R.; supervision, R.D. and S.R.

**Funding:** EPSRC project EP/R018634/1 on 'Closed-loop data science for complex, computationally and data-intensive analytics'.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** The original 19 *multi-beer* data from [14] are available from GNPS MassIVE MSV000081119 (https://massive.ucsd.edu/ProteoSAFe/dataset.jsp?task=3d3801965ccb4b269a3c8547115c544b), while the original *multi-urine* data from [14] are available from GNPS MassIVE MSV000081118 (https://massive. ucsd.edu/ProteoSAFe/dataset.jsp?task=17813156319b488f9b3351c440ac8d92). The *BeerQCB* data alongside converted mzML files that can be readily used by ViMMS for the *multi-beer* and *multi-urine* data can be found at http://dx.doi.org/10.5525/gla.researchdata.870.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Annotating Nontargeted LC-HRMS/MS Data with Two Complementary Tandem Mass Spectral Libraries**

#### **Herbert Oberacher 1,\*, Vera Reinstadler 1, Marco Kreidl 1, Michael A. Stravs 2, Juliane Hollender 2,3 and Emma L. Schymanski 2,4,\***


Received: 4 December 2018; Accepted: 21 December 2018; Published: 23 December 2018

**Abstract:** Tandem mass spectral databases are indispensable for fast and reliable compound identification in nontargeted analysis with liquid chromatography–high resolution tandem mass spectrometry (LC-HRMS/MS), which is applied to a wide range of scientific fields. While many articles now review and compare spectral libraries, in this manuscript we investigate two high-quality and specialized collections from our respective institutes, recorded on different instruments (quadrupole time-of-flight or QqTOF vs. Orbitrap). The optimal range of collision energies for spectral comparison was evaluated using 233 overlapping compounds between the two libraries, revealing that spectra in the range of CE 20–50 eV on the QqTOF and 30–60 nominal collision energy units on the Orbitrap provided optimal matching results for these libraries. Applications to complex samples from the respective institutes revealed that the libraries, combined with a simple data mining approach to retrieve all spectra with precursor and fragment information, could confirm many validated target identifications and yield several new Level 2a (spectral match) identifications. While the results presented are not surprising in many ways, this article adds new results to the debate on the comparability of Orbitrap and QqTOF data and the application of spectral libraries to yield rapid and high-confidence tentative identifications in complex human and environmental samples.

**Keywords:** nontarget analysis; liquid chromatography mass spectrometry; compound identification; tandem mass spectral library; forensics; wastewater

#### **1. Introduction**

Tandem mass spectral databases are indispensable for fast and reliable compound identification in nontargeted analysis with liquid chromatography–high resolution tandem mass spectrometry (LC-HRMS/MS) [1–7]. These databases have been applied in diverse fields, including forensics, environmental analysis, food analysis, and metabolomics. They are usually applied for target and suspect analysis [8–11], and enable fast and automated annotation of components [12,13]. Database searching can yield identifications at a high confidence level. According to the scheme introduced by Schymanski et al. [14], a Level 2a identification (probable structure via spectral match) can immediately be reached with sufficient match to a library spectrum. Even Level 1 (structure confirmed by a reference compound) can be achieved when the library spectrum and associated retention time (or index) match with data acquired on the same analytical set-up as in the sample. This identification scheme was

designed specifically for HRMS/MS data and is applied in the current manuscript. However, in the context of this article, these levels do not differ markedly from the Metabolomics Standard Initiative levels (MSI) 1 (Identified compounds) and 2 (Putatively identified compounds based upon spectral similarity with spectral libraries) [15].

Tandem mass spectral databases consist of two integral parts: (1) the collection of tandem mass spectral data accompanied by chemical information on the corresponding compounds, and (2) a database management system with diverse search functions. Tandem mass spectra are usually produced by collision-induced dissociation (CID) or higher-energy collision dissociation (HCD). The instruments most commonly applied for the acquisition of reference spectra are quadrupole time-of-flight (QqTOF) and iontrap/quadrupole-Orbitrap. Before storage, spectra are usually curated and cleaned employing multiple steps, which can include some or all of noise and artefact removal, peak annotation and recalibration, testing and benchmarking, as well as expert reviewing [16–21].

A challenge limiting tandem spectral database development has been the variability in observed fragmentation reactions caused by limited standardization and harmonization of experimental conditions. To cope with these reproducibility issues, state-of-the-art libraries contain multiple spectra per compound [17,22–24]. This is usually accomplished by comprehensive coverage of compound-specific breakdown curves via stepwise increase of applied collision energies. Combining these libraries with appropriate tailor-made search algorithms [25–27] enables reliable and robust identification. Such databases are characterized by false positive rates and false negative rates below 5% [3].

Tandem mass spectral libraries are constantly growing. The total number of compounds covered by tandem mass spectral databases is already in the range of several tens of thousands [1,2]. However, the overlap between libraries is still relatively limited [1]. While the results of extensive testing and benchmarking experiments will provide guidance for database selection [20], as has recently been investigated for genome-wide metabolic networks [28], such data is not available for the majority of established databases in an environmental context. A further complication is the fact that databases were established on either single or multiple instruments (i.e., QqTOF and various Orbitrap hybrid instruments). There are a range of scientific opinions on whether Orbitrap databases with HCD (and sometimes CID) spectra and QqTOF databases with CID spectra offer complementary identification possibilities. Initial findings suggest that HCD MS/MS spectra yield acceptable matches in CID mass spectral databases [29]. However, a thorough evaluation of the complementarity of these two important types of tandem mass spectral databases has not been accomplished yet.

Here, we use two specialized collections to investigate the complementarity of QqTOF and Orbitrap libraries, where the Orbitrap library contains both HCD and ion trap CID spectra. First, we investigate the comparability of the spectra in the two libraries, one created on a QqTOF in a forensic-toxicological context, the other a subset of Orbitrap spectra from MassBank compiled in an environmental context. We then use both libraries for mining nontarget Orbitrap and QqTOF data. While more extensive collections are available, we have limited this investigation deliberately to these specialized collections, as both the libraries and nontarget data were generated under relatively consistent conditions at the respective institutes of the coauthors, allowing greater intuitive interpretation of the results beyond other, more extensive collections where this institutional background knowledge is missing.

#### **2. Results and Discussion**

#### *2.1. Testing and Benchmarking of the Tandem Mass Spectral Libraries*

In the first evaluation approach, the performance of the two well-established tandem mass spectral libraries was evaluated. The first collection was the "Wiley Registry of Tandem Mass Spectral Data", hereafter termed WRTMD, developed on QqTOF instruments. The second library was the Eawag collection part of MassBank, developed on Orbitrap instruments (for more details see the "Materials

and Methods" section). Overall, 14,693 QqTOF spectra representing 1349 compound species (i.e., including some compounds with multiple entries due to different precursor ions such as abundant isotopes, adducts, and in-source fragments) and 7415 Orbitrap spectra representing 744 compounds were available. For the QqTOF spectra, fragmentation was accomplished by CID at various collision energies. Out of the entire set of Orbitrap spectra, 321 spectra were acquired with CID at 35 NCE (nominal collision energy units), and 7094 spectra with HCD at various collision energies.

The WRTMD collection of 14,693 CID QqTOF spectra of 1349 compounds has been investigated in multiple studies and the reliability of the search expressed as sensitivity and specificity has been demonstrated [3,16,20,24,25,27,29]. Although the database was tested with spectra acquired on all common types of tandem mass spectral instrumentation, the observed error rate was typically below 5%. This proven track record renders the WRTMD highly suitable for benchmarking experiments.

The Eawag collection of 7415 Orbitrap spectra (321 CID, 7094 HCD), representing 744 compounds, has a proven record of success in application work [8–10,30,31]. As described above, the library spectra are filtered and recalibrated [18]. This level of data curation also renders the Eawag collection suitable for quality tests. The influence of recalibration and cleanup of library spectra on database searching is shown in Table 2 of Stravs et al. [18].

Investigating the overlap of the two libraries tested revealed 233 compounds present in both collections. These 233 compounds represented 17.3% of the WRTMD (2840 QqTOF spectra) and 31.3% of the Eawag collection (2405 Orbitrap spectra).

As described in the "Materials and Methods" section, the 'MSforID Search' was used for spectral matching to obtain *amp-* and *ramp*-values. The thresholds (see "Materials and Methods") are deduced from quality tests and represent a compromise between sensitivity and specificity [20,25].

Firstly, the compatibility of the Eawag collection with the spectral matching via 'MSforID Search' was evaluated. The true positive rate obtained by matching each individual spectrum of the Eawag collection to the entire library was determined. Sensitivity (= true positive rate) was found to be 99.5%. The same test with the WRTMD yielded a sensitivity value of 99.7%. Secondly, each library was used as test set to characterize the reliability of a match to the other collection. For statistical evaluation, libraries were divided into positive and negative controls. By querying the WRTMD with the Orbitrap spectra, sensitivity was 88.0% (2405 test spectra) and specificity (= true negative rate) was 97.7% (5010 test spectra). Querying the QqTOF spectra against the Eawag collection gave a true positive rate of 91.5% (2840 test spectra) and a true negative rate of 98.0% (11,853 test spectra). The observed specificities correlated well to values observed with other test sets [3]. However, the observed sensitivities fell short of expectations. Based on previous results [29], matching Orbitrap spectra to WRTMD was expected to yield a true positive rate closer to 100%.

As a result, the impact of the collision energy on the true positive rate was investigated to determine whether this could cause the reduced sensitivity (see Figure 1). For the majority of compounds, the collision energy ranges 20–50 eV on the QqTOF and 30–60 nominal collision energy units (NCE) on the Orbitrap seem to enable the acquisition of comparable reference and sample spectra. In these cases, substantial overlap between compound-specific spectra acquired on QqTOF and Orbitrap was observed. For the spectra acquired under these conditions, sensitivity values were 95.1–98.4%. For the QqTOF spectra acquired at very low collision energies (5 and 10 eV), sensitivity values fell below 81%. Similarly, for the Orbitrap spectra acquired at collision energies above 90, sensitivity values decreased to 21–61%. The 'MSforID Search' considers the similarity of the sample spectrum to the entire series of compound-specific reference spectra, such that the outcome is not a one-to-one match with a single reference spectrum. Thus, these results of this performance evaluation study indicate that to use these two libraries in a complementary manner in nontargeted LC-MS/MS identification, optimal sensitivity will be achieved for matching to both libraries if the nontarget data is acquired with collision energies in the range of 20–50 eV on a QqTOF or 30–60 NCE on an Orbitrap instrument, which was the case for the application cases presented below.

**Figure 1.** Evaluation of the reliability of a match in the WRTMD and the Eawag library illustrated by plots of sensitivity vs. collision energy applied during spectra acquisition.

#### *2.2. Compound Annotation Workflow for Application Samples*

As discussed above, tandem mass spectral libraries are valuable for mining nontargeted LC-MS/MS data and can rapidly yield either Level 2a (library match) or Level 1 (in-house reference standard match) identifications.

Workflows for mining nontargeted LC-MS/MS data usually involve diverse steps of feature detection, feature annotation, and compound identification. A feature detected by nontargeted LC-MS/MS is characterized by the *m/z* and retention time and, where available, the isotopic pattern of the precursor ion, any additional adduct species, and the corresponding fragmentation pattern. Particularly in environmental analysis and metabolomics, peak picking and extracted ion chromatograms (XICs) often play a key role in data processing. However, the data mining approach used for the plasma and wastewater sample here is different. All features containing information on the *m/z* of the precursor ion and the fragmentation pattern are matched directly to the tandem mass spectral library. This approach is suitable for complex data when searching using tandem mass spectral databases with high sensitivity. It also avoids the loss of matching compounds that may not have been detected by peak picking algorithms.

#### *2.3. Application Work*

#### 2.3.1. Application 1: Systematic Toxicological Analysis of Human Plasma Samples

Forensic toxicology is an important field of application for nontargeted LC-MS/MS [3,5]. Although the WRTMD has a proven record of success in forensic toxicological analysis [3], this library does not cover the full range of compounds principally observable in human samples and should therefore be complemented by other databases. To evaluate the impact of applying multiple libraries for compound annotation, 10 human plasma samples were submitted to systematic toxicological analysis involving nontargeted LC-MS/MS with data-dependant acquisition (DDA) on a QqTOF instrument. Tandem mass spectra were acquired at 35 eV with a collision energy spread of 10 eV. This

CE is well within the working range defined above. The obtained data sets were then matched to the WRTMD and Eawag collections. False positive matches were sorted out by expert reviewing, which involved visual inspection of the spectral match.

In the 10 samples analyzed, a total number of 132 compounds were identified (Figure 2a, Supplementary Table S1). The number of identifications obtained for the individual samples ranged from 41 to 68. In each sample, a considerable number of endogenous compounds were detected. These biomolecules observed included amino acids, biogenic amines, steroids, nucleosides, and vitamins, which are only covered by WRTMD. Several nutritional compounds were observed, including caffeine, nicotine, and piperine, as well as their corresponding metabolites. A third group of observed compounds represented industrial chemicals. While some of these were also detected in the blank and thus may represent impurities and contaminants introduced after sample collection, there were nine compounds that were only observed in patient samples. These included the vulcanization accelerators 2-mercaptobenzothiazole and dibenzothiazyldisulfide, the corrosion inhibitor 2-hydroxybenzothiazole, the cosmetic ingredients ethylparabene, propylparabene, and octocrylene, the plasticiser benzyl butyl phthalate, as well as phenylurea and neocuproine. Detection of these industrial chemicals suggests that nontargeted LC-MS/MS techniques will be an important approach to detect unexpected compounds in human biomonitoring [32]. The fourth group of compounds detected were pharmaceutical compounds and corresponding metabolites. In total, 58 different species were detected. In accordance with previous findings [33], a high number of psychoactive drugs were observed, and these included 12 compounds belonging to the group of benzodiazepines and 8 to the group of opioids. Thirteen antidepressants and six antipsychotics were also identified. The last group of observed compounds represented illegal drugs and corresponding metabolites. Their detection provided evidence for cannabis consumption by four patients, cocaine consumption by six patients, and heroin consumption by one patient. There were three patient samples without any illegal drug detected. Further information about the identified compounds, including chemical identifiers, is given in the Supplementary Materials (Tables S1 and S4).

An important aspect of this study was the evaluation of the number of compounds identified with the two libraries employed in the context of forensic toxicological analysis (Figure 2b). Out of the 570 identifications obtained, 384 (67.4%) were only obtained with the WRTMD, 22 (3.9%) only with the Eawag collection, and 164 (28.8%) with both libraries tested. Obviously for forensic samples, searching the Eawag collection enables verification of a considerable number of matches to the WRTMD, but it only provided a limited number of unique matches. This observation is quite reasonable taking into account that the Eawag library was initially built for environmental applications. The 164 identifications obtained with both libraries corresponded to 39 reference compounds. All other identifications involved compounds that were only included in one of the two libraries applied (85 compounds of the WRTMD and 9 compounds of the Eawag collection).

**Figure 2.** Application of the two tandem mass spectral libraries to systematic toxicological analysis of 10 authentic plasma samples. Nontargeted LC-MS/MS data was acquired on a QqTOF instrument using DDA. (**a**) Overview on the number of compounds identified in different compound classes via the combined use of the two libraries tested, and (**b**) the Venn diagram illustrating the number of identified compounds obtained with the two libraries tested.

2.3.2. Application 2: Comprehensive Compound Identification in Wastewater Influent Samples Collected in a Local Wastewater Treatment Plant (WWTP)

Environmental analysis is another important field of application for nontargeted LC-MS/MS workflows [8–11,30,31]. Particularly in water analysis, the Eawag collection has a proven record of success. Recently, it has been demonstrated that the WRTMD is applicable for that purpose as well [34]. To evaluate the coverage of the two libraries, samples collected at the WWTP Rossau from 1–10 April 2016, were submitted for nontargeted LC-MS/MS analysis with DDA on a QqTOF instrument. Tandem mass spectra were acquired at 35 eV with a collision energy spread of 10 eV. This CE was well within the working range defined above. The obtained data sets were matched to the WRTMD and Eawag collections. False positive matches were sorted out by expert reviewing.

In the 10 influent samples, 149 different compounds were identified (Figure 3a, Supplementary Table S2). Pharmaceutical compounds and their metabolites represented the largest group of compounds detected (*N* = 96). Diverse antipsychotics, anticonvulsants, antidepressants, hypnotics and sedatives, hypoglycaemic agents, anti-inflammatory agents, cardiovascular agents, analgesics, and antibiotics were present. Clearly, wastewater analysis yields a comprehensive overview on medical prescription and consumption practices. Other important classes of compounds observed included biomolecules (*N* = 21) and industrial chemicals (*N* = 16). The groups of nutritional compounds (*N* = 8) and illegal drugs (*N* = 8) provide some insights into lifestyle of the community monitored. It provides evidence for the consumption of caffeine and tobacco, as well as of cocaine, amphetamine, MDMA, and heroine.

**Figure 3.** Application of the two tandem mass spectral libraries to the analysis of wastewater samples collected at the WWTP in Innsbruck. Ten influent samples were analyzed. The nontargeted LC-MS/MS data was acquired on a QqTOF instrument using DDA. (**a**) Overview on the number of compounds identified in different compound classes, as well as (**b**) a Venn diagram characterizing the number of identified compounds obtained with the two libraries tested are provided.

A total of 990 identifications were obtained (Figure 3b) with the two libraries. The WRTMD produced 806 identifications, and the Eawag collection 612 identifications. Of these, 378 identifications (38.2%) were solely obtained by the WRTMD, 184 identifications (18.6%) solely by the Eawag collection, and 428 identifications (43.2%) by both libraries tested. This clearly proves that the two libraries complement each other in wastewater analysis. Thus, for more comprehensive compound identification (at Level 2a), the combined use of the two libraries is recommended.

False negative rates were determined using the 449 identifications corresponding to compounds that were available in both libraries tested. The WRTMD produced 8 (1.8%), and the Eawag collection 13 false negative identifications (2.9%). In the majority of cases, the false negatives matched the corresponding reference compounds but were sorted out during data evaluation based on match probability values below the defined thresholds or during the final expert reviewing. Thus, when using stringent thresholds, the combined use of two or more libraries is recommended. The lower false negative rate for the WRTMD is most likely due to the fact that the acquisition data better matched the original library data.

2.3.3. Application 3: Retrospective Compound Identification in LC-MS/MS Data Acquired from Swiss Wastewater Effluent Samples

The third set of experimental data was selected to evaluate the compatibility of data mining workflow presented here with Orbitrap data. The test sets were obtained from analysing nine Swiss wastewater effluent samples by nontargeted LC-MS/MS with DDA [30]. Tandem mass spectra were acquired at CID 35 and HCD 60. These CE values were within the working range defined above.

In the nine samples analyzed, 82 different compounds were identified (Supplementary Table S3). These included 54 pharmaceutical compounds, 24 industrial compounds, 3 nutritional compounds, and 1 illegal drug. Identifications per sample ranged from 45 to 58, leading to a total number of 458 identifications (Figure 4). Only 7.0% of the identifications were obtained with the WRTMD, 27.5% with the Eawag collection, and 65.5% with both collections. As each institution generally develops their reference standard collection (and thus libraries) for the local conditions and studies of interest, it is not surprising that the Eawag library results in more % identifications for the Swiss data set, and the WRTMD for the Austrian data sets.

**Figure 4.** Application of the two tandem mass spectral libraries to the analysis of wastewater samples collected at the effluent of nine Swiss WWTP. The target and nontargeted LC-MS/MS data was acquired on an Orbitrap instrument using DDA. The column chart visualises the number of identifications obtained with the WRTMD and/or the Eawag library for each sample analyzed.

With the 492 identifications corresponding to compounds that were available in both libraries tested, false negative rates were determined. The WRTMD produced 20 (4.1%) and the Eawag collection 14 false negative identifications (2.8%). As above, in the majority of cases, negatively identified compounds matched to the corresponding reference compounds but were sorted out during data evaluation based on match probability values below the defined thresholds or during the final expert reviewing. The lower false negative rate for the Eawag collection in this case supports the conclusion above that fewer false negatives can be expected when the sample acquisition matches the library acquisition. Nonetheless, it is clear that libraries acquired on different instruments can provide valuable additional information in many cases.

As part of the initial study for this data set, a comprehensive quantitative target analysis was performed [30]. This analysis detected 73 compounds in positive mode. With the retrospective data analysis performed here, 58 of these targets were detected and identified. The identification of the remaining 15 targeted compounds was not reproduced. For six of these false negatives, the tandem mass spectral library search did not produce any evidence for their occurrence in the tested data sets (i.e., no fragmentation information was available). For the remaining eight compounds, at least one match was obtained, but in all cases, the spectral similarity was insufficient for a positive match (Figure 5). The observed discrepancies between the results obtained by target analysis and the suspect screening approach applied here can be explained by the different working principles of the two identification workflows. The suspect screening workflow relies on tandem mass spectral information for identification, such that compounds without fragments will not be detected—six compounds in this case, which were identified with retention time and exact mass and a correspondingly lower "identification point (IP) score" (2 IP vs 4.5 IP for targets with reported matching fragments) in the original study [30]. This means that some low-abundance but well-known compounds will be missed with a spectral library search approach.

The most interesting cases are those that failed due to low spectral similarity values (see Figure 5). This is perhaps not surprising when querying spectra recorded in complex samples, as impurities are likely to occur even in MS/MS fragment information obtained using DDA [12]. This indicates some potential to apply a partial cleanup such as that performed in RMassBank prior to querying spectral libraries. A simple subformula or mass defect filter based on the precursor mass will potentially eliminate several interfering peaks that may correspond with different (coeluting) precursors that are still within the DDA window. Furthermore, this problem could be exacerbated with the increasing popularity of data-independent acquisition data (without precursor isolation and thus potentially more spectral interferences), increasing the need for deconvolution [12] and alternative data-processing approaches (e.g., [35]).

Another interesting result is that the retrospective data analysis performed in this study produced 67 additional tentative identifications corresponding to 24 compounds that were in WRTMD only and thus not obtained in the original investigation (with either target, suspect, or nontarget approaches) [30]. Two of these compounds (O-desmethyltramadol and tri(butoxyethyl)phosphate) were found to be among the thirty most abundant species observed in positive ion mode LC-MS/MS analysis.

Very recently, the Swiss wastewater data sets used here were included in a proof-of-concept study that demonstrated the potential of a global emerging contaminant early warning network to rapidly assess the spatial and temporal distribution of contaminants of emerging concern in environmental samples through performing retrospective data analysis [36]. The data sets were screened for 156 compounds included in the NORMAN Early Warning System (NormaNEWS) suspect list (http://comptox.epa.gov/dashboard/chemical\_lists/normanews). With the data acquired in positive ion mode, 40 compounds were tentatively identified with the NormaNEWS method. For 31 of these compounds, reference spectra were available in the WRTMD and/or the Eawag collection and thus amenable to identification with the tandem mass spectral library search approach applied here. However, out of these 31 compounds detected in the wastewater samples, only 16 were successfully matched to the libraries with the approach used here. In the remaining cases, either

no (*N* = 12) or only low-quality tandem mass spectra (*N* = 3) were available in the data sets (Figure 5), rendering confident compound identification (Level 2a or better) nearly impossible. This reinforces the need for high-quality spectral searching to provide additional evidence to increase the confidence of identification in nontarget screening efforts beyond the levels achieved with exact mass and retention time matches and, where available, selected fragment masses. As discussed above and as shown in Figure 5, the issue of interferences in the spectra extracted from complex samples played a role in the poor-quality spectra in many cases and a future investigation could look into whether spectral cleanup steps may improve these results.

**Figure 5.** Examples of tandem mass spectra obtained from analysing wastewater samples with Orbitrap that showed insufficient spectral similarity to reference spectra of (**a**) tramadol (interfering peaks), (**b**) ibuprofen (noisy sample spectrum), (**c**) N-desmethylvenlafaxine (noise and/or interfering peaks), and (**d**) losartan (noisy spectrum and interfering peaks) stored in the Eawag collection. Black dots indicate precursor mass that triggered the MS/MS spectra (hollow dot).

#### **3. Materials and Methods**

#### *3.1. Tandem Mass Spectral Libraries*

Two libraries were tested: the WRTMD (Wiley, Hoboken, NJ, USA) [37] and the Eawag collection in MassBank [18,23].

Tandem mass spectral data stored in the WRTMD were acquired on QqTOF instruments (Qstar XL or TripleTOF 5600+, Sciex, Framingham, MA, USA). For each reference compound, 10 or more product-ion spectra were acquired at different collision energy levels (resolution >10,000) to comprehensively cover compound-specific breakdown curves. Low-abundance and unspecific signals were removed from reference spectra by filtering [16,17]. For this study, a library version containing 1349 entries with 14,693 spectra was used. A more detailed description of the mass spectral library is provided on www.msforid.com.

The Eawag library used for this study contained 7415 MS/MS spectra corresponding to 744 compounds. Reference spectra of 321 compounds were acquired on a LTQ-Orbitrap XL (Thermo Fisher Scientific, Waltham, MA, USA). For each of these compounds, HCD product-ion spectra were acquired at six different collision energy levels (HCD 15, 30, 45, 60, 75, 90) and a CID spectrum at one collision energy level (CID 35) to comprehensively cover compound-specific breakdown curves. The MS/MS spectra for each collision energy were recorded at two resolutions (7500 and 15,000). Reference spectra for a further 423 compounds were acquired on a QExactive Orbitrap (Thermo Fisher Scientific). For each of these reference compounds, HCD product-ion spectra were acquired at six different collision energy levels (HCD 15, 30, 45, 60, 75, 90). For a subset of 216 compounds, the collision energy range was extended to include HCD product-ion spectra at the collision energy levels 120, 150, and 180. MS/MS spectra were recorded at a resolution of 35,000. In all cases, the R package RMassBank was used to perform recalibration and cleanup of all spectra [18]. RMassBank can be downloaded from BioConductor, at http://bioconductor.org/packages/RMassBank/. The curated spectra (records published prior to 2018) are available at https://github.com/MassBank/MassBankdata/tree/master/Eawag. Listings of the chemicals available in MassBank.EU and WRTMD used in this investigation (beyond those detected and presented in the Supplementary Information) are given on the NORMAN Suspect Exchange (https://www.norman-network.com/?q=node/236) and CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/chemical\_lists).

#### *3.2. Tandem Mass Spectral Library Search*

The library search was accomplished using the 'MSforID Search' [17,25]. The search algorithm determines the similarity between a sample spectrum and library spectra. The estimation of similarity starts with the identification of fragment ions that are present in both of the spectra being compared. These ions are called "matching fragments". The spectral information retrieved is used to calculate the "reference spectrum specific match probability" (*mp*). As the mass spectral libraries contain multiple spectra per reference compound, multiple *mp* values per reference compound are obtained. To combine all these compound-specific *mp* values to one value that specifies the similarity between the unknown and the specific reference compound, the compound-specific *mp* values are averaged and normalized to yield the compound-specific "average match probability" (*amp*) and "relative average match probability" (*ramp*), respectively. These values range between 0 and 100. High compound-specific *amp* and *ramp* values indicate high similarity between the unknown and the reference compound. The substance with the highest *amp* and *ramp* value is considered to be the best match to the unknown compound.

Automated MSforID search was performed with a program written in Pascal using Delphi 6 for Windows (Borland Software Corporation, Scotts Valley, CA, USA; now Embarcadero Technologies, Inc., San Francisco, CA, USA) using the following search parameters: mass-to-charge ratio (*m/z*) tolerance of ±0.01, intensity cut-off factor of 0.01. The following criteria were used to classify obtained search results as tentatively correct positive results: precursor ion mass tolerance of ±0.01, *amp* > 1.0–10.0 and *ramp* > 30–50. The thresholds were determined using quality tests and represent a compromise between sensitivity and specificity [17,25]. The correctness of tentative identifications was checked by expert reviewing, which included visual inspection and comparison of tandem mass spectral data.

#### *3.3. Performance Evaluation*

The performance of the two libraries (WRTMD, Eawag) was evaluated using two approaches.

In the first approach, the libraries were searched against each other. Either library was used as reference or sample set. The spectra of compounds covered in both libraries served as positive controls. All other spectra represented negative controls. The positive controls were further grouped according to the collision energy settings used to acquire the individual spectra. For each test set, the number of

positive identifications and the number of negative identifications were counted and used to calculate the statistical parameters sensitivity (= true positive rate) and specificity (= true negative rate).

The second evaluation approach involved the analysis of forensic casework and environmental samples. Here, the focus was on evaluating the number and type of identifications obtained with the two libraries. The first set of samples analyzed represented 10 plasma samples collected as evidence in forensic casework at the Institute of Legal Medicine of the Medical University Innsbruck. The second set consisted of wastewater samples collected on 10 consecutive days (1–10 April 2016) at the WWTP Rossau (Innsbrucker Kommunalbetriebe AG, Innsbruck, Austria). The wastewater samples represented 24-h average samples of the influent [34]. The two sample sets were submitted to nontargeted LC-MS/MS on a QqTOF instrument (TripleTOF 5600+, Sciex, Framingham, MA, USA). Details of the analytical workflows employed are provided in the Electronic Supplementary File 1. The third set of samples consisted of nine Swiss WWTP effluent samples that had been analyzed by target and nontargeted LC-MS/MS on an Orbitrap instrument (LTQ Orbitrap XL, Thermo Fisher Scientific, Waltham, MA, USA). Details of the analytical workflow have been published previously [30]. Raw data files for the Swiss study are available at ftp://massive.ucsd.edu/MSV000079601. The remaining files cannot be uploaded for legal reasons, but can be made available to interested researchers upon request.

Data mining involved the extraction of the tandem mass spectra and a subsequent database search. Raw data files were converted to Mascot Generic Format (.mgf) files with the MSConvert from ProteoWizard [38]. The MS/MS spectra part of the .mgf files were extracted with a program written in ActivePerl 5.6.1 (Active State Corporation, Vancouver, Canada) to yield all MS/MS spectra as plain text (ASCII) files containing peak list information. These spectra were then submitted to the tandem mass spectral library search as described above.

#### **4. Conclusions**

This article demonstrates the applicability of tandem mass spectral library searching to complex environmental and toxicological samples and reveals a wide range of comparability between collision energies of different tandem mass spectral instruments over a diverse range of compounds. For complementary use of the two libraries tested, the collision energy ranges 20–50 eV on the QqTOF and 30–60 NCE on the Orbitrap represented suitable working ranges. The results of the applications are in many ways unsurprising, that is, that searching in two libraries instead of one reveals more hits and that entries without fragmentation or with poor fragmentation information are not found. However, this article documents additional investigations to add to the debate on the comparability between QqTOF and Orbitrap instruments. This comparability is of utmost importance to achieve the desired goal of developing a unified and universally applicable tandem mass spectral database. Library development is laborious, time-consuming, and expensive, and this enormous effort is a serious hurdle for individual and isolated labs interested in contributing to accomplishing this mammoth task. Compatibility of libraries will enable the building of strong and dynamic consortia within scientific communities that will significantly increase the number of available reference spectra by sharing the connected workload. Further conclusions from this work are that the data mining approach used here could possibly be improved in the future through the application of some basic spectral cleanup to remove clear matrix interferences as well as the consideration of additional information such as isotope patterns/adduct and retention behavior.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2218-1989/9/1/3/s1. Supplementary File 1: Experimental settings for nontargeted LC-MSMS analysis with QqTOF; Supplementary File 2: Excel spreadsheet containing the following tables: Supplementary Table S1: Overview of compounds identified by systematic toxicological analysis of ten authentic plasma samples; Supplementary Table S2: Overview of compounds identified by nontargeted LC-MS/MS in ten wastewater samples collected at the influent of the WWTP in Innsbruck; Supplementary Table S3: Overview of compounds identified by target and nontargeted LC-MS/MS in nine wastewater samples collected at the effluent of Swiss WWTPs; Supplementary Table S4: Additional chemical information for all compounds mentioned in Table S1 to S3.

**Author Contributions:** Conceptualization, H.O. and E.L.S.; Data curation, H.O., V.R., M.K., M.A.S., and E.L.S.; Formal analysis, H.O., V.R., M.K., M.A.S., and E.L.S.; Funding acquisition, H.O. and J.H.; Investigation, H.O. and E.L.S.; Methodology, H.O., V.R., M.K., M.A.S., and E.L.S.; Project administration, H.O. and J.H.; Resources, H.O.; Software, H.O., M.A.S., and E.L.S.; Supervision, H.O. and J.H.; Validation, H.O., V.R., M.K., and E.L.S.; Visualization, H.O.; Writing—original draft, H.O. and E.L.S.; Writing—review & editing, H.O., V.R., M.K., M.A.S., J.H., and E.L.S.

**Funding:** H.O. acknowledges financial support by the HBM4EU project. HBM4EU has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 733032. E.L.S. acknowledges funding from the Luxembourg National Research Fund (FNR) for project 12341006. J.H., and E.L.S. were supported by the European Union Seventh Framework Program project SOLUTIONS under grant agreement number 603437, J.H. and M.A.S. by the Swiss Federal Office for the Environment and by the Swiss National Science Foundation under grant number 205320165935.

**Acknowledgments:** The authors thank Klemens Geiger and Michael Schlapp (Innsbrucker Kommunalbetriebe, Innsbruck, Austria) for providing wastewater samples. The authors gratefully acknowledge colleagues at the Uchem department involved in the Swiss Wastewater study and in acquiring the MassBank spectra, as well as Antony J. Williams (US EPA) for his efforts in mapping WRTMD and MassBank compounds to the CompTox Chemicals Dashboard.

**Conflicts of Interest:** The authors declare that they have no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*
