*Article* **Applying a Chemogeographic Strategy for Natural Product Discovery from the Marine Cyanobacterium** *Moorena bouillonii*

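**Christopher A. Leber <sup>1</sup>, C. Benjamin Naman <sup>1,2</sup>, Lena Keller <sup>1,3</sup>, Jehad Almaliti <sup>1,4</sup>, Eduardo J. E. Caro-Diaz <sup>1,5</sup>, Evgenia Glukhov <sup>1</sup>, Valsamma Joseph <sup>1,6</sup>, T. P. Sajeevan <sup>1,6</sup>, Andres Joshua Reyes <sup>7</sup>, Jason S. Biggs <sup>7</sup>, Te Li <sup>2</sup>, Ye Yuan <sup>2</sup>, Shan He <sup>2</sup>, Xiaojun Yan <sup>2</sup> and William H. Gerwick <sup>1,8,</sup>\***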


Received: 1 September 2020; Accepted: 8 October 2020; Published: 14 October 2020

**Abstract:** The tropical marine cyanobacterium *Moorena bouillonii* occupies a large geographic range across the Indian and Western Tropical Pacific Oceans and is a prolific producer of structurally unique and biologically active natural products. An ensemble of computational approaches, including the newly created ORCA (Objective Relational Comparative Analysis) pipeline for flexible MS<sup>1</sup> feature detection and multivariate analyses, was used to analyze various *M. bouillonii* samples. The observed chemogeographic patterns suggested the production of regionally specific natural products by *M. bouillonii*. Analyzing the drivers of these chemogeographic patterns allowed for the identification, targeted isolation, and structure elucidation of a regionally specific natural product, doscadenamide A (**1**). Analyses of MS<sup>2</sup> fragmentation patterns further revealed this natural product to be part of an extensive family of proposed natural structural analogs annotated herein (doscadenamides B–J, **2**–**10**); the ensemble of structures reflects a combinatorial biosynthesis using nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) components. Compound **1** displayed synergistic in vitro cancer cell cytotoxicity when administered with lipopolysaccharide (LPS). These discoveries illustrate the utility of leveraging chemogeographic patterns for prioritizing natural product discovery efforts.

**Keywords:** *Moorena bouillonii*; marine natural products; chemogeography; metabolomics

#### **1. Introduction**

Natural products discovery programs operate with the general goal of detecting and characterizing chemically unique or biologically active substances. A common obstacle in discovery efforts is the rediscovery of known compounds, suggesting a need for tools and techniques that allow researchers to give priority to samples that possess new or otherwise interesting chemical substances. Various strategies have been employed for the dereplication of known chemicals within samples, and for the prioritization of samples based on chemical composition. In this regard, mass spectrometric analyses, usually in combination with liquid chromatography (e.g., LC-MS), have found great utility in natural products research due to their speed, small sample size requirements, and the large amount of data generated. As a result, a number of approaches and algorithms have been developed to sift through LC-MS data so as to rapidly detect molecules of greater structural novelty and interest.

PoPCAR (Planes of Principal Component Analysis in R) applies principal component analysis (PCA) to a processed bucket table of sample features, selects outlying samples across different PCA planes, and then leverages the PCA feature loadings to identify the features that make the outlying samples unique [1]. IDBac integrates proteomics and metabolomics data captured via MALDI-TOF MS applied to bacterial colonies on agar plates to classify bacterial strains and distinguish between closely related strains [2]. Global Natural Products Social Molecular Networking (GNPS) is a platform that facilitates the sharing of mass spectral data and provides tools for performing MS<sup>2</sup>-based networking analyses [3]. GNPS continues to expand the repertoire of innovative approaches and techniques that it offers, with recent additions including a pipeline for Feature-Based Molecular Networking (FBMN) [4]. FBMN utilizes a processed bucket table of sample MS<sup>1</sup> features in conjunction with MS<sup>2</sup> fragmentation data to produce highly sensitive molecular networks well suited for quantitation and differentiation of isomeric compounds. In addition to these more specific tools, multiple tools are available for the processing and/or statistical analyses of MS-based chemical profile data, including XCMS [5], MZmine [6], and MetaboAnalyst [7]. The GNPS classical molecular networking approach [3] is of particular note. While many approaches are sensitive to sample set heterogeneity and rely on specific or consistent sample preparations and data acquisitions in order to provide appropriate results, the classical molecular networking approach is much more flexible, and its outcomes are more robust to imperfect data. This allows classical molecular networking to be used in analyzing datasets that vary across numerous dimensions (instrument type, chromatographic method, sample preparation, etc.), providing many more opportunities for connecting disparate data sources.
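The general idea of PCA-based outlier selection described above can be illustrated with a minimal sketch. This is not the PoPCAR implementation (which is written in R and examines multiple PCA planes); it is a simplified illustration that assumes a small samples-by-features intensity table, a single plane formed by the first two principal components, and a hypothetical function name `pca_outliers`.

```python
import numpy as np

def pca_outliers(table, n_components=2, n_outliers=1, n_features=3):
    """Return {outlying sample index: [indices of features driving it]}.

    `table` is a hypothetical samples x features bucket table of intensities.
    """
    X = np.asarray(table, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates
    loadings = Vt[:n_components]                      # components x features

    # Samples farthest from the origin of the PC plane are treated as outliers
    # (an assumed, simple criterion).
    dist = np.linalg.norm(scores, axis=1)
    outliers = np.argsort(dist)[::-1][:n_outliers]

    result = {}
    for i in outliers:
        # Per-feature contribution: the sample's scores projected back
        # through the loadings (low-rank reconstruction of the centered row).
        contrib = scores[i] @ loadings
        top = np.argsort(np.abs(contrib))[::-1][:n_features]
        result[int(i)] = [int(j) for j in top]
    return result

# Toy table: sample 4 carries an intense feature (column 2) absent elsewhere.
toy = [
    [10, 5, 0, 1],
    [11, 4, 0, 1],
    [9, 5, 0, 2],
    [10, 6, 0, 1],
    [10, 5, 50, 1],
]
flagged = pca_outliers(toy, n_outliers=1, n_features=1)
```

In this toy table, the sample with the unique intense feature lies far from the others in the PC plane, and the loadings point back to that feature as the reason.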

The cyanobacterial genus *Moorena* (previously *Lyngbya*, then *Moorea*) is a prolific source of biologically active natural products, with biosynthetic gene clusters accounting for 18% of *Moorena* spp. genomes, on average [8–10]. Consistent with this finding, some 70 different isolated and structurally defined compounds have been reported from *M. bouillonii* (Table S1) [11–44]. These display a broad structural diversity, and include peptides [41], cyclodepsipeptides [16], macrolides [12] and glycosidic macrolides [35], and lipids [43]. These compounds are also notable for their biological activities, and include cytotoxins such as bouillonamide [23], lyngbouilloside [35], multiple lyngbyabellins [12,13], and the exquisitely potent apratoxin A [16]. Other *M. bouillonii* compounds have been reported with cannabimimetic properties, such as columbamides A–C [25] and mooreamide A [43], or as modulators of intracellular calcium mobilization, such as alotamide A [14]. *M. bouillonii* has a wide distribution across the tropical Western Pacific and Indian Oceans. However, *M. bouillonii* metabolites have only been described from collections made from a limited number of discrete locations, including Papua New Guinea [14,19,23,25,28,33–35,39,41,43], Guam [11,12,15,16,18,20,22,24,29,36,37,42], Palau [11,18,38,40,44], Malaysia [26,27], Palmyra Atoll [13,21], Fiji [31] (the organism in this manuscript is reported as *M. producens*; however, the manuscript includes a photo of the organism, which displays a morphology characteristic of shrimp-woven *M. bouillonii*, the 16S rRNA gene-based classification was inconclusive, and known compounds previously isolated from *M. bouillonii* were reported), the Red Sea [17] (the organism in this manuscript is reported as *M. producens*; however, the 16S rRNA gene-based classification is inconclusive and known chemistry associated with *M. bouillonii* was reported), and the islands of southern Japan [30,32]. Collections from these diverse geographical regions differ substantially in their composition of metabolites, suggesting that even though many compounds are already known from *M. bouillonii*, comparing samples of different geographical origin could reveal distributional patterns in chemodiversity that would facilitate the identification of new natural products.

Much of the previous work connecting natural products chemistry and geography has focused on the latitudinal herbivory-defense hypothesis (LHDH). The LHDH suggests that tropical species display more developed defense phenotypes (including chemical defenses) than temperate species, due to higher levels of biotic stressors [45–47]. Studies in both terrestrial organisms [45–48] and marine organisms [49–51] lend support to this hypothesis, but many examples counter to LHDH have also been reported, layering the theory with some degree of controversy while also revealing the complexity of drivers that influence chemical defense [52,53]. Orthogonally, it has become a common strategy to look in underexplored geographical locations in order to find new and unique natural products. This has led natural products discovery efforts to interesting and exotic habitats, including tropical coral reefs [11–44], hypersaline lakes [54], the Arctic [55] and Antarctic [56], hydrothermal vents [57], and the deep sea [58]. In spite of the acknowledgement that sampling in new geographical locations can allow access to new natural products, there are few examples of systematically applying geographical knowledge in order to inform natural product discovery. However, in one study the crude extracts and fractions from 300 geographically and taxonomically diverse cyanobacterial and algal collections were profiled by LC-MS/MS [59]. Analyses by GNPS classical molecular networking revealed geographic hotspots for chemodiversity, thus allowing for a molecular feature to be prioritized based on its chemogeographical distribution. In this case, it led to the characterization of a new metabolite given the common name yuvalamide A. Another example study focused on cyanobacteria from one specific genus, analyzing 10 samples of *Symploca* spp. collected at different times and in different places. 
This led to the efficient and targeted discovery of a new sample-specific bioactive natural product, samoamide A [60].
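Prioritizing a molecular feature by its geographic distribution, as in the studies above, can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the workflow of the cited studies or of ORCA: it assumes a hypothetical presence table mapping each sample to the set of feature identifiers detected in it, a hypothetical region label per sample, and an arbitrary `min_fraction` threshold. All sample names and feature identifiers are invented.

```python
from collections import defaultdict

def region_specific_features(presence, regions, min_fraction=0.5):
    """Return {region: sorted features found only in that region}.

    presence: dict mapping sample name -> set of detected feature ids
    regions:  dict mapping sample name -> region label
    A feature is kept if it is absent from every sample outside the region
    and present in at least `min_fraction` of the region's samples.
    """
    by_region = defaultdict(list)
    for sample, region in regions.items():
        by_region[region].append(sample)

    specific = {}
    for region, samples in by_region.items():
        # Everything detected anywhere outside this region.
        elsewhere = set()
        for sample, other_region in regions.items():
            if other_region != region:
                elsewhere |= presence[sample]

        # Count how many samples of this region contain each feature.
        counts = defaultdict(int)
        for sample in samples:
            for feature in presence[sample]:
                counts[feature] += 1

        specific[region] = sorted(
            f for f, c in counts.items()
            if f not in elsewhere and c / len(samples) >= min_fraction
        )
    return specific

# Invented toy data: two regions with two samples each.
presence = {
    "s1": {"f_a", "f_b"},
    "s2": {"f_a", "f_b", "f_c"},
    "s3": {"f_a", "f_d"},
    "s4": {"f_a", "f_d"},
}
regions = {"s1": "region_1", "s2": "region_1", "s3": "region_2", "s4": "region_2"}
prioritized = region_specific_features(presence, regions)
```

Features shared across regions (here `f_a`) are filtered out, leaving candidates whose occurrence is restricted to a single region for follow-up isolation.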

In the present study, we illustrate the value of leveraging geographical patterns in chemodiversity to find previously uncharacterized natural products and apply this strategy to the marine filamentous cyanobacterial species *M. bouillonii*. This is a particularly interesting organism because of its wide geographical range and richness in natural products. To enable analyses and inform current discovery efforts based on legacy data, we developed a flexible data pipeline, the Objective Relational Comparative Analysis (ORCA), for chemical profiles from LC-MS data. Analyses of the LC-MS profiles from geographically disparate chemical extracts of *M. bouillonii*, used in conjunction with GNPS classical molecular networking, allowed for the prioritization of a molecular feature that led to the isolation and characterization of a new compound we called laulauamide (**1**). (The discovery, isolation, and structure elucidation of **1** were presented at the 2017 Annual Meeting of the American Society of Pharmacognosy. The name laulauamide was used for a poster presentation, and the associated abstract can be found under abstract P-219 at the following link [http://asp2017.org/wpcontent/uploads/2016/12/ASP20201720Annual20Meeting\_web.pdf].) Molecular networks along with detailed MS<sup>2</sup> fragmentation analyses revealed the presence of an extensive collection of proposed natural analogs. These display diversification through varied combinations of fatty acid side chains at two locations. Assays for biological activity revealed synergistic cytotoxicity between **1** and lipopolysaccharide (LPS). Late in the performance of this work, a manuscript appeared from another laboratory that reported the isolation and structure elucidation of the main component of this new natural product family, and assigned it the common name "doscadenamide A" [29], a name we retain so as not to create confusion in the literature record.
