**Natural Products from Actinobacteria Associated with Fungus-Growing Termites**

**René Benndorf 1, Huijuan Guo 1, Elisabeth Sommerwerk 1, Christiane Weigel 1, Maria Garcia-Altares 1, Karin Martin 1, Haofu Hu 2, Michelle Küfner 1, Z. Wilhelm de Beer 3, Michael Poulsen 2 and Christine Beemelmanns 1,\***


Received: 13 August 2018; Accepted: 3 September 2018; Published: 13 September 2018

**Abstract:** The chemical analysis of insect-associated Actinobacteria has attracted the interest of natural product chemists in recent years, as bacterially produced metabolites are thought to be crucial for sustaining and protecting the insect host. The objective of our study was to evaluate the phylogeny and bioprospecting potential of Actinobacteria associated with fungus-growing termites. We characterized 97 Actinobacteria from the gut, exoskeleton, and fungus garden (comb) of the fungus-growing termite *Macrotermes natalensis* and used two different bioassays to assess their general antimicrobial activity. We selected two strains for chemical analysis and investigated the culture broth of the axenic strains and fungus-actinobacterium co-cultures. From these studies, we identified the previously reported PKS-derived barceloneic acid A and the PKS-derived rubterolones. Analysis of culture broth yielded a new dichlorinated diketopiperazine derivative and two new tetracyclic lanthipeptides, named rubrominins A and B. The discussed natural products highlight that insect-associated Actinobacteria are highly prolific natural product producers yielding important chemical scaffolds urgently needed for future drug development programs.

**Keywords:** actinobacteria; symbiosis; secondary metabolites; drug discovery; chemical ecology

#### **1. Introduction**

Historically, natural products of microbial origin have been a rich source of drug-like lead structures, and to this day almost 35% of all drugs are based on structures of naturally occurring small molecules [1–3]. Despite this prevalence, natural product chemistry has faced declining enthusiasm and dwindling investments for decades, as bioactivity-guided screening programs resulted mostly in the rediscovery of already known compounds. The low success rates of industrial antibiotic drug discovery programs worldwide have resulted in today's evident lack of new antibiotic drug leads. At the same time, increasing numbers of multiresistant human-pathogenic microbes causing non-treatable infections in clinics are being reported. These imminent health threats led to the recent realization that new natural product-derived scaffolds are urgently needed to combat the life-threatening infections caused by multidrug-resistant pathogens.

The revolutionary developments in genome sequencing and analytical technologies in the last decade have dramatically changed the field of natural product discovery [4]. In particular, ecology-driven natural product discovery approaches, including the chemical analysis of symbiotic microorganisms in combination with omics-based dereplication strategies, have become highly efficient ways to identify new natural products with unique chemical scaffolds and bioactivities [5–8]. Most notably, the analysis of insect-microbe symbioses, and more specifically insect-Actinobacteria interactions, has been the focus of a series of recent natural product discovery studies, as bacterial symbionts are required to communicate with the host or participate in host defense using small molecules. The importance of defensive secondary metabolites in insect-Actinobacteria symbioses is evident, as exemplified by firebugs (Pyrrhocoridae: *Pyrrhocoris apterus*) [9] or the European beewolf (Crabronidae: *Philanthus*), which harbors antibiotic-producing *Streptomyces* in its antennae to help protect wasp larvae from fungal infections [10]. Similarly, fungus-growing ants (*Attini* species) carry symbiotic *Pseudonocardia* that help protect the ants' fungal gardens against specialized parasites [11–13]. Insect-associated Actinobacteria have also been reported from other insects, including ambrosia beetles [14], dung beetles (Scarabaeidae: *Copris tripartitus*) [15,16], and fungus-growing termites [17,18].

We have recently focused efforts on the chemical analyses of the delicate interplay between fungus-growing termites (Termitidae: Macrotermitinae), their fungal mutualist *Termitomyces* (Basidiomycota: Agaricales: Lyophyllaceae), and bacteria residing within termite guts and fungus gardens (fungus combs). Fungus-growing termites cultivate the fungal mutualist in subterranean monoculture fungus gardens as their main food source [19]. The maintenance of such a monoculture in a nutritionally-rich environment is expected to make the fungal garden prone to exploitation by competitors and disease, such as mites, nematodes, and co-occurring fungi. In addition to antimicrobial and behavioral defense mechanisms of the termites themselves [20], it has been hypothesized that bacteria are employed as defensive symbionts [17,21].

Using bacteria-fungus interaction assays, we have demonstrated that Actinobacteria associated with *Macrotermes natalensis* secrete secondary metabolite mixtures that are active against co-occurring fungi and the weed fungus *Pseudoxylaria* sp. (Ascomycota: Xylariales: Xylariaceae) [22]. Subsequent analysis of single species resulted in the isolation of a new geldanamycin derivative, named natalamycin (**4**), from *Streptomyces* sp. M56 (Figure 1) [23]. In a follow-up study, activity-based analysis of *Amycolatopsis* sp. M39 identified several new macrolactams named macrotermycins (macrotermycin A, **5**) [24], and comparative genome and metabolomic analysis of *Streptomyces* sp. M41 yielded, amongst others, the novel depsipeptide dentigerumycin B (**2**) [25]. Recently, activity and NMR-guided analysis of *Streptomyces* sp. RB1 led to the isolation of termisoflavones A–C (termisoflavone A, **1**) [26], and co-cultivation studies of *Actinomadura* sp. RB29 yielded a new group of tropolone derivatives named rubterolones (e.g., rubterolone D, **6**) [27].

**Figure 1.** Phylogenetic placement of Actinobacteria that have previously been reported from fungus-growing termites and isolated natural products: termisoflavone A (**1**) from *Streptomyces* sp. RB1, dentigerumycin B (**2**) from *Streptomyces* sp. M41, actinomycin D (**3**) from *Streptomyces* sp. RB94, natalamycin A (**4**) from *Streptomyces* sp. M56, macrotermycin A (**5**) from *Amycolatopsis* sp. M39, and rubterolone D (**6**) from *Actinomadura* sp. RB29.

Here, we present a comprehensive phylogenetic and bioactivity survey of Actinobacteria associated with the fungus-growing termite *M. natalensis*. In summary, our study shows that bioprospecting for secondary metabolites in Actinobacteria associated with this termite species leads to the identification of novel natural products with unique chemical scaffolds and provides a foundation for gaining a better understanding of the general sanitary role of Actinobacteria in fungus-growing termites.

#### **2. Results**

#### *2.1. Phylogenetic Diversity*

To assess the culturable actinobacterial diversity, we chose three different sample origins (fresh fungus comb material, the termite worker exoskeleton, and gut content) from eleven different *M. natalensis* termite colonies collected in South Africa (Table S1). We focused on the isolation of Actinobacteria capable of living on cellulose or chitin as the sole C-source (Figure 2), as these bacterial isolates are likely adapted to living within the cellulose-rich comb material [28]. Actinobacteria (97) with unique morphotypes were isolated from the termite gut (68), termite abdomen (13), and fungus comb material (16) (Table 1, Table S2). Subsequent phylogenetic analysis of all isolates using 16S rRNA sequencing showed that the characterized isolates did not form a monophyletic group (Figure 3, Figure S1) but were interspersed within the phylum Actinobacteria [17,29]. Interestingly, 73 isolates belonged to the genus *Streptomyces* and covered most of the reported phylogenetic diversity of this widespread genus. The remaining 24 isolates belonged to 12 genera within the Actinobacteria (Table 1). For species delineation, a threshold of <98.65% sequence similarity was applied, revealing seven putative new Actinobacteria species (Figure 4, Table S3) [30,31].
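The species-delineation rule described above can be sketched in a few lines. The following is a minimal illustration, assuming a pairwise alignment is already available; the function names, the gap handling (columns with a gap in either sequence are skipped), and the toy sequences are assumptions for illustration and are not taken from the study, which may have computed similarity differently.

```python
SPECIES_THRESHOLD = 98.65  # percent 16S rRNA gene similarity, as stated in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; columns with a gap in either
    sequence are skipped (an assumption, other conventions exist)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_putative_new_species(similarity_to_closest_type_strain: float) -> bool:
    """Flag an isolate whose similarity to its closest type strain is below the threshold."""
    return similarity_to_closest_type_strain < SPECIES_THRESHOLD


# Hypothetical toy alignment, far shorter than a real 16S rRNA gene.
toy_similarity = percent_identity("ACGTAC-GTA", "ACGTACCGTT")
print(toy_similarity, is_putative_new_species(toy_similarity))
```

In practice the similarity would be computed against the closest type strain over near full-length 16S rRNA gene sequences, so the toy alignment only demonstrates the arithmetic.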

**Figure 2.** (**A**) Royal chamber of *Macrotermes natalensis* containing the queen, the king, and workers. (**B**) Fungus comb with a major and a minor soldier. (**C**) A plate exemplifying the diversity of culturable bacteria that can be isolated from the gut of a fungus-growing termite worker.

| Genus | Family | This study: termite gut | This study: termite exoskeleton | This study: fungus comb | Visser et al. [17]: termite exoskeleton | Visser et al. [17]: fungus comb |
|---|---|---|---|---|---|---|
| *Streptomyces* | Streptomycetaceae | 46 | 14 | 13 | 13 | 2 |
| *Kitasatospora* | Streptomycetaceae | 0 | 0 | 0 | 2 | 0 |
| *Actinomadura* | Thermomonosporaceae | 4 | 1 | 0 | 1 | 0 |
| *Leifsonia* | Microbacteriaceae | 3 | 0 | 0 | 0 | 0 |
| *Curtobacterium* | Microbacteriaceae | 1 | 0 | 0 | 0 | 0 |
| *Arthrobacter* | Micrococcaceae | 0 | 1 | 0 | 0 | 0 |
| *Micromonospora* | Micromonosporaceae | 3 | 0 | 0 | 1 | 1 |
| *Nocardia* | Nocardiaceae | 3 | 0 | 0 | 0 | 0 |
| *Aeromicrobium* | Nocardioidaceae | 1 | 0 | 0 | 0 | 0 |
| *Cellulosimicrobium* | Promicromonosporaceae | 1 | 0 | 0 | 0 | 0 |
| *Mycobacterium* | Mycobacteriaceae | 3 | 0 | 0 | 0 | 0 |
| *Sphaerisporangium* | Streptosporangiaceae | 1 | 0 | 0 | 0 | 0 |
| *Microbispora* | Streptosporangiaceae | 1 | 0 | 0 | 0 | 0 |
| *Luteimicrobium* | family of order *Micrococcales* | 1 | 0 | 0 | 0 | 0 |

**Table 1.** The number of Actinobacteria isolated within this study and by Visser et al. [17], and their origin of isolation by genera and family.
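As a quick consistency check on the genus-level figures quoted in the text (97 isolates, 73 *Streptomyces*, 24 isolates in 12 further genera), the counts from the "this study" columns of Table 1 can be tallied as below. This is only a sketch; the dictionary layout and variable names are illustrative choices, and *Kitasatospora* is omitted because it has no isolates in this study.

```python
# Isolate counts from the "this study" columns of Table 1:
# genus -> (termite gut, termite exoskeleton, fungus comb)
ISOLATES = {
    "Streptomyces": (46, 14, 13),
    "Actinomadura": (4, 1, 0),
    "Leifsonia": (3, 0, 0),
    "Curtobacterium": (1, 0, 0),
    "Arthrobacter": (0, 1, 0),
    "Micromonospora": (3, 0, 0),
    "Nocardia": (3, 0, 0),
    "Aeromicrobium": (1, 0, 0),
    "Cellulosimicrobium": (1, 0, 0),
    "Mycobacterium": (3, 0, 0),
    "Sphaerisporangium": (1, 0, 0),
    "Microbispora": (1, 0, 0),
    "Luteimicrobium": (1, 0, 0),
}

# Sum the three origins to get one total per genus.
genus_totals = {genus: sum(counts) for genus, counts in ISOLATES.items()}

total_isolates = sum(genus_totals.values())
streptomyces = genus_totals["Streptomyces"]
other_genera = {g: n for g, n in genus_totals.items() if g != "Streptomyces" and n > 0}
streptomyces_share = 100 * streptomyces / total_isolates

print(f"total isolates: {total_isolates}")
print(f"Streptomyces: {streptomyces} ({streptomyces_share:.0f}%)")
print(f"other genera: {len(other_genera)} with {sum(other_genera.values())} isolates")
```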

**Figure 3.** Phylogeny and antimicrobial activity of newly isolated Actinobacteria: (**A**) Phylogenetic analysis based on near full-length 16S rRNA sequences of isolated Actinobacteria including phylogenetic placement to the family level. An unrooted neighbor-joining distance tree is shown with branch values indicating bootstrap support (>50 are given) of 1000 pseudoreplicates; the tree was constructed with Mega 7.0 and edited with iTOL v3. Middle: Origin of isolation: termite abdomen: black box, termite gut: brown circle, fungus comb: green star. Right: activity heatmap against test strains *Bacillus subtilis* ATCC 6633 (1), *Staphylococcus aureus* IMET 10760 (2), *Escherichia coli* SG 458 (3), *Pseudomonas aeruginosa* K799/61 (4), *Mycobacterium vaccae* IMET 10670 (5), *Sporobolomyces salmonicolor* SBUG 549 (6), *Candida albicans* BMSY 212 (7) and *Penicillium notatum* JP36 (8). Representative picture of (**B**) fungus comb, (**C**) major worker, (**D**) dissected gut of major worker.

**Figure 4.** Rooted neighbor-joining tree based on near-complete 16S rRNA gene sequences showing relationship between putative new Actinobacteria strains (based on 98.65% similarity threshold) and closest relatives. Stars (\*) indicate branches that were also recovered in maximum-likelihood tree. Only bootstrap values above 50% (based on 1000 pseudoreplicates) are shown. *Rubrobacter xylanophilus* was used as an outgroup. The scale bar indicates 0.02 substitutions per nucleotide position.

#### *2.2. Antimicrobial Activities Against Test Strains*

We first assessed the bioactivities of standardized culture extracts (1 mg/mL) of all 97 isolates against a panel of test strains, including human pathogens (the Gram-positive bacterium *Staphylococcus aureus*, the Gram-negative bacterium *Pseudomonas aeruginosa*, and the fungus *Candida albicans*) (Figure 3, Table S5). While most *Streptomyces* strains produced compounds with antibacterial and antifungal properties, we observed varying intensities of activity. On average, *Streptomyces* extracts (73) inhibited four test strains, but individual strains varied substantially in the number of test strains they suppressed. Several *Streptomyces* extracts showed only antibacterial activity (e.g., *Streptomyces* sp. RB74, *Streptomyces* sp. RB106 and *Streptomyces* sp. RB113), while others exhibited only antifungal activity (e.g., *Streptomyces* sp. RB31). Most notably, *Streptomyces* sp. RB94 and RB100 showed strong antibacterial activity. In contrast, 11 out of 73 *Streptomyces* strains (15%) inhibited none of the tested strains. Isolates belonging to the genera *Actinomadura*, *Sphaerisporangium* and *Micromonospora* inhibited on average three bacterial test strains, and only two strains showed antifungal activity (*Actinomadura* sp. RB66 and *Actinomadura* sp. RB99). In contrast, extracts obtained from isolates belonging to *Arthrobacter*, *Cellulosimicrobium*, *Aeromicrobium*, *Luteimicrobium* and *Mycobacterium* showed almost no inhibitory activity against any of the investigated test strains. Thirty-seven extracts with moderate to strong antifungal activity were subjected to a second antifungal assay against ecologically relevant co-occurring fungi derived from termite nests, four different fungal cultivar isolates (two species) and two entomopathogenic fungi (*Beauveria bassiana* ST 17960 and *Metarhizium anisopliae* ATCC 24942) [32,33]. As depicted in Figure 5, the majority of culture extracts inhibited none of the representative competing or mutualistic fungi. Extracts of four strains (*Streptomyces* sp. RB7, RB72, RB13, and RB31) did, however, inhibit on average ten of the ecologically relevant fungal test strains, including the entomopathogenic fungi and *Termitomyces* (Table S6). Interestingly, *Streptomyces* sp. RB116 and *Streptomyces* sp. RB31 have the same closest type strains (Table S3) in the BLAST search in NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi, last accessed 26 July 2018, 00:58 AM) [34], but show variation in antifungal activity (Figure 5, Table S8).
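Summary figures of this kind (average number of inhibited test strains, share of inactive extracts) follow directly from a binary activity matrix. The sketch below shows the arithmetic under stated assumptions: the extract names and the activity values are hypothetical toy data, not the measurements in Table S5, and the function name is an illustrative choice.

```python
from statistics import mean


def summarize_activity(activity: dict[str, list[bool]]) -> dict[str, float]:
    """Return the mean number of inhibited test strains per extract and the
    percentage of extracts that inhibit none of them."""
    hits = {name: sum(row) for name, row in activity.items()}
    inactive = [name for name, n in hits.items() if n == 0]
    return {
        "mean_inhibited": mean(hits.values()),
        "percent_inactive": 100 * len(inactive) / len(hits),
    }


# Hypothetical toy matrix: one row per extract, one column per test strain (8 strains).
toy_activity = {
    "extract_1": [True, True, True, True, False, False, False, False],
    "extract_2": [False] * 8,
    "extract_3": [True, True, False, False, False, False, False, False],
}
print(summarize_activity(toy_activity))

# The share quoted in the text, 11 inactive out of 73 Streptomyces extracts:
print(f"{100 * 11 / 73:.0f}%")
```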

**Figure 5.** Phylogeny and antifungal activity of isolated Actinobacteria. Left: Phylogenetic analysis based on near full-length 16S rRNA sequences of isolated Actinobacteria. An unrooted maximum-likelihood distance tree is shown with branch values indicating bootstrap support (>50 are given) of 1000 pseudoreplicates; tree was constructed with Mega 7.0 and edited with iTOL v3. Right: Antifungal activity assays of 37 different culture extracts against ecologically relevant fungi: #1: *Cladosporium* sp., #2: *Cladosporium* sp., #4: *Pleosporales* sp., #5: *Fusarium* sp., #8: *Coriolopsis* sp., #10: *Fusarium* sp., #12: *Cunninghamella* sp., #13: *Cladosporium* sp., #15: *Alternaria* sp., #17: *Trichoderma* sp., #22: *Trichoderma* sp., #24: *Hypocrea* sp., T115: *Termitomyces* sp., T112: *Termitomyces* sp., T153: *Termitomyces* sp., P5: *Termitomyces* sp., MA: *Metarhizium anisopliae* ATCC 24942, and BB: *Beauveria bassiana* ST 17960.

#### *2.3. Chemical Analysis*

As gene expression of important biosynthetic clusters is often under the control of promoters that respond to certain external factors, cultivation under standard laboratory conditions is likely to lead to limited secondary metabolite production. To activate so-called "cryptic" gene clusters, it is often necessary to mimic natural key stress factors such as limited nutrient availability or the presence of other potentially competing organisms, as exemplified in a recent study of *Amycolatopsis* sp. M39 [24]. We therefore selected *Actinomadura* sp. RB29 and *Streptomyces* sp. RB108, and subjected both strains to co-cultivation set-ups against co-isolated fungi to stimulate the production of cryptic metabolites.

First, we tested strain RB108, as comparative phenotypical and phylogenetic analyses indicated it to be a novel *Streptomyces* species (Figure S1, Table S3). Although standardized extracts of RB108 showed only moderate antifungal activity (Figure 3, Table S5), co-cultivation induced strong antifungal activity against almost all tested fungal strains (Table S9) and increased production of a brownish pigment. Due to the large inhibition zone, we selected the co-cultivation set-up RB108/*Pleosporales* sp. #4 for in-depth analysis. We performed comparative ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS, Shimadzu, Japan) analysis of concentrated extracts obtained from the zone of inhibition (ZOI). In addition to several other upregulated signals of minor intensity, a distinct UV-detectable metabolite (*m*/*z* = 303.1/321.1) was found to be produced only in co-cultivation with *Pleosporales* sp. #4. Mass spectrometry (MS)-guided high-pressure liquid chromatography (HPLC, Shimadzu, Japan) purification resulted in the isolation of barceloneic acid A (**7**) (Figure 6), a fungal metabolite acting as a farnesyl-protein transferase inhibitor [35]. We then performed Matrix Assisted Laser Desorption Ionization Imaging MS (MALDI Imaging MS, Bruker Daltonics) to resolve the spatial distribution of **7** and to identify possible antifungal candidates from RB108. However, due to the low ionization capacity of this compound class, a clear spatial location of barceloneic acid A (**7**) was not observable. Instead, the detailed analysis of the co-culture assay revealed a cluster of ions with *m/z* values between 2000 and 2500 (*m/z* 2188.15 and *m/z* 2134.71 being the most intense) that were upregulated and accumulated in the center and on the edges of the colonies facing the fungus *Pleosporales* sp. #4 (Figure 7). This *m/z* range is typical for ribosomally synthesized and post-translationally modified peptides (RiPPs), which are often associated with high antimicrobial activities. We currently hypothesize that *Pleosporales* sp. #4 modulates the interaction with strain RB108 using barceloneic acid A (**7**) and stimulates the production of RiPPs of yet unknown composition.

**Figure 6.** Co-cultivation studies of strain RB108 with *Pleosporales* sp. #4: (**A**) negative control: axenic *Pleosporales* sp. #4; (**B**) positive control: *Pleosporales* sp. #4 in the presence of amphotericin B (8 mg/mL, middle); (**C**) interaction assay of *Streptomyces* sp. RB108 (middle) with *Pleosporales* sp. #4 (edge of the plate); (**D**) axenic *Streptomyces* sp. RB108; (**E**) structure of barceloneic acid A (**7**); and representative UHPLC-MS analysis (254 nm) of zone of inhibition extracts: EIC (–) of barceloneic acid A (**7**) at *m/z* 319.0.

**Figure 7.** MALDI imaging of a co-cultivation study of *Streptomyces* sp. RB108 with *Pleosporales* sp. #4: (**A**) average MS spectra of the MALDI Imaging MS analysis (TIC normalization) and red rectangle defines the region of the spectra zoomed in; (**B**) extended view of the region from 2100 to 2300 *m*/*z* showing the upregulated RiPPs; (**C**) photograph of co-culture set up: RB108 grown in the middle and right lower side; *Pleosporales* sp. #4 grown on agar plugs on the right and left edge of the plate. Area covered by the MALDI Imaging MS analysis defined in red; (**D**) visualization of the most intense peak ion *m*/*z* 2188.15 (TIC normalization, weak denoising); and (**E**) visualization of the second most intense peak ion *m*/*z* 2134.71 (TIC normalization, weak denoising).

In a second study, we pursued a comparative UHPLC-MS analysis of *Actinomadura* sp. RB29, as co-cultivation against, for example, *Trichoderma* sp. #22 induced strong antifungal activities (Figure 8). In contrast, no antifungal properties were originally observed in standard culture extracts (Figure 3).

In tracing back the antifungal agents, comparative analysis of culture extracts resulted in the detection of predominant rubterolone derivatives (**6**, **8**, and **9**) [27] and a unique *m/z* signal pattern indicative of a dichlorinated natural product (**11**). Subsequent purification of co-cultivation and liquid culture extracts by semipreparative HPLC led to the isolation of the natural product banegasine (**10**) and the chlorinated natural product cyclo(*N*Me-L-3,5-dichlorotyrosine-Dhb) (**11**) (Table S7) [36].

**Figure 8.** Co-cultivation studies of strain RB29 with *Trichoderma* sp. #22: (**A**) negative control: axenic *Trichoderma* sp. #22; (**B**) positive control: *Trichoderma* sp. #22 in the presence of AmpB; (**C**) co-cultivation of *Actinomadura* sp. RB29 (middle) and *Trichoderma* sp. #22 (edge of the plate); (**D**) axenic culture of *Actinomadura* sp. RB29; (**E**) representative UHPLC-MS analysis (535 nm) of zone of inhibition extracts: EIC (+) of rubterolone A, B, and D at *m*/*z* 414.1, 496.1, and 554.1 and structures of isolated rubterolone A (**8**), B (**9**), and D (**6**); (**F**) production of banegasine (**10**) on ISP2 agar plate and extract with MeOH containing 1% AcOH; and (**G**) production of cyclo(*N*Me-L-3,5-dichlorotyrosine-Dhb) (**11**) in Soya Broth (here represented by the XAD16 80% MeOH eluate).

We also analyzed standard culture extracts using LC-HRMS and detected, in addition to several of the previously reported rubterolone derivatives, a RiPP-type MS<sup>2</sup> pattern for two parent ions at *m*/*z* 957.8 ([M + 2H]<sup>2+</sup>) and 993.5 ([M + 2H]<sup>2+</sup>) within the 70% and 80% MeOH C18-SPE eluates. Purification on Sephadex LH20 resin followed by semipreparative HPLC resulted in the identification of two tetracyclic lanthipeptides, rubrominin A (**12**) and B (**13**).

The molecular formulas of **12** and **13** were established to be C<sub>80</sub>H<sub>115</sub>N<sub>21</sub>O<sub>28</sub>S<sub>3</sub> and C<sub>83</sub>H<sub>120</sub>N<sub>22</sub>O<sub>29</sub>S<sub>3</sub> based on the exact mass analysis of the protonated ions of **12** (*m/z* 1914.73938 [M + H]<sup>+</sup>, calcd. 1914.7, Δ = −3.21 ppm) and **13** (*m/z* 1985.77539 [M + H]<sup>+</sup>, calcd. 1985.77141, Δ = 2.01 ppm). The mass difference of 71.03601 between **12** and **13** suggested an additional alanine residue at the *N*-terminus. The MS<sup>2</sup> spectra were recorded, submitted to Global Natural Product Social Molecular Networking (GNPS), and processed by RiPPquest [37]. The combined analysis led to the identification of a putative candidate peptide ([A]CSSTCTSGPFTFACDGTTKG), which is presumably modified by dehydration and oxidation reactions. However, the estimated *p*-value of the peptide-spectrum matches (PSMs) did not allow an assignment of the modified positions.
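The two quantities used in this argument, the mass error in ppm and the mass difference between the two ions, are simple to reproduce. The sketch below uses the measured and calculated values quoted in the text; the function name is an illustrative choice, and the monoisotopic mass of an alanine residue (C3H5NO, 71.03711 Da) is a standard reference value rather than a number taken from the study.

```python
ALA_RESIDUE = 71.03711  # monoisotopic mass of an alanine residue (C3H5NO) in Da; reference value


def ppm_error(measured: float, calculated: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - calculated) / calculated * 1e6


# Values quoted in the text for the [M + H]+ ions.
mz_12_measured = 1914.73938
mz_13_measured = 1985.77539
mz_13_calculated = 1985.77141

# Mass error of compound 13 and the mass gap between the two compounds.
error_13 = ppm_error(mz_13_measured, mz_13_calculated)
delta = mz_13_measured - mz_12_measured

print(f"mass error of 13: {error_13:.2f} ppm")
print(f"mass difference 13 - 12: {delta:.5f}")
print(f"deviation from one Ala residue: {delta - ALA_RESIDUE:+.5f} Da")
```

With these inputs the error for **13** comes out at roughly 2 ppm, and the mass difference lies within about 0.001 Da of a single alanine residue, which is the basis for assigning the extra residue.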

Genome analysis of *Actinomadura* sp. RB29 using antiSMASH [38] and BLAST resulted in the identification of a cinnamycin-homolog gene cluster [39,40], which we named *rum*. It contains 21 open reading frames (ORFs) and genes homologous to *cinA*, *cinM*, *cinX*, and *cinorf7* from *Streptomyces cinnamoneus* (Figure S6, Table S11). The candidate peptide sequence ([A]CSSTCTSGPFTFACDGTTKG) is located at the *N*-terminus of RumA. Furthermore, a precursor peptide sequence and an AYA (AXA motif) between the *C*-terminus leader sequence and the core region of RumA were identified, both of which are likely to be recognized by type I signal peptidases of the general secretory (sec) pathway. The RumM sequence shows high homology with other class II LanM (lanthipeptide synthetase) enzymes and is likely to catalyze the dehydration of Thr4, Ser7, Thr11, and Thr18 in the precursor peptide and the subsequent addition/cyclization reaction of three Cys residues to form the three methyllanthionine bridges. In addition, RumX, a homolog Nif11 family protein, might hydroxylate Asp15 of the precursor peptide. The Cinorf7 homolog named RumN is presumably involved in the formation of the cross-link between Lys19 and dehydroalanine7 to form a lysinoalanine bridge. We then performed isotope labeling experiments using L-serine-2,3,3-D3 and DL-cysteine-3,3-D2, which showed the incorporation of both serine and cysteine into the core peptide. Subsequent Marfey's derivatization of both compounds revealed the partial amino acid composition of the matured peptide as L-Phe, L-Ala, Gly, L-Ser, L-Thr, and L-Pro. <sup>1</sup>H NMR spectra and COSY correlations of **12** and **13** recorded in D<sub>2</sub>O revealed amino acid-like chemical shifts, and spectra recorded in 10% D<sub>2</sub>O/90% H<sub>2</sub>O revealed the amide chemical shifts from peptide bonds, in particular phenylalanine and alanine spin systems. Due to the dominant rubterolone formation and low production titers of **12** and **13**, full NMR assignment and evaluation of the bioactivities are the topic of current investigations. Based on the acquired genomic data, MS<sup>2</sup> analysis, and Marfey analysis, we confidently propose the tetracyclic structures of rubrominins A (**12**) and B (**13**) as depicted in Figure 9.
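The proposed modifications can be cross-checked against the measured mass of **12** with a short calculation. The sketch below is a worked example under stated assumptions, not the authors' workflow: it uses standard monoisotopic residue masses (reference values, not from the study), treats each dehydration as a loss of H2O and the Asp hydroxylation as a gain of one oxygen, and assumes that the ring-forming additions (thioether and lysinoalanine bridges) do not change the mass. The function name is hypothetical.

```python
# Standard monoisotopic residue masses in Da (reference values, not from the study).
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "T": 101.047679, "C": 103.009185, "D": 115.026943, "K": 128.094963,
    "F": 147.068414,
}
WATER = 18.010565
OXYGEN = 15.994915
PROTON = 1.007276


def modified_peptide_mz(sequence: str, dehydrations: int = 0,
                        hydroxylations: int = 0, charge: int = 1) -> float:
    """m/z of a protonated linear peptide after water losses and oxygen gains.

    Ring closures by addition to dehydro residues are assumed to be mass-neutral.
    """
    neutral = sum(MONO[aa] for aa in sequence) + WATER  # unmodified peptide
    neutral -= dehydrations * WATER                      # Ser/Thr dehydration
    neutral += hydroxylations * OXYGEN                   # e.g. Asp hydroxylation
    return (neutral + charge * PROTON) / charge


CORE = "CSSTCTSGPFTFACDGTTKG"  # candidate core peptide from the text

# Four dehydrations (Thr4, Ser7, Thr11, Thr18) and one hydroxylation (Asp15).
mz_singly = modified_peptide_mz(CORE, dehydrations=4, hydroxylations=1)
mz_doubly = modified_peptide_mz(CORE, dehydrations=4, hydroxylations=1, charge=2)

print(f"[M + H]+   predicted: {mz_singly:.4f}")
print(f"[M + 2H]2+ predicted: {mz_doubly:.4f}")
```

Under these assumptions the predicted [M + H]<sup>+</sup> is about 1914.7455, roughly 3.2 ppm above the measured *m/z* 1914.73938 and in line with the reported deviation of −3.21 ppm, so the proposed set of four dehydrations and one hydroxylation is consistent with the mass of **12**.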

**Figure 9.** Proposed structures of lanthipeptides rubrominin A (**12**) and B (**13**).

#### **3. Discussion**

#### *3.1. Phylogenetic and Ecological Relevance of the Actinobacteria*

Termites forage for a broad range of organic material to manure the mutualistic food fungus, and this substrate is expected to contain a broad diversity of fungal and bacterial isolates. After predigestion of the harvested material during a first gut passage [20], the termites deposit the resulting feces as fresh comb material for fungal growth. However, it is a conundrum that active combs lack any signs of fungal contaminants or diseases. This suggests that the gut and comb environment, including the microbial community, provides effective defenses against any incoming potentially parasitic and competitive species. To test the hypothesis that Actinobacteria have the potential to be a line of defense against invading fungal species, and to identify novel antimicrobial metabolites produced by these bacteria, we investigated the culturable actinobacterial diversity of guts, combs, and exoskeletons of the fungus-growing termite species *M. natalensis*. Here, we acknowledge that, due to the limited number of tested culture conditions, we miss "unculturable" and extremophilic members.

Overall, Actinobacteria were found throughout all samples and colonies of *M. natalensis* with 97 representatives, covering twelve genera, ten families, and two orders of the known actinobacterial diversity [29]. We found a dominance of diverse *Streptomyces* species (75%) and a low abundance of other genera, such as *Actinomadura*, *Microbispora*, *Micromonospora*, and *Nocardia*, consistent with previous findings [17,29,41]. Here, it is interesting to note that *Actinomadura* are frequently isolated from soil [42] and from other social insects such as bee hives [43], similar to members of the genera *Microbispora*, *Micromonospora*, and *Nocardia* [44,45].

Based on the threshold of 98.65% sequence similarity of the 16S rRNA gene for species delimitation, we defined seven isolates as putative novel species (Figure 4), which are currently being investigated for their physiological properties and biosynthetic potential.

The dominant isolation rates of *Streptomyces* from all biological samples and the lack of phylogenetic specificity between Actinobacteria genera and the different biological samples (exoskeleton, gut fluids, and fungus comb), may suggest that Actinobacteria are transient microbes. They are presumably taken up as spores or vegetative mycelium present within digested soil particles and then incorporated as part of the fecal deposits within the comb material; a similar strategy allows the introduction and propagation of the fungal mutualist *Termitomyces.*

In particular, the high isolation rate of Actinobacteria from anaerobic gut fluids was intriguing as previous metagenomic studies indicate that the microbial gut community is dominated by members of Firmicutes, Bacteroidetes, Spirochaetes, Proteobacteria, and Synergistetes, with the most abundant genera being anaerobic or microaerophilic, such as *Alistipes*, *Treponema, Desulfovibrio*, *Paludibacter*, and a member of the Synergistaceae [45]. 16S rRNA sequences affiliated with Actinobacteria appear to account for only a minor component of the gut bacterial microbiota. Similar results were obtained in a related study of the fungus-growing termite *Odontotermes formosanus*, with four phylogenetic groups, Firmicutes, the Bacteroidetes/Chlorobi group, Proteobacteria, and Actinobacteria dominating [46]. Considering the anaerobic or microaerophilic environment of the gut compartments, it is likely that only a small fraction of Actinobacteria are actively growing (Table S10) [47,48], and most of our isolates might originate from germination of spores present within the gut fluids [49].

In contrast to gut microbial communities, taxonomic analyses of the comb microbiota of different termite genera showed a clear shift in microbial composition to a more dynamic microbiota of about 33 different phyla, with Firmicutes, Bacteroidetes, Proteobacteria, and Actinobacteria (47 families of four different classes) as the most abundant phyla [50]. It has been hypothesized that this shift would allow for a second microbe-assisted aerobic decomposition, detoxification of plant substrates, and defense against invading and potentially non-beneficial microbes. In particular, Actinobacteria have a multitude of enzymatic capabilities to break down polysaccharides (cellulose, chitin, xylan, and agar) [29,41,51], and to detoxify microbial metabolites, metabolic capacities that most likely contribute to the optimal growth conditions for *Termitomyces*.

#### *3.2. Bioactivities and Natural Products*

Using two standardized bioassays, we explored the antimicrobial activities of associated Actinobacteria. More than three quarters of the actinobacterial isolates produced compounds with antimicrobial activity against one or more test strains (human-pathogenic), but only four of the generated extracts revealed antifungal activity against fungal garden weeds, in addition to inhibitory activity against the fungal cultivar. Here, we acknowledge the possibility that metabolite secretion of isolated bacterial strains is strongly dependent on the culture environment. We also noticed that even very closely related strains showed a high variability in their activities. In subsequent actinobacterium-fungus co-cultivation studies, we showed that metabolite production was stimulated in the presence of a fungal species and that strains which previously secreted no antifungal metabolites were stimulated and produced compounds with strong antifungal activity. We decided to analyze two co-culture case studies in detail. In the first case study, we analyzed the interaction zone of a co-culture between *Streptomyces* sp. RB108 and *Pleosporales* sp. #4 using LC-MS and MALDI Imaging to visualize the production of potential "cryptic" metabolites. UHPLC-MS-based analysis of the co-culture between *Streptomyces* sp. RB108 and *Pleosporales* sp. #4 revealed that the fungal metabolite barceloneic acid A (**7**), a known farnesyl-protein transferase inhibitor, was strongly upregulated. Although barceloneic acid A did not reveal any antimicrobial activity, it is likely that the molecule modulates the bacterium-fungus interactions using yet unknown mechanisms. We then used a MALDI imaging approach to identify the origin of the antifungal activities and found an increased production of RiPP-like metabolites, which are known for antimicrobial activities; their detailed structural analysis is the topic of ongoing investigations. Upregulation of RiPP-like metabolites having different *m/z* ranges was also observed in other co-cultivation studies, revealing a glimpse into the plethora of metabolites present within complex multipartner interactions.

In a second case study, we analyzed the metabolome of *Actinomadura* sp. RB 29 in more detail, as growth studies on different media and co-culture assay induced strong metabolomic shifts and inducible antifungal activity. Comparative analysis revealed that in addition to the previously reported rubterolones, the natural product banegasine (**10**) and dichlorinated diketopiperazine derivative **11** were produced. Banegasine (**10**) displayed moderate antimicrobial activities against Gram-positive bacteria, including *Mycobacteria*, and is known to potentiate the antimicrobial activity of, for example, pyrrolnitrin [52]. Although the dichlorinated diketopiperazine derivative **11** has been previously reported as part of a screening library, its origin has been undisclosed [36]. In general, diketopiperazine derivatives are common secondary metabolites from bacteria and fungi and the combination of natural and modified amino acids produces diverse structural and bioactivity diversity. The structure of compound **11** is unique as it contains two modified amino acids. First, it contains a 2,3-dehydro-2-aminobutyric acid (Dhb), which presumably originates from the dehydration of threonine [53], and is frequently found in bioactive natural products family like nonribosomal peptides [54] and lanthipeptides [55]. Secondly, it contains a 3,5-dichlorotyrosine moiety, which is a building block of several natural products like chloropeptin [56] from *Streptomyces lavendulae*, and cyclo(13,15-dichloro-L-Pro-L-Tyr) from fungi *Leptoxyphium* sp. In addition, the modified amino acid has been detected in cuticles from several insect species, where they might play important roles in the sclerotization process [57]. A similar diketopiperazine *cyclo*(L-*N*-MePhe-Dhb) was identified from *Streptomyces globisporus*, exhibiting interesting morphogenic and biosynthetic regulator effects [58]. 
Additionally, we identified two ribosomally synthesized and post-translationally modified peptides, named rubrominins A (**12**) and B (**13**). The post-translational modification reactions include the dehydration of Ser residues to dehydroalanine and a cyclization step in which Cys residues add to the dehydrated Ser residues, yielding lanthionine thioether cross-links. The resulting polycyclic peptides, named lanthipeptides, have constrained conformations that often confer their biological activities. Due to the very low production titers of rubrominins A (**12**) and B (**13**) in co-cultures and axenic cultures, their biological activities have not yet been elucidated and are a topic of current investigations.

Overall, the chemical analysis of *Actinomadura* sp. RB29 revealed growth-condition-dependent metabolite production, and although the origin of the antifungal activity within co-cultures remains to be fully elucidated, the identified natural products (rubterolones, banegasine, a chlorinated diketopiperazine derivative **11**, and two lanthipeptides **12** and **13**) exhibit interesting chemical features that are key in many pharmacologically important compound classes.

#### **4. Materials and Methods**

General procedures: NMR measurements were performed on Bruker AVANCE III 500 MHz and 600 MHz spectrometers, equipped with a Bruker Cryoplatform. The chemical shifts are reported in parts per million (ppm) relative to the solvent residual peak of DMSO-*d*<sub>6</sub> (<sup>1</sup>H: 2.50 ppm, quintet; <sup>13</sup>C: 39.52 ppm, heptet). LC-ESI-HRMS measurements were carried out on an Accela UPLC system (Thermo Scientific) coupled with an Accucore C18 column (100 mm × 2.1 mm, particle size 2.6 μm) combined with a Q-Exactive mass spectrometer (Thermo Scientific) equipped with an electrospray ion (ESI) source. UHPLC-MS measurements were performed on a Shimadzu LCMS-2020 system equipped with a single quadrupole mass spectrometer using a Phenomenex Kinetex C18 column (50 mm × 2.1 mm, particle size 1.7 μm, pore diameter 100 Å). The column oven was set to 40 °C; the scan range of the MS was set to *m*/*z* 150–2000 with a scan speed of 10,000 u/s and an event time of 0.25 s in positive and negative mode. The DL temperature was set to 250 °C with an interface temperature of 350 °C and a heat block temperature of 400 °C. The nebulizing gas flow was set to 1.5 L/min and the dry gas flow to 15 L/min. Semipreparative HPLC was performed on a Shimadzu HPLC system using a Phenomenex Luna C18(2) 250 mm × 10 mm column (particle size 5 μm, pore diameter 100 Å). IR spectra were recorded on an FT/IR-4100 ATR spectrometer (JASCO). Optical rotations were recorded in methanol on a P-1020 polarimeter (JASCO). Solid phase extraction was carried out using Chromabond C18ec cartridges filled with 2 g and 10 g of octadecyl-modified silica gel (Macherey-Nagel, Düren, Germany). Open column chromatography was performed on Sephadex LH20 (GE Healthcare, Hamburg, Germany). Chemicals: methanol and acetonitrile, LC-MS grade (VWR International GmbH, Dresden, Germany); water for analytical and preparative HPLC (Millipore, Darmstadt, Germany); formic acid (Carl Roth, Karlsruhe, Germany); media ingredients (Carl Roth, Karlsruhe, Germany).

Sample collections and isolation procedures: Biological material (soldiers, workers, and fungus comb) was collected from eleven *M. natalensis* nests (stored in 50% glycerol) and from one *M. natalensis* and one *Odontotermes* sp. nest for transcriptomic analysis (Table S1) (stored in RNAlater®, Sigma Aldrich, St. Louis, MO, USA). Samples were kept on ice immediately after collection and stored at −80 °C within one day. Frozen termite workers (gut and cuticle) and fresh fungus comb material were used for the isolation of Actinobacteria, and each sample was processed separately. First, termites and fungus comb samples were individually washed with ddH2O (250 μL) and the wash water was collected separately for the subsequent isolation procedure. Then, major termite workers were surface sterilized with 70% ethanol and washed in sterile Ringer solution (7.5 g/L NaCl, 0.35 g/L KCl, 0.21 g/L CaCl2). Termites were dissected using sterile, fine-tipped forceps, and intact guts were immediately removed and stored in 500 μL PBS on ice until further use (5 guts per sample). Dissected guts were crushed using a sterile pestle and a dilution series (up to 10<sup>−6</sup> in PBS) was prepared. Bacteria from each sample were isolated by plating 100 μL of each dilution (10<sup>−4</sup>–10<sup>−6</sup>) on two different selective low-nutrient media: chitin and microcrystalline medium supplemented with 0.05 g/L cycloheximide (Table 2) [17]. Isolates with Actinobacteria-like morphology were transferred to the nutrient-richer medium ISP2 and subcultured. A total of 68 isolates were obtained from the gut compartment, 16 isolates from the termite cuticle, and 13 isolates from the fungus comb (Table S2).


**Table 2.** Media compositions used for initial isolations and subsequent growth assays and large-scale cultivation.

DNA extraction, PCR amplification, pairwise sequence similarities and phylogenetic analysis: Actinobacteria were grown in nutrient-rich ISP2 broth for 5 to 7 days at 30 °C (150 rpm). Cells were harvested, and genomic DNA was extracted using the GenJet Genomic DNA Purification Kit (Thermo Scientific, Waltham, MA, USA, #K0721) following the manufacturer's instructions with slight changes (lysozyme incubation time 40 min, proteinase K treatment 40 min). DNA was quantified spectrophotometrically using a Nanodrop (Thermo Scientific, Waltham, MA, USA). The 16S rRNA gene was amplified using the primer pair 27F/1492R [17]. The amplification reaction was prepared in a 25 μL final volume containing: 7.25 μL dH2O, 5.0 μL HF buffer, 5.0 μL of each primer (2.5 μM), 0.5 μL dNTPs (10 μM), and 0.25 μL Phusion High Fidelity DNA Polymerase (New England Biolabs). PCR was performed under the following conditions: 98 °C for 38 s, followed by 32 cycles of 98 °C for 30 s, 52 °C for 45 s, and 72 °C for 1 min 20 s, and a final extension at 72 °C for 8 min. PCR products were visualized by agarose gel electrophoresis. PCR reactions were purified using the PCR Purification Kit (Thermo Scientific, Waltham, MA, USA, #K0702) and sequenced at GATC (Konstanz).

Sequences were checked for purity and mismatches using BioEdit [59]. Forward and reverse sequences of each strain were assembled with BioEdit and tested for chimeras using DECIPHER [60]. For strains RB9, RB54, RB74, RB85, and RB129, only reverse or forward sequences were generated (Table S3). The resulting sequences were used for a BLASTn search in GenBank using the "refseq\_rna" database [61]. Pairwise sequence similarities were calculated using the method recommended by Meier-Kolthoff [30] for the 16S rRNA gene, as available via the GGDC web server (http://ggdc.dsmz.de/). Sequence similarities were calculated for all strains with the first three hits (Table S3). A phylogenetic analysis was performed with the 16S rRNA sequences (GenBank accession numbers KX344916-KX344918, KY312017-KY312022, KY558669-KY558746, and MH044507-MH044516) and the first hit from the BLASTn search (Figure S1, Table S3). Sequences were aligned with MUSCLE [62] and trimmed using MEGA 7.0.26 [63]. Two different phylogenetic trees were reconstructed with the neighbor-joining [64] and maximum-likelihood algorithms (Figure S1) [65].

The evolutionary distance model of Kimura or Tamura and Nei was used to generate evolutionary distance matrices for the maximum-likelihood and neighbor-joining algorithms [66,67], with deletion of complete gaps and missing data. For the maximum-likelihood algorithm, a discrete Gamma distribution was used (+G) and the rate variation model allowed for some sites to be evolutionarily invariable (+I). For the neighbor-joining algorithm, rate variation among sites was modeled with a gamma distribution. For all constructed trees, the confidence values of nodes were evaluated by bootstrap analysis based on 1000 resamplings [68]. For graphic design, iTOL v3 (https://itol.embl.de/, 31 July 2018) was used with the following settings: leaf sorting = none, branch length = ignore, scaling factors: Hor = 0.3, Vert = 0.8 [69].
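To make the distance step concrete, the following is a minimal sketch of the Kimura two-parameter distance for a single pair of aligned sequences. It is an illustration only, not the MEGA implementation used in this study; the function name `k2p_distance` and the handling of gaps and ambiguous bases (sites are skipped when either sequence lacks an unambiguous base) are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases in either sequence are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # gap or ambiguity code: site is not compared
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Applied to every pair of aligned 16S rRNA sequences, such a function would fill the distance matrix from which a neighbor-joining tree is built; the gamma rate correction mentioned above is not included in this sketch.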

Phylogenetic comparison of strain RB108: The near-complete 16S rRNA sequence (1365 bp, GenBank accession number KY558675) was used for a search in the NCBI database (reference RNA sequences). The first three hits were *Streptomyces pulveraceus* NBRC 3855, *Streptomyces atratus* NRRL B-16927, and *Streptomyces gelaticus* NRRL B-2928 with an Ident value of 99%. All three hits were phenotypically compared using the Wink compendium [70]. Strain RB108 exhibited a different phenotype compared to the above-listed *Streptomyces*. Therefore, the full-length 16S rRNA sequence (1514 bp, GenBank accession number MH828334) was extracted from the genome of strain RB108 and used for comparison. The first three hits of the reference RNA sequence database were *Streptomyces fulvissimus* DSM 40593, *Streptomyces caviscabies* ATCC 51928, and *Streptomyces luridiscabiei* S63. Virtual DDH estimation of strain RB108 and *Streptomyces fulvissimus* DSM 40593 was performed, resulting in a value of 26.20% (23.8–28.7%). According to the DDH threshold of <70%, the two strains can be regarded as distinct [30,31].

Culture extracts: Actinobacteria were cultivated in 25 mL ISP2 or PDB for 4 days at 30 °C and 150 rpm. After 4 days, an additional 25 mL of ISP2 or PDB broth was added and cultivation was continued for another 3 to 5 days. Cultures were centrifuged (6000 rcf, 10 min) and the resulting cell pellets were lysed using MeOH (9 mL). The resulting methanolic cell extracts were combined with the culture supernatant to yield a 20% MeOH culture supernatant. Metabolites from the supernatant were concentrated using an activated (20% MeOH) Chromabond C18ec cartridge filled with 500 mg of octadecyl-modified silica gel (Macherey-Nagel, Düren, Germany) (unless stated otherwise, % MeOH refers to a mixture of MeOH and dH2O). Metabolites were eluted using 100% MeOH (5 mL) and 100% acetone (2 mL), pooled, and concentrated in vacuo. The resulting extracts (E) were adjusted with MeOH to 1 mg/mL and used for bioactivity assays.

Antimicrobial activities against test strains: Antimicrobial assays against *Bacillus subtilis* ATCC 6633, *Staphylococcus aureus* IMET 10760, *Escherichia coli* SG 458, *Pseudomonas aeruginosa* K799/61, *Mycobacterium vaccae* IMET 10670, *Sporobolomyces salmonicolor* SBUG 549, *Candida albicans* BMSY 212, and *Penicillium notatum* JP36 were done using the broth dilution method according to the NCCLS (National Committee for Clinical Laboratory Standards) (Table S5).

Antifungal activity assay against co-isolated fungi: All fungal isolates (Table S4) were cultivated on PDA plates for a maximum of six weeks (23 °C) and subcultured by plating mycelium-containing agar pieces (1 cm × 1 cm) onto fresh PDA. To evaluate antifungal activity, a filter paper disc (d = 6 mm) was soaked with 10 μL extract (1 mg/mL) and dried (sterile air flow). Depending on the growth behavior of each fungus, two different assays were applied. Method A (fast-growing fungi): Mycelium-covered agar pieces were placed in the middle of a PDA plate (standard petri dish 92 mm × 16 mm) and sterile filter paper discs were placed at a distance of 1–2 cm from each agar plug. Method B (slow to medium-fast growing fungi): Fungi were grown in 25 mL PDB for 10 to 14 days at 30 °C (150 rpm) and 500 μL of actively growing culture or a spore solution (*M. anisopliae*) was used to inoculate a PDA plate (standard petri dish 92 mm × 16 mm). Mycelium or spores were distributed on the plates using sterile glass beads. Plates were dried and filter discs soaked with equal amounts (10 μL) of extracts were placed onto the PDA plate. Plates were checked daily and the diameter of the zone of inhibition (ZOI) (no growth, mycelium free) was recorded as a measure of inhibition (Tables S6 and S8). Amphotericin B (8 mg/mL in DMSO) and cycloheximide (50 mg/mL in MeOH) were used as positive controls; 100% MeOH was used as a negative control. All combinations were prepared in duplicate.

Co-cultivation studies: *Streptomyces* sp. RB108 was grown for 14 days at 30 °C in ISP2, and 25 μL of liquid culture was used to inoculate ISP2 and PDA plates centrally. Plates were incubated for 7 days at 30 °C until a clear colony (1 cm in diameter) was apparent. Then, plates were inoculated at the edge of the agar plate with two agar pieces covered with fungal mycelium (Table S9). All combinations were prepared in triplicate. Amphotericin B (8 mg/mL in MeOH) was used as positive control; ddH2O was used as negative control. Plates were incubated for 10 days at room temperature. Plates were checked daily for the formation of a zone of inhibition (ZOI). When a clear, stable ZOI was detectable (normally after 10 days), the ZOI was cut out and extracted with 100% methanol (overnight). Methanol extracts were dried and stored at −20 °C until further use and then subjected to comparative LC-MS and HPLC analysis. Due to significant upregulation of metabolites during co-cultivation, the combination of *Streptomyces* sp. RB108 and fungus #4 was selected for subsequent experiments.

Isolation and structural elucidation of barceloneic acid A (**7**): For isolation of upregulated metabolites, large-scale co-cultivation of *Streptomyces* sp. RB108 and *Pleosporales* sp. #4 was accomplished by inoculation of 30 PDA agar plates (92 mm × 16 mm, 20 mL agar/plate) as described above. Each ZOI was excised from the plate, and the excised pieces were pooled and extracted with MeOH overnight, after which the solvent was removed by evaporation. The resulting crude extract (adjusted to 10% MeOH) was loaded on an activated and equilibrated SPE C18 column and fractionated by step-gradient from 10% MeOH to 100% MeOH (100 mL each). The 50% MeOH eluate was separated by semipreparative HPLC to yield pure compound **7** (*m*/*z* 319.19 [M − H]<sup>−</sup>; 302.95 [M + H − H2O]<sup>+</sup>). The molecular formula of barceloneic acid A (**7**) was assigned as C16H16O7 based on ESI-HRMS (*m*/*z* 321.0964 [M + H]<sup>+</sup>, calcd. 321.0969, Δ = −1.46 ppm) (Figure S11). The <sup>1</sup>H NMR analysis (Table S12) revealed 13 protons as sharp signals, suggesting three additional exchangeable protons that were not observed. Two *meta*-substituted aromatic moieties could be deduced from the <sup>1</sup>H chemical shifts and multiplicities of the protons at δ<sub>H</sub> 6.23 ppm (1H, doublet, *J* = 2.7 Hz, H-11) and δ<sub>H</sub> 6.44 ppm (1H, doublet, *J* = 2.7 Hz, H-13); δ<sub>H</sub> 5.95 ppm (1H, broad singlet, H-4) and δ<sub>H</sub> 6.24 ppm (1H, broad singlet, H-6), respectively. The methyl group at δ<sub>H3-8</sub> 2.07 ppm/δ<sub>C-8</sub> 21.2 ppm was deduced to be attached to the aromatic ring based on the HMBC correlations of H3-8 to C-4/C-5/C-6, H-4 to C-1/C-3/C-5/C-6/C-8, and H-6 to C-1/C-4/C-7/C-8. The oxygenated methyl moiety at δ<sub>H3-16</sub> 3.68 ppm/δ<sub>C-16</sub> 54.9 ppm was suggested to be connected to the second aromatic ring based on the HMBC correlation of H3-16 to C-12. The oxygenated methylene moiety at δ<sub>H-15</sub> 4.51 ppm (2H, doublet, *J* = 1.2 Hz, H-15)/δ<sub>C-15</sub> 58.3 ppm was deduced to be connected to the same aromatic ring based on the HMBC correlations of H2-15 to C-9/C-13/C-14, H-13 to C-12/C-14/C-15, and H-11 to C-9/C-10/C-12/C-13. Finally, the proposed structure of barceloneic acid A (**7**) matched the literature-reported HRMS and 2D NMR data (Table S13) [35]. NMR spectra are shown in Figures S7–S10.
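The molecular formula assignment rests on comparing the measured [M + H]<sup>+</sup> ion with the value calculated from the proposed formula. A minimal sketch of that arithmetic is given below; the helper names are hypothetical, the monoisotopic masses are standard values rounded to eight decimals, and the formula parser assumes simple formulas without brackets.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; rounded standard values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,
}
ELECTRON = 0.00054858  # electron mass (u)


def monoisotopic_mass(formula: str) -> float:
    """Neutral monoisotopic mass of a simple formula such as 'C16H16O7'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def protonated_mz(formula: str) -> float:
    """Calculated m/z of the singly charged [M + H]+ ion."""
    return monoisotopic_mass(formula) + MONOISOTOPIC["H"] - ELECTRON


def ppm_error(observed: float, calculated: float) -> float:
    """Mass deviation in parts per million."""
    return (observed - calculated) / calculated * 1e6


calc = protonated_mz("C16H16O7")   # barceloneic acid A, [M + H]+
error = ppm_error(321.0964, calc)  # measured value as reported in the text
```

With the four-decimal values quoted above, the deviation comes out at roughly −1.5 ppm; the small difference from the reported −1.46 ppm presumably reflects rounding of the published *m*/*z* values.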

MS Imaging: An indium tin oxide (ITO) coated glass slide was used for Imaging MS (Bruker Daltonics, Billerica, MA, USA) and covered with 1 mL of ISP2 and PDA medium, respectively. The dried agar glass slide was then inoculated with 10 μL of a *Streptomyces* sp. RB108 liquid culture (middle of the cover slide) and incubated for 7 days at 30 °C. Two square agar pieces (0.3 cm × 0.3 cm) covered with mycelium of fungus #4 were arranged at a distance of 1 cm on the cover slide (Figure S2). After an incubation period of 7 days at room temperature, the slides were dried for an additional half hour next to an open flame and then sprayed with a saturated solution (7 g/L) of universal MALDI matrix (1:1 mixture of 2,5-dihydroxybenzoic acid and α-cyano-4-hydroxy-cinnamic acid; Sigma Aldrich) in HPLC-grade acetonitrile, using the automatic ImagePrep device 2.0 (Bruker Daltonics, Bremen, Germany). The sample was analyzed in an UltrafleXtreme MALDI TOF/TOF (Bruker Daltonics, Bremen, Germany), which was operated in positive reflector mode using flexControl 3.0 (Bruker, Bremen, Germany). The analysis was performed in the 100–3000 and 400–4000 Da ranges, with 40% laser intensity (laser type 3), accumulating 1000 shots by taking 50 random shots at every raster position. Raster width was set at 150 μm. Spectra were processed with baseline subtraction in flexAnalysis 3.3 (Bruker, Bremen, Germany). Processed spectra were uploaded in flexImaging 3.0 for visualization and SCILS Lab 2015b for analysis and representation. Chemical images were obtained after peak alignment on the dataset using median normalization and weak denoising.

Isolation and structure elucidation of banegasine (**10**): Large-scale cultivation of *Actinomadura* sp. RB29 was performed by inoculation of 100 ISP2 agar plates (standard 150 mm × 20 mm, 40 mL ISP2 agar/plate) at 30 °C for 10 days. Whole agar plates were cut into pieces and extracted twice with 2 L MeOH (1% AcOH) at 4 °C overnight. MeOH/AcOH extracts were filtered and concentrated under reduced pressure. The crude extract was dissolved in 10% MeOH, loaded on an activated and equilibrated SPE C18 column (10 g), and fractionated by step-gradient from 10% MeOH to 100% MeOH (100 mL each). The 30% MeOH eluate was first purified on Sephadex LH20 using 50% MeOH to obtain subfractions Fr.3.1–Fr.3.6 (20 mL/Fr.). The concentrated violet band Fr.3.6 was further separated by semipreparative HPLC to yield pure banegasine (**10**, 2.0 mg, *t*<sub>R</sub> = 6.74 min) using the following gradient: 0–5 min, 50% B; 5–20 min, 50–100% B; 20–25 min, 100% B (A: ddH2O + 0.1% formic acid; B: MeOH) with a flow rate of 2.0 mL/min. The molecular formula of banegasine (**10**) was assigned as C11H12O2N2 based on ESI-HRMS (*m*/*z* 205.0972 [M + H]<sup>+</sup>, calcd. 205.0972, Δ = 0.13 ppm) and λ<sub>max</sub> 217, 278 nm (MeCN/H2O/FA) (Figure S13). The <sup>1</sup>H NMR analysis (Table S14) revealed nine protons as sharp signals, suggesting three additional exchangeable protons that were not observed. The *ortho*-substituted aromatic moiety was deduced from the <sup>1</sup>H chemical shifts and multiplicities of the protons at δ<sub>H</sub> 7.35 ppm (1H, doublet, *J* = 8.1 Hz, H-3); δ<sub>H</sub> 7.06 ppm (1H, triplet, *J* = 7.8 Hz, H-4); δ<sub>H</sub> 6.97 ppm (1H, triplet, *J* = 7.2 Hz, H-5); δ<sub>H</sub> 7.56 ppm (1H, doublet, *J* = 7.9 Hz, H-6).
The dihydropyrrole moiety was deduced from the aliphatic protons at δ<sub>H</sub> 3.48 ppm (1H, doublet of doublets, *J* = 8.6, 3.9 Hz, H-2), δ<sub>H</sub> 2.99 ppm (1H, doublet of doublets, *J* = 15.1, 8.9 Hz, H-3a), and δ<sub>H</sub> 3.31 ppm (1H, doublet of doublets, *J* = 15.1, 3.7 Hz, H-3b), and the olefinic proton at δ<sub>H</sub> 7.22 ppm (1H, doublet, *J* = 1.2 Hz, H-5). The <sup>1</sup>H NMR spectrum of banegasine is shown in Figure S12.

Isolation and structure elucidation of *cyclo*(*N*Me-L-3,5-dichlorotyrosine-Dhb) (**11**): Large-scale liquid cultivation of *Actinomadura* sp. RB29 was performed in a 50 L fermenter (20 L Soya Broth liquid medium, pH 6.8) for 5 days at 28 °C (stirring). The culture supernatant was separated from the biomass using a separator, collected, and loaded onto activated XAD16 resin (1 kg). The resin was first washed with water (2 L) and then eluted with MeOH/H2O mixtures in a step gradient, using 1 L each of 10% MeOH, 30% MeOH, 50% MeOH, 80% MeOH, and 100% MeOH. The corresponding eluates were concentrated under reduced pressure and redissolved in 50% MeOH or 100% MeOH at 5 mg/mL for standard metabolomic LCMS analysis. The ion of interest at *m*/*z* 328.9 ([M+H]<sup>+</sup>) was detected again in the 80% MeOH eluate, which was therefore applied to Sephadex LH20 resin and eluted with 100% MeOH for further purification. Subfractions containing the ion at *m*/*z* 328.9 ([M+H]<sup>+</sup>) were concentrated under reduced pressure and finally purified by semipreparative HPLC (Nucleodur C18 250 mm × 10 mm) to yield compound **11** (2.0 mg, *t*<sub>R</sub> = 13.33 min) using the following gradient: 0–5 min, 30% B; 5–25 min, 30–83% B; 25–30 min, 100% B (A: ddH2O + 0.1% formic acid; B: MeCN) with a flow rate of 2.0 mL/min. The molecular formula of compound **11** was determined as C14H14O3N2Cl2 based on ESI-HRMS analysis (*m*/*z* 329.04520 [M + H]<sup>+</sup>, calcd. 329.04542, Δ = −0.68 ppm) and the isotope abundance of chlorine, and was further confirmed by the observation of 14 carbon atoms in the <sup>13</sup>C NMR and DEPT135 spectra (Figures S15 and S16). Detailed analysis of 1D and 2D NMR spectra (Figures S14, S17–S20) indicated two carbonyl groups, one tetrasubstituted benzyl moiety, one olefinic moiety, one methylene and one methine group, two methyl groups, and one visible OH/NH signal. The presence of two amide groups (δ<sub>C-1</sub> 165.5 ppm/δ<sub>C-9</sub> 159.0 ppm) suggested an amino acid origin.
The methylated olefin moiety was deduced from the <sup>1</sup>H-<sup>1</sup>H COSY correlation of H-11 (δ<sub>H</sub> 5.41 ppm/δ<sub>C</sub> 112.0 ppm) to H3-12 (δ<sub>H</sub> 1.46 ppm/δ<sub>C</sub> 10.7 ppm) and the HMBC correlations of H-11 to C-12, and H3-12 to C-10 and C-11. The further HMBC correlations of H-11 and H3-12 to C-9 (δ<sub>C</sub> 159.0 ppm) and of OH/NH (δ<sub>H</sub> 9.92 ppm) to C-9 and C-10 suggested a dehydroaminobutyric acid (dhAbu or Dhb) moiety. The second spin system C-2–C-3, from the COSY correlation of H-2 to H2-3, together with the aromatic protons (δ<sub>H-5</sub> 6.94 ppm/δ<sub>C-5</sub> 129.7 ppm) and the *N*-methyl group (δ<sub>H3-8</sub> 2.94 ppm/δ<sub>C-8</sub> 32.1 ppm), were deduced to belong to the substituted tyrosine skeleton based on the HMBC correlations of H-2 to C-1/-3/-4/-8, H2-3 to C-1/-2/-4/-5, H-5 to C-2/-3/-6/-7, and H3-8 to C-2. Based on the <sup>13</sup>C chemical shift compared with the literature [70] and the requirements of the elemental composition, one chlorine atom was suggested to be attached to each C-6 carbon, leading to the *N*-methyl 3,5-dichlorotyrosine moiety. The diketopiperazine structure condensed from the Dhb and *N*-methyl 3,5-dichlorotyrosine residues was deduced from the HMBC correlations of *N*H to C-1/-2, H-2 to C-9, and H3-8 to C-9. Furthermore, this conclusion was confirmed by the MS fragmentation of *m*/*z* 329.04520 ([M+H]<sup>+</sup>) (Figure S21) and prediction by Mass Frontier 7.0 (Thermo, Figure S22). The fragment ion at *m*/*z* 301.05118 (C13H15N2O2Cl2<sup>+</sup>) might originate from the diketopiperazine ring opening and loss of a carbonyl group; the ion at *m*/*z* 218.01393 (C9H10NOCl2<sup>+</sup>) might derive from the diketopiperazine ring opening and loss of a carbonyl group and the Dhb moiety; the ion at *m*/*z* 174.97168 (C7H5OCl2<sup>+</sup>) might represent the (3,5-dichloro-4-hydroxyphenyl)methylium ion; and the ion at *m*/*z* 127.08707 (C6H11N2O<sup>+</sup>) might derive from the Dhb moiety and *N*-methyl group with rearrangement (e.g., 1-(2-iminobutanoyl)aziridin-1-ium).
The NOESY correlation of H3-12 to NH suggested the *Z* configuration for the double bond of the Dhb moiety. Finally, the planar structure of **11** was proposed as *cyclo*(dehydroaminobutyric acid-*N*-methyl 3,5-dichlorotyrosine). Based on a comparison of the specific optical rotation of **11** (−104°) with those of reported tyrosine- or phenylalanine-containing diketopiperazines [71,72], the *N*-methyl 3,5-dichlorotyrosine moiety was deduced to be L-configured.
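The "isotope abundance of chlorine" used to support the formula of **11** refers to the characteristic M, M+2, M+4 peak pattern of a dichlorinated ion. The sketch below shows how that pattern follows from a binomial distribution over <sup>35</sup>Cl and <sup>37</sup>Cl; the function name is hypothetical, the abundances are approximate natural values, and contributions from other elements (e.g., <sup>13</sup>C) are deliberately ignored.

```python
from math import comb

# Approximate natural abundances of the two stable chlorine isotopes.
CL35 = 0.7576
CL37 = 0.2424


def chlorine_pattern(n_cl: int) -> list[float]:
    """Relative intensities of M, M+2, ..., M+2n due to chlorine alone.

    Values are normalised to the most intense peak (= 1.0). Isotopes of
    other elements are ignored in this simplified illustration.
    """
    probs = [
        comb(n_cl, k) * CL35 ** (n_cl - k) * CL37 ** k
        for k in range(n_cl + 1)
    ]
    top = max(probs)
    return [p / top for p in probs]


pattern = chlorine_pattern(2)  # two chlorine atoms, as in compound 11
```

For two chlorine atoms this gives relative intensities of roughly 100:64:10 for M, M+2, and M+4, which is the signature expected for the [M + H]<sup>+</sup> cluster around *m*/*z* 329.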

*cyclo*(*N*-Me-L-3,5-dichlorotyrosine-Dhb) (**11**): light yellow solid; [α]<sup>25</sup><sub>D</sub> −104.0 (*c* 0.1 *w*/*v*%, MeOH); UV (MeCN/H2O/FA) λ<sub>max</sub> 230 nm; IR (ATR) ν<sub>max</sub> 3857, 3738, 3650, 2925, 2855, 1745, 1683, 1650, 1539, 1512, 1460, 674 cm<sup>−1</sup>; NMR spectral data, see Table S15; ESI-HRMS [M+H]<sup>+</sup> *m*/*z* 329.04520 (calcd for C14H15O3N2Cl2, 329.04542). A SciFinder search indicated commercial availability from the Aurora Screening Library (CAS 1798281-02-3); however, the natural origin of the compound is unassigned.

Antimicrobial activities of *cyclo*(*N*-Me-L-3,5-dichlorotyrosine-Dhb) against test strains: Antimicrobial assays against *Bacillus subtilis* ATCC 6633, *Staphylococcus aureus* IMET 10760, *Escherichia coli* SG 458, *Pseudomonas aeruginosa* SG137, *Pseudomonas aeruginosa* K799/61, MRSA *Staphylococcus aureus* 134/93, VRSA *Enterococcus faecalis* 1528, *Mycobacterium vaccae* IMET 10670, *Sporobolomyces salmonicolor* SBUG 549, *Candida albicans* BMSY 212, and *Penicillium notatum* JP36 were done using the broth dilution method according to the NCCLS (National Committee for Clinical Laboratory Standards) (Table S7).

Isolation and structure elucidation of rubromidin A (**12**) and B (**13**): Large-scale liquid cultivation was performed as described above. The 70% and 80% MeOH C18-SPE eluates contained ions at *m*/*z* 957.8 ([M+2H]<sup>2+</sup>) and 993.5 ([M+2H]<sup>2+</sup>). These two fractions were concentrated under reduced pressure, pooled, applied to Sephadex LH20 resin, and eluted with 100% MeOH for further purification. The subfractions containing the ions at *m*/*z* 957.8 ([M + 2H]<sup>2+</sup>) and 993.5 ([M + 2H]<sup>2+</sup>) were concentrated under reduced pressure and finally purified by semipreparative HPLC (Nucleodur C18 250 mm × 10 mm) to yield rubromidin A (**12**, 0.5 mg, *t*<sub>R</sub> = 10.02 min) and rubromidin B (**13**, 0.5 mg, *t*<sub>R</sub> = 10.80 min) using the following gradient: 0–20 min, 30% B (A: ddH2O + 0.1% formic acid; B: MeCN) with a flow rate of 2.0 mL/min. ESI-HRMS analysis of purified lanthipeptide **12** revealed the protonated molecular ion at *m*/*z* 1914.73938, as well as the doubly protonated ion at *m*/*z* 957.87408 (Figure S23). The MS<sup>2</sup> spectra at *m*/*z* 1914.73938 and 957.87408 in positive mode were recorded, submitted to GNPS, and processed by RiPPquest, a tandem mass spectrometry database search tool for the identification of microbial RiPPs [73]. The exact mass of the protonated ion of compound **13** was assigned as *m*/*z* 1985.77539, based on the observation of the doubly protonated ion at *m*/*z* 993.39203 (Figure S24). The mass difference of 71.03601 between **12** and **13** suggested an additional alanine residue at the *N*-terminus. Similarly, the MS<sup>2</sup> spectra at *m*/*z* 1985.77539 and 993.39203 were recorded, submitted to GNPS, and processed by RiPPquest, which led to the identification of the candidate peptide (ACSSTCTSGPFTFACDGTTKG), including an additional alanine that might be due to alternative cleavage of RumM.
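The assignment of the 71.03601 mass difference to alanine, and the consistency of the singly and doubly protonated ions, can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the 0.01 Da matching tolerance are assumptions of the example, and the table lists standard monoisotopic residue masses (amino acid minus water).

```python
# Standard monoisotopic residue masses (amino acid minus H2O), in u.
RESIDUE_MASS = {
    "Gly": 57.02146, "Ala": 71.03711, "Ser": 87.03203, "Pro": 97.05276,
    "Val": 99.06841, "Thr": 101.04768, "Cys": 103.00919, "Leu/Ile": 113.08406,
    "Asn": 114.04293, "Asp": 115.02694, "Gln": 128.05858, "Lys": 128.09496,
    "Glu": 129.04259, "Met": 131.04049, "His": 137.05891, "Phe": 147.06841,
    "Arg": 156.10111, "Tyr": 163.06333, "Trp": 186.07931,
}
PROTON = 1.007276  # proton mass (u)


def closest_residue(delta: float, tolerance: float = 0.01):
    """Residue whose mass is nearest to delta, or None if outside tolerance.

    The 0.01 Da tolerance is an arbitrary choice for this illustration.
    """
    name, mass = min(RESIDUE_MASS.items(), key=lambda kv: abs(kv[1] - delta))
    return name if abs(mass - delta) <= tolerance else None


def doubly_protonated_mz(mz_singly: float) -> float:
    """m/z of [M + 2H]2+ calculated from the m/z of [M + H]+."""
    return (mz_singly + PROTON) / 2


delta = 1985.77539 - 1914.73938    # compound 13 minus compound 12
residue = closest_residue(delta)   # best-matching amino acid residue
mz_12 = doubly_protonated_mz(1914.73938)
mz_13 = doubly_protonated_mz(1985.77539)
```

Within the chosen tolerance, the difference matches only the alanine residue, and the calculated doubly charged ions agree with the observed values at *m*/*z* 957.87408 and 993.39203 to within about 0.001.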

Marfey's reaction [74]: Compounds **12** and **13** (0.1 mg each) were hydrolyzed separately in 6 N HCl (1.0 mL) at 110 °C for 15 h. Then HCl was removed using a SpeedVac, and 20 μL FDAA (1-fluoro-2,4-dinitrophenyl-5-L-alanine amide, 10 mg/mL in acetone) and 100 μL NaHCO3 (1 N aqueous solution) were added. The reaction was heated at 80 °C for 10 min and quenched by addition of 50 μL 2 N HCl. L- and D-phenylalanine, L- and D-alanine, glycine, L- and D-serine, L- and D-threonine, and L- and D-proline were derivatized accordingly. After centrifugation for 10 min at 13,000 rpm, the reaction mixture was analyzed by UHPLC-MS (Figures S29 and S30). Five μL of the reaction mixture was injected and analyzed using the following gradient: 0–1 min, 10% B; 1–7 min, 10–100% B; 7.1–10 min, 100% B (A: ddH2O with 0.1% formic acid; B: MeCN with 0.1% formic acid) at a flow rate of 0.7 mL/min.
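For readers reproducing the UHPLC method, the gradient program can be expressed as a piecewise-linear function of time. The following sketch is a hypothetical helper, not instrument software; it assumes linear ramps between the stated breakpoints and treats the short interval between 7 and 7.1 min as a hold at 100% B.

```python
def percent_b(t_min: float, breakpoints: list[tuple[float, float]]) -> float:
    """Solvent B percentage at time t_min for a piecewise-linear gradient.

    breakpoints is a list of (time in min, %B) pairs in ascending time order.
    """
    if not breakpoints[0][0] <= t_min <= breakpoints[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t_min <= t1:
            if t1 == t0:
                return b1
            # Linear interpolation within the current segment.
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("breakpoints must be sorted by time")


# Gradient of the Marfey's analysis: 0-1 min 10% B, 1-7 min 10-100% B,
# then 100% B until 10 min (the 7-7.1 min gap is assumed to be a hold).
MARFEY_GRADIENT = [(0.0, 10.0), (1.0, 10.0), (7.0, 100.0), (10.0, 100.0)]

mid_ramp = percent_b(4.0, MARFEY_GRADIENT)  # halfway through the ramp
```

The same function could describe the semipreparative gradients given for compounds **10** and **11** by supplying their respective breakpoints.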

Ser/Cys labeling: *Actinomadura* sp. RB29 was first incubated in 50 mL ISP2 at 30 °C (150 rpm shaking) for one week. Then, biomass was collected by centrifugation (4000 rpm, 10 min, room temperature), washed twice using autoclaved minimal medium, and transferred into 100 mL minimal medium containing L-serine-2,3,3-D3 (100 mg) and DL-cysteine-3,3-D2 (100 mg), respectively. The cultures were incubated for one week at 30 °C (150 rpm). The culture broth was collected by filtration and extracted with activated HP20 resin (20 g/L) at 4 °C overnight. The resin was first washed with H2O and then eluted with 100% MeOH. The MeOH eluate was concentrated under reduced pressure and resuspended in MeOH for LCMS and ESI-HRMS analysis (Figures S24–S27).

Gene cluster identification: The putative biosynthetic gene cluster of rubromidin (*rum*) was predicted using antiSMASH and compared with the already described lanthipeptide gene cluster *cin* (from *Streptomyces cinnamoneus*) and the putative biosynthetic gene cluster of cinnamycin B from *Actinomadura atramentaria* [39,40,75] (Figure S6, Table S11). The peptide sequence of the lantibiotic cinnamycin precursor (predicted with antiSMASH) was used for a BLASTp search in GenBank using the "refseq protein" or "nonredundant protein sequence" database. A phylogenetic analysis was performed using the lantibiotic cinnamycin precursor peptide sequence (GenBank accession number WP 103565569), the LanM lanthipeptide synthetase (GenBank accession number WP 103565568), and related hits from the BLASTp search. Phylogenetic trees are shown in Figures S3 and S4. A comparative sequence alignment of the precursor peptide sequence is shown in Figure S5. Sequences were aligned with MUSCLE [62]. Two different phylogenetic trees were reconstructed with the neighbor-joining and maximum-likelihood algorithms [63,64]. The evolutionary distance model of Jones, Taylor, and Thornton [76] was used to generate evolutionary distance matrices for the maximum-likelihood and neighbor-joining algorithms with deletion of complete gaps and missing data. For the maximum-likelihood algorithm, a discrete Gamma distribution was used (+G). For the neighbor-joining algorithm, the rate variation among sites was modeled with a gamma distribution. For all constructed trees, the confidence values of nodes were evaluated by bootstrap analysis based on 1000 resamplings.

#### **5. Conclusions**

This study provides an extended taxonomical and chemical analysis of Actinobacteria isolated from the fungus-growing termite *M. natalensis*. Our findings clearly verify that a high diversity of Actinobacteria is associated with these termites, most notably in the termite gut. The high recovery rates from gut fluids suggest that most species survive the relatively short passage and are inoculated together with the fungal mutualists into the fresh fungus comb. Although the high phylogenetic diversity denotes a certain lack of specificity between bacteria and origin of isolation, it simultaneously ensures the presence of diverse chemistry and biochemical capacity, which may be necessary for protection against alien species and the assisted breakdown of complex plant material. It also needs to be noted that, while antibiotics at higher concentrations might be deployed as weapons against competing microbes, they have also been found to elicit changes in global bacterial transcription patterns and metabolism at sub-inhibitory concentrations [77,78]. Therefore, it could be speculated that natural products, including antibiotics, serve as signaling molecules between different microorganisms within the termite symbiosis and contribute to the overall unique stability of the system. Consequently, the structural diversity of metabolites produced within the mutualistic agricultural system might provide a net benefit for the fungal mutualist and farming termites; however, documentation of the individual compounds and of the costs and benefits involved in this potential mutualism is still required.

An intriguing feature of the defensive symbiosis paradigm is its parallel to human medicine, as both deploy mediating or antagonistic molecules to suppress pathogens. Insect defensive symbioses, such as the fungus-growing termite systems, probably offer the clearest window into antibiotic use in nature, and the presence of natural product factories, such as Actinobacteria, represents an untapped treasure trove of novel chemical scaffolds, a fact that is underlined by the impressive number of new natural product scaffolds derived from termite-associated bacteria reported in the last decade. However, key to the discovery of new natural products is the ability to mimic the natural environment and the dynamics present within the system. As exemplified in our two case studies, co-culture assays are a first step towards mimicking natural habitats and identifying key secondary metabolites from both bacteria and fungi. However, more efforts are clearly needed to steadily increase the levels of complexity and to describe the metabolic network within the symbiosis more holistically.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/3/83/s1: Table S1: Colony information of *M. natalensis* for Actinobacteria isolation and metatranscriptomic analysis, Table S2: Actinobacteria IDs of isolates from fungus-growing termites, Table S3: Identities of isolated Actinobacteria strains, Table S4: Identities of ecologically-relevant fungal strains used as targets in the bioactivity tests, Table S5: Antimicrobial assay results of extracts of isolated bacteria against eight medically relevant bacteria and fungi, Table S6: Antifungal assay results of extracts of isolated bacteria against ecologically-relevant fungi, Table S7: Antimicrobial assay results of *cyclo*(*N*-Me-L-3,5-dichlorotyrosine-Dhb) (**11**), Table S8: Representative images of antifungal assay of extract against ecologically relevant fungi, Table S9: Representative interactions between *Streptomyces* sp. RB108 and co-cultivated fungi, Table S10: Presence/absence of transcripts identified in gut microbiome metatranscriptome data from major old worker guts of *M. natalensis* Mn156 and *Odontotermes* sp. Od127, Table S11: Rubromidin biosynthetic protein annotations based on sequence homology, Tables S12–S15: NMR tables, Figure S1: Unrooted Neighbor-joining tree based on near-complete 16S rRNA gene sequences showing relationships between isolated Actinobacteria and closest relatives, Figure S2: MALDI-TOF imaging of co-cultivation of RB108 and *Pleosporales* sp. #4, Figure S3: Neighbor-joining tree based on peptide sequence of precursor peptide sequence (RumA), Figure S4: Neighbor-joining tree based on type 2 lantipeptide synthetase RumM, Figure S5: Comparative sequence alignment of precursor peptide sequence, Figure S6: Comparative gene maps of *rum* and *cin* and proposed cinnamycin B biosynthetic clusters, Figures S7–S30: NMR spectra and LCMS of Marfey's reaction.

**Author Contributions:** R.B., M.P., and C.B. conceived and designed the experiments; R.B., E.S., H.G., C.W., M.G.-A., K.M., H.H., and M.K. performed the experiments and analyzed the data; C.W., Z.W.d.B., and C.H. contributed reagents/materials/analysis tools; and R.B., M.P., and C.B. wrote the paper.

**Funding:** René Benndorf was funded by the International Leibniz Research School for Microbial and Biomolecular Interactions (ILRS) and Jena School for Microbial Communication (JSMC, DFG). Financial support of the Boehringer Ingelheim Foundation, the Daimler Benz foundation, and the German Research Foundation (CRC 1127 (ChemBioSys) and BE-4799/3-1) to Christine Beemelmanns is gratefully acknowledged. This work was performed with financial support from the Villum Kann Rasmussen Foundation Young Investigator Fellowship (VKR10101) to Michael Poulsen.

**Acknowledgments:** We thank the Oerlemans family (Mookgophong) for permission to sample colonies at their farm.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

### **Lysoquinone-TH1, a New Polyphenolic Tridecaketide Produced by Expressing the Lysolipin Minimal PKS II in** *Streptomyces albus*

**Torben Hofeditz 1, Claudia Eva-Maria Unsin 2, Jutta Wiese 3, Johannes F. Imhoff 3, Wolfgang Wohlleben 2,4, Stephanie Grond 1,\* and Tilmann Weber 2,5,\***


Received: 4 May 2018; Accepted: 22 June 2018; Published: 28 June 2018

**Abstract:** The structural repertoire of bioactive naphthacene quinones is expanded by engineering *Streptomyces albus* to express the lysolipin minimal polyketide synthase II (PKS II) genes from *Streptomyces tendae* Tü 4042 (*llpD-F*) with the corresponding cyclase genes *llpCI-CIII*. Fermentation of the recombinant strain revealed the two new polyaromatic tridecaketides lysoquinone-TH1 (**7**, identified) and TH2 (**8**, postulated structure) as engineered congeners of the dodecaketide lysolipin (**1**). The chemical structure of **7**, a benzo[a]naphthacene-8,13-dione, was elucidated by NMR and HR-MS and confirmed by feeding experiments with [1,2-13C2]-labeled acetate. Lysoquinone-TH1 (**7**) is a pentangular polyphenol and one example of the rare extended polyaromatic systems of the benz[a]naphthacene quinone type produced by the expression of a minimal PKS II in combination with cyclases in an artificial system. While the natural product lysolipin (**1**) has antimicrobial activity in the nM range, lysoquinone-TH1 (**7**) showed only minor potency as an inhibitor of Gram-positive microorganisms. The bioactivity profiling of lysoquinone-TH1 (**7**) revealed inhibitory activity towards phosphodiesterase 4 (PDE4), an important target for the treatment of human diseases such as asthma or chronic obstructive pulmonary disease (COPD). These results underline the availability of pentangular polyphenolic structural skeletons from biosynthetic engineering in the search for new chemical entities in drug discovery.

**Keywords:** lysolipin; minimal PKS II; cyclases; benz[a]naphthacene quinone; tridecaketide; aromatic polyketide; pentacyclic angular polyphenol; extended polyketide chain

#### **1. Introduction**

Polyketides are a large family of structurally-diverse natural products, mainly produced by bacteria, fungi, and plants. Their biosynthesis is catalyzed by distinct enzymes, termed type I polyketide synthases (PKS), type II PKS, type III PKS or variants thereof [1]. They exhibit broad ranges of pharmacological properties for use in clinical applications [2].

Many bacterial aromatic polyketides, such as the clinically used tetracycline, are biosynthesized by type II polyketide synthases (PKS II). Each PKS II contains a minimal set of enzymes (minPKS) that is required to synthesize a polyketide chain of defined length, usually primed by acetyl-CoA and extended with malonyl-CoA, which is converted to polycyclic products [3]. A minimal PKS II consists of two β-ketoacyl synthases (KSα and KSβ) and one acyl carrier protein (ACP). KSα is responsible for loading the malonyl-CoA extender onto the PKS II system and also for the iterative Claisen condensations that extend the polyketide chain. KSβ, also referred to as chain length factor (CLF), contributes to controlling the polyketide chain length [4]. Diverse cyclization patterns convert the polyketones to an enormous variety of polycyclic structural skeletons [5].

The final bioactive polycyclic PKS II natural products are aromatic compounds and often arise from additional enzymatic conversions, among them cyclases (CYC) [6], reductases (KR) [7], oxidases [8] and decarboxylating enzymes [2,9] or amidotransferases [10] for insertion of nitrogen or glycosyl transferases [11].

Lysolipin I (**1**) from *Streptomyces* Tü 4042 [12] and other microbial aromatic polyketides, such as pradimicin A (**2**) [13], fredericamycin A (**3**) [14], benastatin A (**4**) [15], bequinostatin C (**4a**) [16] and xantholipin (**5**) [17], are among the largest type II PKS products that have been described. They differ from the important groups of the smaller angucyclinones and anthracyclines (types A and B, respectively, in Figure 1). They have distinct pentangular polycyclic aromatic core structures with pyridone, piperidone or lactone rings, respectively, added to a hexacyclic core structure as in xantholipin (**5**), lysolipin (**1**) or fredericamycins (**3**, **3a**) [18] (Figure 1). This additional ring F varies in δ-position with different substituents next to the amide. Several lysolipin derivatives have been described. While lysolipin I (**1**) carries a methoxy substituent at C-24, derivatives of lysolipin have been engineered which have a methyl group at this position (Patent WO2007079715 (A3), Combinature Biopharm, Berlin, Germany).

**Figure 1.** Chemical structures of PKS II products (**1**–**6**) and examples of biosynthetic congeners (**3a**, **b**, **4a**). PKS chains (acetate units bold) of angucyclinone (**A**) and anthracycline (**B**) type structures start to cyclize with ring A at C-7/C-12. Benz[a]naphthacene (**C**) and benz[a]naphthacene quinone (**D**) type structures cyclize starting with C-9/C-14 according to polyketide chain numbering (green). Variations of δ-substituents of ring-F highlighted in red. R = H or alkyl-, allylic carbon chains (usual chemical nomenclature in blue).

Several biosynthetic gene clusters of these large PKS-II antibiotics are known and have been subject to extensive genetic and biochemical characterization [18–23]. In the biosynthetic pathways of pentangular polyaromatic polyketides, e.g., **1**–**5**, the respective cyclized polyphenols (**3b**) and polyphenolic quinones (**3a**) have been demonstrated as pathway intermediates; they have also been obtained as products of genetic engineering efforts (Figure 2). Therefore, we regard the class of pentangular quinones as the metabolic hub of the important aromatic polyketide products.

**Figure 2.** Chemical structure of lysoquinone-TH1 (**7**, identified), lysoquinone-TH2 (**8**, proposed), the constitutional isomer sapurimycin (**11**) and structurally related pentangular polyketides **9**–**13**.

Here, we report that the heterologous expression of the lysolipin minimal PKS II genes, which comprise *llpF*, coding for the ketosynthase α (KSα), *llpE*, coding for the ketosynthase β (KSβ), and *llpD*, coding for the ACP, in combination with genes encoding the cyclases (*llpCI-CIII*) in the host *S. albus* J1074 resulted in the production of novel metabolites (Figure 2). The structure elucidation of the novel bioactive lysoquinone-TH1 (**7**), strong evidence for its tridecaketide backbone from the doubly labeled [1,2-13C2]-acetate-feeding experiments, and its activity as a potent phosphodiesterase inhibitor are discussed.

#### **2. Results and Discussion**

#### *2.1. Heterologous Production of Lysoquinone-TH1 (7) in S. albus*

The lysolipin gene cluster has been identified on a 42-kb genomic region in *Streptomyces tendae* Tü 4042 and analyzed by sequence comparison and heterologous expression in *S. albus* [21]. For this study, the genes coding for the minimal PKS II (*llpD-F*) and cyclases (*llpCI-CIII*) were amplified by PCR from the cosmid 4H04 encoding the complete lysolipin (**1**) gene cluster. The 4.1-kb PCR fragment was cloned into the vector pSET152*ermE\*p* [24] under control of the constitutive promoter *ermE\*p* yielding plasmid pCU1 (Figure S1). pCU1 was introduced into *S. albus* J1074 by intergeneric conjugation and the minimal PKS II and cyclase genes were heterologously expressed in culture. The recombinant strain expressing the minimal PKS and cyclases changed the color of the nutrient medium (R5) and of colonies on plates from yellowish to dark brown. In comparison, an *S. albus* strain containing a pSET152*ermE\*p* plasmid without insert did not show this phenotype.

Thus, this strain gained the ability to produce new substances. Thin layer chromatography (TLC) analysis revealed a strong red fraction that proved not to be identical to lysolipin (**1**).

#### *2.2. Isolation and Structure Elucidation of Lysoquinone-TH1 (7)*

Starting from the newly observed colored product, we developed a work-up procedure towards the isolation of the pure compound for structure elucidation and biological profiling from a four liter fermentation broth in M65 medium. Acetone and methanol extractions removed water soluble and other unwanted compounds. Filtration with RP-18 material preceded the preparative HPLC purification using a C4 column and H2O (Ammonium formate, TFA): acetonitrile as solvents (see Supplemental Materials). Thereby, 3.4 mg of purified red compound (lysoquinone-TH1, **7**) were obtained from a four-liter fermentation broth from medium supplemented with sterile adsorber resin XAD-16 and subsequently characterized by TLC, LC-mass spectrometry (MS) and NMR-analysis (Figures S2–S5).

HR-ESI-MS data for the red metabolite **7** revealed *m*/*z* = 461.087756 [M−H]<sup>−</sup> (Δppm = 0.11 ppm) and suggested the molecular formula of **7** to be C25H18O9 (MR = 462.4). Consistently, low-resolution ESI-MS monitoring of the crude extracts exhibited *m*/*z* = 461.1 [M−H]<sup>−</sup> and 463.2 [M+H]<sup>+</sup> ions, with stronger ionization in the negative mode. Thus, the sum formula of compound **7** implied 17 degrees of unsaturation and a large aromatic ring system.

The LC-MS/MS fragmentation analysis of lysoquinone-TH1 (**7**) gave evidence for a mass difference of *m*/*z* 58 with *m*/*z* = 403.1 [M−C3H6O-H]<sup>−</sup> as the only fragmentation product, which was assigned to a McLafferty rearrangement and a neutral loss of acetone (Figure S2). These results indicated a stable substance in MS-fragmentation.
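The mass arithmetic in the two preceding paragraphs can be reproduced with a few lines of code. The following is a minimal sketch and not part of the original analysis: it assumes standard monoisotopic atomic masses, and all function names are illustrative. It recomputes the theoretical [M−H]<sup>−</sup> value for C25H18O9, the deviation of the reported *m*/*z* in ppm, the degrees of unsaturation, and the fragment expected after a neutral loss of acetone.

```python
# Monoisotopic masses (u) of the lightest stable isotopes (standard reference values).
MONO = {"C": 12.0, "H": 1.00782503223, "O": 15.99491461957}
PROTON = 1.00727646688  # mass of a proton, used for [M-H]- ions


def monoisotopic_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def deprotonated_mz(formula):
    """Theoretical m/z of the singly charged [M-H]- ion."""
    return monoisotopic_mass(formula) - PROTON


def ppm_error(observed, theoretical):
    """Signed mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def degrees_of_unsaturation(formula):
    """Rings plus double bonds for a CHO formula: (2C + 2 - H) / 2."""
    return (2 * formula.get("C", 0) + 2 - formula.get("H", 0)) // 2


LYSOQUINONE_TH1 = {"C": 25, "H": 18, "O": 9}
ACETONE = {"C": 3, "H": 6, "O": 1}

mz_th1 = deprotonated_mz(LYSOQUINONE_TH1)
fragment_mz = mz_th1 - monoisotopic_mass(ACETONE)  # [M - C3H6O - H]-

print(f"[M-H]- theoretical: {mz_th1:.6f}")
print(f"deviation of 461.087756: {ppm_error(461.087756, mz_th1):+.2f} ppm")
print(f"degrees of unsaturation: {degrees_of_unsaturation(LYSOQUINONE_TH1)}")
print(f"fragment after acetone loss: {fragment_mz:.4f}")
```

Under these assumptions the computed values agree with the reported ones: the deviation is about 0.11 ppm in magnitude, the formula gives 17 degrees of unsaturation, and the acetone-loss fragment lies close to the *m*/*z* 403.1 observed at ion-trap resolution.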

One- and two-dimensional NMR data (1H-NMR, 13C-NMR, HMBC, HSQC, COSY) yielded ten proton signals: one aliphatic methyl group, one methylene group, four protons with methylene character and four aromatic protons (Figures S3–S5). The carbon-to-proton correlation (HSQC experiments) pointed to two isolated methylene groups with diastereotopic protons. These doublet protons (δ<sub>H</sub> = 2.67, 2.91 and δ<sub>H</sub> = 3.01, 3.25 ppm) couple only within their respective methylene group (J = 15.9 Hz), which suggests a ring structure. Furthermore, a methyl group singlet (δ<sub>H</sub> = 2.18 ppm) next to an aliphatic carbonyl group (δ<sub>C</sub> = 207.8 ppm) implied no other direct substituents. However, this carbonyl group is adjacent to a methylene group (δ<sub>H</sub> = 2.70, δ<sub>C</sub> = 53.3 ppm), attached to a quaternary carbon (δ<sub>C</sub> = 71.0 ppm) as part of the aliphatic ring system (C-1–C-4, C-4a, C-14b). The proton at δ<sub>H</sub> = 9.48 ppm indicated an aromatic proton with an exceptional low-field shift caused by two carbonyl groups (C-1, C-13) in spatial proximity. Consistent with the recorded 13C-NMR and 1H-NMR data, the HMBC experiment delivered unambiguously assigned key correlations due to well-separated signals, which established the connectivity of the whole scaffold of lysoquinone-TH1 (**7**) and a full structural assignment (Table S1) to a pentangular polyphenolic core (Figure S4). Lysoquinone-TH1 (**7**) is a 3-(2-oxo-propyl)-decorated dihydrobenz[a]naphthacene-8,13-quinone, a novel compound to the best of our knowledge.

Additionally, the purification protocol revealed an additional violet fraction, evidently containing a second compound (proposed as lysoquinone-TH2, **8**) not identical to lysolipin (**1**). However, only minute amounts were observed in production cultures, and presumably a distinct instability did not allow for purification and full structure elucidation. HR-ESI-MS data from extract fractions of the violet metabolite **8** revealed *m*/*z* = 487.0670 [M−H]<sup>−</sup> (Δppm = 0.1 ppm) and suggested the molecular formula of **8** to be C26H16O10 (MR = 488.4). In comparison to **7**, 19 instead of 17 degrees of unsaturation were calculated, so a large aromatic ring system is also concluded. Isolation of **8** for NMR studies was not achieved. Thus, the analytical MS/MS and the UV data of lysoquinone-TH1 (**7**), in addition to the knowledge of the mentioned McLafferty rearrangement, point to structure **8**. This proposed compound was named lysoquinone-TH2 (**8**) and resembles the C-2 carboxyl analogue of **7** as a yet unknown structure.

In comparison to **7**, the only known constitutional isomer sapurimycin (**11**, C25H18O9) [25] is an annulated tetracyclic ring system. The rare moiety of a reduced ring E of an angular naphthacene quinone as in **7** is only known from metabolite **9**, generated via CRISPR-Cas9 technology with *S. viridochromogenes* [26]. JX111a (**10a**), JX111b (**10b**), and further precursors of pradimicin A (**2**) [27], KS-619-1 (**12**) [28], and frankiamicin (**13**) from *Frankia* [29] display a similar structural skeleton to lysoquinone-TH1 (**7**) and to the proposed structure lysoquinone-TH2 (**8**) (Figure 2). It could be anticipated that the benz[a]naphthacene carbon skeleton of the lysoquinones **7** and **8** originates from a native tridecaketide polyketide chain from the minimal PKS, while the parent natural product of strain *S.* Tü4042, lysolipin I (**1**), has a dodecaketide backbone.

#### *2.3. Feeding Experiment with [1,2-13C2]-Labeled Acetate*

For validating the biogenesis of lysoquinone-TH1 (**7**) a feeding experiment with the doubly 13C-labeled [1,2-13C2] acetate was carried out, and lysoquinone-TH1 (**7**) was purified from the culture and subjected to NMR spectroscopy. All carbon atoms turned out to be enriched [30] with highly specific incorporation rates between 2.0 and 7.3 (Figure 3, Figure S4 and S6, Table S1). The specific coupling constants from NMR analysis again confirmed the structure of lysoquinone-TH1 (**7**) and suggest a polyketide origin assembled from 13 acetate extender units (tridecaketide, Figure 2). It is therefore among the largest polyketides formed by a minimal PKS II (LlpD, E, F) with cyclases (LlpCI–CIII) via heterologous expression. It could be anticipated that the proposed structure of lysoquinone-TH2 (**8**) further corroborates the native tridecaketide precursor and carries the additional carboxyl group (C-15) of the final extender acetate unit. The continuous labeled-acetate chain of **7** corresponds with the pattern of the further post-PKS-processed antibiotic lysolipin I (**1**) [31].

**Figure 3.** (**A**) Biosynthetic origin of lysoquinone-TH1 (**7**) and the proposed structure of lysoquinone-TH2 (**8**): Biosynthesis hypothesis from feeding experiments with doubly labeled [1,2-13C2] acetate with the *S. albus* host strain and heterologous expression of the lysolipin minimal PKS genes (*llpD-F*) and cyclase genes (*llpCI-CIII*). (**B**) Oxytetracycline (**6**) from a biosynthesis primed with malonamide in the wild-type producer, and with two acetate units in the mutant producer.

A specific feature of the lysolipin polyketide biosynthesis is the priming by a malonate-derived starter unit [31]. However, in the identified lysoquinone-TH1 (**7**) as well as in the proposed structure of lysoquinone-TH2 (**8**), two acetate units replace the original C3-malonyl/malonamide starter unit. Corresponding observations were also made for oxytetracycline (**6b**) with a malonamide starter unit, since heterologous expression of the oxytetracycline minimal PKS of *Streptomyces rimosus* in *S. coelicolor* yielded tetracycline analogue **14** with two acetate units priming the polyketide backbone instead [32] (Figure 3). Co-expressing the *oxyD* gene with the oxytetracycline minimal PKS in heterologous expression experiments has shown that it is presumably responsible for generating the characteristic malonamate starter unit. *oxyD* codes for an amidotransferase and is homologous to the putative amidotransferase LlpA of lysolipin biosynthesis. Therefore, priming of the PKS biosynthesis with malonamate and formation of an N-heterocyclic product does not require additional enzymes [10].

Studies on hybrid PKS pathways with benastatin A (**4**) revealed isolated hybrid PKS-II products in which the number of extender units increased if shorter starter units were used. In analogy, based on the observations on the identified lysoquinone-TH1 (**7**) and the proposed structure of lysoquinone-TH2 (**8**), it can be anticipated that the number of elongation steps is dependent on the length of the polyketide chain [33] fitting into the substrate pocket of the KSα/KSβ-complex.

#### *2.4. Biological Activity of Lysoquinone-TH1 (7)*

Lysolipin (**1**) is highly active against various Gram-positive bacteria and shows antifungal activity. While the molecular target of lysolipin (**1**) still remains elusive, there is strong indication that this xanthone antibiotic targets the bacterial cell envelope [12,34].

Because of the high antibiotic activity of lysolipin (**1**) in the nM range, biological assays with lysoquinone-TH1 (**7**) were performed. Only weak antibiotic activities at 100 μM concentration of lysoquinone-TH1 (**7**) were observed against the Gram-positive strains *Staphylococcus lentus*, *Staphylococcus epidermidis* and *Propionibacterium acnes* with inhibition of 39%, 66% and 74%, respectively, in comparison to the positive control chloramphenicol [35]. Therefore, no IC50 values were determined.

KS-619-1 (**12**) and K-259-2 are inhibitors of the cyclic nucleotide phosphodiesterase (PDE4) [28,36–38]. The enzyme PDE4 is a very attractive target for the treatment of asthma, chronic obstructive pulmonary disease (COPD), psoriasis, schizophrenia, diet induced obesity, glucose intolerance and multiple sclerosis [39–44]. PDE4 addresses cyclic nucleotides like cAMP and cGMP and degrades these cellular messengers. Because these messengers possess regulatory functions in almost all cells, PDE4 is regarded as an important target [45]; detailed studies with different inhibitors are therefore needed to evaluate an ideally selective bioactivity.

For profiling lysoquinone-TH1 (**7**), an enzyme assay using PDE-4B2 was carried out according to Schulz et al. [39]. A clearly defined IC50 value could not be determined in this assay due to an additional luminescence signal derived from the chromophore of lysoquinone-TH1 (**7**), an extensive polyaromatic skeleton. However, an IC50 range of 10–20 μM could be inferred from the assays against the PDE-4B2 enzyme. The MIC (minimal inhibitory concentration) of lysoquinone-TH1 (**7**) was determined to be 2.33 μM (±0.04), corresponding to 3.368 log(nM) (±0.007) (Figure S7). The standard used in these assays was rolipram (IC50 = 0.8 μM (±0.1)) [39], an optimized and approved drug. Lysoquinone-TH1 (**7**), a completely new substance, is only 10-fold less active than this well-known PDE4 inhibitor rolipram. When lysolipin (**1**) was tested in the same assay, no PDE4 inhibition was detected up to a concentration of 50 μM.
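The correspondence between the two ways the MIC is reported, in μM and as log(nM), follows from a base-10 logarithm. The short sketch below is an illustration added for clarity, assuming that "log" denotes log10; the function names are hypothetical. It converts the reported log(nM) value and its uncertainty back to μM.

```python
import math


def log_nm_to_um(log_nm):
    """Convert a log10(concentration in nM) value to a concentration in uM."""
    return 10 ** log_nm / 1000.0


def um_to_log_nm(conc_um):
    """Convert a concentration in uM to log10(concentration in nM)."""
    return math.log10(conc_um * 1000.0)


log_value, log_error = 3.368, 0.007

mic_um = log_nm_to_um(log_value)
# A symmetric error in log space maps to a slightly asymmetric interval in
# linear space; half the interval width is used here as a single number.
half_width_um = (
    log_nm_to_um(log_value + log_error) - log_nm_to_um(log_value - log_error)
) / 2

print(f"{mic_um:.2f} uM +/- {half_width_um:.2f} uM")
```

With these inputs the back-conversion reproduces the reported 2.33 μM (±0.04).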

#### **3. Materials and Methods**

#### *3.1. Cloning of the Lysolipin Minimal PKS*

The genes of the lysolipin minimal PKS II (*llpD*, *llpE*, *llpF*), which are surrounded by the cyclases *llpCI*, *llpCII*, and *llpCIII*, were amplified by PCR using the primers minPKScyc-fw-Hind (aaa gct tga gta gcc aaa cgg gtt c) and minPKScyc-revSs (aag aat tca ata ttg tgc cca cca gta cac) and the template cosmid 4H04 [21] using the ProofStart PCR polymerase kit (*Qiagen*). The PCR program used in a PTC-100 thermocycler (MJ Research, Waltham, MA, USA) was: 95 °C for 5 min; 30 cycles of (94 °C for 90 s, 62 °C for 90 s, 72 °C for 4 min); 72 °C for 10 min. The PCR product was then cloned into pSETermE\*p [24] (Combinature Biopharm AG, Berlin, Germany) via the EcoRI/HindIII restriction sites. The resulting plasmid pCU1 was checked by restriction and DNA sequencing.
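As a quick plausibility check of the cloning strategy, the two primer sequences can be searched for the recognition sequences of the enzymes used for cloning. The sketch below is an illustration, not part of the published protocol; it assumes the standard recognition sequences AAGCTT (HindIII) and GAATTC (EcoRI), and the helper names are hypothetical. It also sums the programmed hold times of the PCR program, ignoring ramp times.

```python
# Recognition sequences of the two restriction enzymes used for cloning.
RESTRICTION_SITES = {"HindIII": "AAGCTT", "EcoRI": "GAATTC"}

PRIMERS = {
    "minPKScyc-fw-Hind": "aaa gct tga gta gcc aaa cgg gtt c",
    "minPKScyc-revSs": "aag aat tca ata ttg tgc cca cca gta cac",
}


def normalize(sequence):
    """Remove whitespace and convert the sequence to upper case."""
    return "".join(sequence.split()).upper()


def find_sites(sequence):
    """Return {enzyme: 0-based position} for every site present in the primer."""
    seq = normalize(sequence)
    return {
        enzyme: seq.find(site)
        for enzyme, site in RESTRICTION_SITES.items()
        if site in seq
    }


def pcr_hold_time_min(initial_min=5, cycles=30, cycle_steps_s=(90, 90, 240), final_min=10):
    """Total programmed hold time in minutes (ramp times not included)."""
    return initial_min + cycles * sum(cycle_steps_s) / 60 + final_min


for name, primer in PRIMERS.items():
    print(name, len(normalize(primer)), "nt", find_sites(primer))
print("programmed hold time:", pcr_hold_time_min(), "min")
```

The forward primer carries a HindIII site and the reverse primer an EcoRI site, both close to the 5′ end, which matches the EcoRI/HindIII cloning described above.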

The plasmid pCU1 was introduced into *Streptomyces albus* J1074 via a standard intergeneric conjugation protocol as, for example, described in [46].

#### *3.2. Culture Conditions*

A pre-culture with medium G20 (600 mL) in six 300 mL Erlenmeyer flasks was inoculated with *S. albus* J1074 and grown with apramycin (50 μg/mL) for 48 h at 28 °C and 180 rpm (B. Braun Certomat HK with shaker B. Braun Certomat U, B. Braun, Melsungen, Germany). The main culture was inoculated with 400 mL of the pre-culture and was grown in medium M65 (3.6 L) under selection with apramycin (50 μg/mL) in a fermenter (B. Braun Biostat B, B. Braun, Melsungen, Germany) for 96 h at 28 °C and 300 rpm. After 24 h, 15 g/300 mL of sterile XAD-16 was added to the culture. Nutrient solutions: G20 (glycerol (20 g), malt extract (10 g), yeast extract (4 g) in 1 L of tap water, pH = 7.2); M65 (malt extract (10 g), yeast extract (4 g), D-glucose (4 g), CaCO3 (2 g) in 1 L of tap water, pH = 7.2).

#### *3.3. Extraction and Isolation*

For initial detection, agar plates from *S. albus* incubation were extracted with ethyl acetate, the organic phases evaporated and the extract applied to silica gel TLC analysis (solvent cyclohexane/ethyl acetate/methanol 6:8:1, with 1% of trifluoroacetic acid added). For purification, 4 L of culture broth with XAD-16 was filtered over Celite to separate the mycelia from the liquid culture. The filtrate was autoclaved and discarded. The mycelia were extracted two times with acetone/methanol 7:3 and then in acetone/methanol 1:1 in an ultrasonic bath. After filtration, the organic phases were combined and evaporated, water was added, and the pH was adjusted to 4–5 with 1 M HCl. Extraction with ethyl acetate and evaporation gave the crude extract. RP silica gel was pretreated with 3–4 column volumes (CV) of pyridine and washed with 3–4 CV of water as a basic activation of the RP phase. After column conditioning with 1–2 CV of the solvent acetone/methanol 1:1, the extract was loaded, and red and violet fractions were selected, accompanied by TLC analysis on silica gel (see above) and LC-ESI-MS analysis (HPLC, Agilent 1100 series. Ion trap, Bruker Daltonic Esquire 3000+, He as reactant gas, with Data Analysis software, Bruker Daltonik, Bremen, Germany). Combined red fractions were dissolved in DMSO and purified by HPLC (Thermo Ultimate 3000, Thermo Scientific, Dreieich, Germany; column: Dr. Maisch, Ammerbuch-Entringen, Germany, Reprosil 120 C-4, 5 μm, 250 × 20 mm id; pre-column: Dr. Maisch, standard guard Reprosil 120 C-4, 30 × 20 mm id; flow rate: 13.0 mL/min; program: 20 min at 45% B, in 5 min to 75% B, 8 min at 75% B, in 2 min to 45% B, 8 min at 45% B; solvent: A = ammonium formate (20 mM) + 0.1% TFA in water, B = acetonitrile). A retention time from 12 to 13 min was observed. After evaporation, a Sephadex LH-20 column (2 × 2 cm, methanol) followed by extraction three times with ethyl acetate and three times with diisopropyl ether was necessary for desalting the sample. This work-up procedure gave 3.4 mg of lysoquinone-TH1 (**7**) from four liters of culture.
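The preparative HPLC program described above can be written down as a piecewise-linear gradient table, which makes it easy to look up the eluent composition at a given time. The following is a minimal sketch under the assumption that "in 5 min to 75% B" denotes a linear ramp; the names `SEGMENTS`, `percent_b` and `volume_b_ml` are illustrative only.

```python
# Each segment: (duration in min, %B at start, %B at end), in program order.
SEGMENTS = [
    (20, 45, 45),  # isocratic hold at 45% B
    (5, 45, 75),   # ramp to 75% B
    (8, 75, 75),   # hold at 75% B
    (2, 75, 45),   # return to 45% B
    (8, 45, 45),   # re-equilibration at 45% B
]
FLOW_ML_PER_MIN = 13.0


def total_runtime_min():
    """Length of the whole program in minutes."""
    return sum(duration for duration, _, _ in SEGMENTS)


def percent_b(t_min):
    """Percentage of solvent B (acetonitrile) at time t, by linear interpolation."""
    if t_min < 0 or t_min > total_runtime_min():
        raise ValueError("time outside of the gradient program")
    start = 0.0
    for duration, b_start, b_end in SEGMENTS:
        if t_min <= start + duration:
            return b_start + (b_end - b_start) * (t_min - start) / duration
        start += duration
    return float(SEGMENTS[-1][2])


def volume_b_ml():
    """Volume of solvent B consumed per run (mean %B of each segment x flow x time)."""
    return sum(
        FLOW_ML_PER_MIN * duration * (b_start + b_end) / 2 / 100
        for duration, b_start, b_end in SEGMENTS
    )


print("runtime:", total_runtime_min(), "min")
print("%B at 12.5 min:", percent_b(12.5))
print("acetonitrile per run:", round(volume_b_ml(), 1), "mL")
```

Under this reading, the reported retention time of 12 to 13 min falls within the initial isocratic segment at 45% B.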

#### *3.4. Feeding Experiment with [1,2-13C2]-Labeled Acetate*

For the feeding experiment, 2.0 g of the doubly labeled [1,2-13C2] acetate (99% enrichment; Cambridge Isotope Laboratories, Inc., Tewksbury, MA, USA), which corresponds to a final concentration of 5.95 mM in the fermenter (4 L, B. Braun Biostat B), was added to the culture broth after 32 h of cultivation. The isotope-labeled lysoquinone-TH1 (**7**) was purified with the same protocol as described above and subjected to NMR analysis (13C-NMR, 1H-NMR, HMBC, HSQC).
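The stated final concentration can be reproduced from the amount added and the culture volume. The sketch below is a worked check under one explicit assumption that is not stated in the text: the labeled precursor is taken to be the sodium salt, sodium [1,2-13C2]acetate, whose molar mass is computed here from standard atomic masses. The function name is illustrative.

```python
# Atomic masses in g/mol; 13C is used for both carbons of the labeled acetate.
MASS_13C = 13.00335
MASS_H = 1.008
MASS_NA = 22.990
MASS_O = 15.999

# ASSUMPTION: the precursor is the sodium salt, 13C2H3NaO2.
MOLAR_MASS_LABELED_SODIUM_ACETATE = 2 * MASS_13C + 3 * MASS_H + MASS_NA + 2 * MASS_O


def final_concentration_mm(mass_g, volume_l, molar_mass_g_per_mol):
    """Final concentration in mmol/L after dissolving mass_g in volume_l."""
    return mass_g / molar_mass_g_per_mol / volume_l * 1000.0


concentration = final_concentration_mm(2.0, 4.0, MOLAR_MASS_LABELED_SODIUM_ACETATE)
print(f"molar mass: {MOLAR_MASS_LABELED_SODIUM_ACETATE:.2f} g/mol")
print(f"final concentration: {concentration:.2f} mM")
```

With the sodium salt assumed, 2.0 g in 4 L gives the 5.95 mM stated in the text.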

#### *3.5. Biological Activity Assays*

Antibacterial assays were carried out with the test strains *Staphylococcus lentus* DSM 6672, *Staphylococcus epidermidis* DSM 20044 and *Propionibacterium acnes* DSM 1897 using a cell viability test based on the reduction of resazurin to resorufin. Details on the cultivation conditions of the strains *S. epidermidis* and *P. acnes*, as well as on the evaluation of cell viability are described by Silber et al. [34]. The experiments with *S. lentus* were performed in the same manner as *S. epidermidis*. The positive control chloramphenicol was applied in a concentration of 10 μM for *S. lentus* and *S. epidermidis* and of 1 μM for *P. acnes*.

The effect of lysoquinone-TH1 (**7**) on PDE-4B2, a human recombinant cyclic adenosine monophosphate (cAMP) specific phosphodiesterase (BPS Bioscience no. 60042, San Diego, CA, USA), was determined in 96-well plates using the PDELight HTS cAMP Phosphodiesterase Kit (Lonza, LT07-600, Wuppertal, Germany). Lysoquinone-TH1 (**7**) was diluted in 50 mM Tris-HCl buffer (pH 7.5) containing 8.3 mM MgCl2 and 1.7 mM EGTA. 10 μL of each dilution was transferred to a well, and 20 μL of PDE-4B2 solution (0.25 U/μL) was added. The reaction was started by adding 10 μL of 12 mM cAMP (Sigma A9501, Taufkirchen, Germany) dissolved in 50 mM Tris-HCl buffer (pH 7.5) containing 8.3 mM MgCl2 and 1.7 mM EGTA to each well of the microtiter plate. PDE-4B2 hydrolysed cAMP to adenosine monophosphate (AMP). After incubation at 30 °C for 30 min, the reaction was stopped by transferring 30 μL of a solution containing 10 μL PDELight Stop Solution and 20 μL PDELight AMP detection reagent. The detection reagent converted AMP to ATP, and luciferase catalyzed the formation of light from ATP and luciferin. The emitted light is proportional to the level of AMP produced. AMP was quantified after incubation at 30 °C for 10 min by measuring the luminescence using the microtiter plate reader Infinite M200 (Tecan, Crailsheim, Germany) with 0.1 s integration time. The assays were performed in duplicate. Rolipram (4-[3-(cyclopentyloxy)-4-methoxyphenyl]-2-pyrrolidinone) was used as positive control.
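The pipetting scheme above fixes the composition of each reaction well before the stop solution is added. The following sketch simply works through that arithmetic; it is an illustration using only the volumes and stock concentrations given in the text, and the function and key names are hypothetical. Whether the inhibitor concentrations reported in the Results refer to the dilution or to the final well concentration is not stated and is not assumed here.

```python
def well_composition(
    compound_ul=10.0,
    enzyme_ul=20.0,
    camp_ul=10.0,
    camp_stock_mm=12.0,
    enzyme_u_per_ul=0.25,
):
    """Reaction-well composition before addition of the stop/detection mix."""
    total_ul = compound_ul + enzyme_ul + camp_ul
    return {
        "total_ul": total_ul,
        "camp_final_mm": camp_stock_mm * camp_ul / total_ul,
        "enzyme_units": enzyme_u_per_ul * enzyme_ul,
        "compound_dilution_factor": total_ul / compound_ul,
    }


well = well_composition()
for key, value in well.items():
    print(key, value)
```

Each well therefore holds 40 μL with 3 mM cAMP and 5 U of enzyme, and each compound dilution is diluted a further four-fold in the well.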

#### **4. Conclusions**

Lysoquinone-TH1 (**7**) is a "non-natural natural product", a pentangular aromatic polyketide derived from engineering of the lysolipin biosynthetic pathway. It was produced with *Streptomyces albus* as host expressing the minimal PKS II genes (*llpD-F*) in combination with three cyclases (*llpCI-CIII*) of the lysolipin gene cluster. In a bioactivity profiling study, it was shown that lysoquinone-TH1 (**7**) only has weak antibacterial activity, but instead is an inhibitor of phosphodiesterase 4 (PDE4) which is a target for treatment of pulmonary diseases. The lysoquinone-TH1 (**7**) biosynthetic pathway, which was deduced based on NMR data and supported by the new but postulated analogue lysoquinone-TH2 (**8**) also provides insights on the biosynthesis of lysolipin I (**1**). Evidence is given for the acetate-derived tridecaketide backbone of **7** in contrast to the dodecaketide malonyl-derived (malonyl- or malonamide CoA) precursor chain of the parent compound lysolipin I (**1**).

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/3/53/s1. Figure S1: Plasmid map of pCU1; Figure S2: HPLC and LC-MS of lysoquinone-TH1 (7) and proposed lysoquinone-TH2 (8); Figure S3: NMR spectra of lysoquinone-TH1 (7); Figure S4: NMR spectra of 13C-enriched lysoquinone-TH1 (7); Figure S5: 2D-NMR data of lysoquinone-TH1 (7); Figure S6: 13C enrichment of 7; Figure S7: PDE-4B2 inhibition assay with lysoquinone-TH1; Table S1: Level of enrichment and specific incorporation of lysoquinone-TH1 (7), resulted from the feeding experiment with doubly labeled [1,2-13C2] acetate.

**Author Contributions:** T.H. performed strain cultivation and feeding experiments, designed and performed chemical preparative isolation and chemical analyses, wrote the draft manuscript with contributions of all authors. C.E.-M.U. designed and performed cloning of the minimal PKS. J.W. and J.F.I. designed and performed bioactivity studies. W.W., S.G. and T.W. conceived the study, all authors contributed and approved the manuscript.

**Funding:** This work was supported by the German Ministry of Education and Research [GenBioCom 0315585A] to S.G., T.W. and W.W. T.W. is supported by a grant of the Novo Nordisk Foundation [CFB, grant NNF10CC1016517] and W.W. by a grant from the German Center for Infection Research [DZIF TTU 09.912, FKZ 8020809912].

**Acknowledgments:** We thank Bruker Daltonics, Bremen, Germany for valuable discussions.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

### **Novel Polyethers from Screening** *Actinoallomurus* **spp.**

**Marianna Iorio 1, Arianna Tocchetti 1, Joao Carlos Santos Cruz 2, Giancarlo Del Gatto 1, Cristina Brunati 2, Sonia Ilaria Maffioli 1, Margherita Sosio 1,2 and Stefano Donadio 1,2,\***


Received: 1 May 2018; Accepted: 13 June 2018; Published: 14 June 2018

**Abstract:** In screening for novel antibiotics, an attractive element of novelty can be represented by screening previously underexplored groups of microorganisms. We report the results of screening 200 strains belonging to the actinobacterial genus *Actinoallomurus* for their production of antibacterial compounds. When grown under just one condition, about half of the strains produced an extract that was able to inhibit growth of *Staphylococcus aureus*. We report here on the metabolites produced by 37 strains. In addition to previously reported aminocoumarins, lantibiotics and aromatic polyketides, we describe two novel and structurally unrelated polyethers, designated α-770 and α-823. While we identified only one producer strain of the former polyether, 10 independent *Actinoallomurus* isolates were found to produce α-823, with the same molecule as the main congener. Remarkably, production of α-823 was associated with a common lineage within *Actinoallomurus*, which includes *A. fulvus* and *A. amamiensis*. All polyether producers were isolated from soil samples collected in tropical parts of the world.

**Keywords:** *Actinoallomurus*; antibiotics; polyethers; screening

#### **1. Introduction**

Antimicrobial resistance among bacterial pathogens is becoming a major threat to human health and well-being. While different approaches can be deployed to mitigate and delay the emergence and spread of antibiotic resistance, it is also clear that we will need a constant supply of new antibiotics, especially new chemical classes not affected by current resistance mechanisms. However, new chemical classes of antibiotics have been extremely difficult to discover from combinatorial and chemical libraries, and microbial products still represent a major source of antibiotic drug leads [1].

One of the main issues with antibiotic discovery based on microbial products is the probability of rediscovering known metabolites. This requires introducing one or more elements of novelty in the screening with respect to past efforts [2,3]. An attractive element of novelty can be represented by using novel strains, for example a taxonomic group that has not witnessed extensive analyses of its secondary metabolites, since taxonomic diversity can be seen as a surrogate for chemical diversity [4]. The main idea behind this concept is that organisms that have been subjected to different evolutionary pressures have developed unique biology to survive and, for some taxa, secondary metabolites are an important part of their biology. However, since production of secondary metabolites is not distributed equally among all species, it is important to select a taxon with a high potential to produce bioactive compounds in order to increase the probability of finding new compounds with a reasonable screening effort. Following this rationale, we initiated over a decade ago a project aimed at finding taxonomically divergent filamentous *Actinobacteria*, which led to the discovery of several novel taxa, including new suborders, families and genera [5–8]. One of the new taxa, originally designated as "alpha" [5], turned out to coincide with the genus *Actinoallomurus* (family *Thermomonosporaceae*), formally described in 2009 [9] with new entries added since [10–15]. With proper isolation methods, strains belonging to the genus *Actinoallomurus* could be effectively retrieved from a variety of soil samples, enabling the creation of a consistent collection of about 1000 isolates [2,12].

Strains belonging to the genus *Actinoallomurus* have been shown to produce a variety of metabolites [12,15–19] originating from different biosynthetic pathways. In this study, we explored 200 randomly picked *Actinoallomurus* isolates from the NAICONS strain collection. Together with the metabolites previously described [16–19], we analyzed the antibacterial compounds produced by 37 strains. This set of compounds includes two novel polyethers, as described here.

#### **2. Results and Discussion**

#### *2.1. The Screened Set*

The selected *Actinoallomurus* strains were isolated from a variety of samples collected in different continents, and representing diverse environments such as densely vegetated areas, sulfur-enriched craters of volcanic origin, and plant rhizosphere. The geographic distribution of the screened strains is listed in Table 1.


**Table 1.** Geographic origin of the analyzed strains.

<sup>a</sup> All from tropical countries; <sup>b</sup> All from tropical countries, except four strains from continental USA.

Three extract types were prepared from the strains (see Materials and Methods) and evaluated for their ability to inhibit growth of *Staphylococcus aureus* and of a Δ*tolC* mutant of *Escherichia coli*. Overall, 104 and 17 strains produced at least one extract with activity against *S. aureus* and *E. coli*, respectively. All extracts with activity against *E. coli* were also active against *S. aureus*. The highest activity was observed in the mycelium and the ethyl acetate extracts at comparable frequency (57 and 46, respectively), and only in one case was the ethyl acetate exhaust extract more active. Except perhaps for an under-representation of active strains isolated from Asian samples, there was no apparent effect of the continent of origin on the frequency of anti-staphylococcal activity (Table 1). Positive extracts were analyzed as described under Materials and Methods, leading to preliminary information on the chemical identity of the identified compounds. We report below the characterization of the molecules identified from 37 of the active strains.

#### *2.2. Coumermycins, Spirotetronates, Lantibiotics and Diketopiperazines*

Coumarin antibiotics target bacterial DNA gyrase and one member of this family, novobiocin, has been used to treat bacterial infections in humans caused by Gram-positive bacteria [20]. Coumermycins are other members of this family with higher antibacterial activity than novobiocin [20]. We have previously reported that *Actinoallomurus* sp. K275, belonging to the Alp18 phylotype, produced several members of the coumermycin complex [12]. In the course of screening the 200 strains, two additional coumermycin producers were identified: strains ID145250 and ID145519, belonging to the Alp22 phylotype. The main coumermycin congeners produced by these three strains were A2, D1 and A1, respectively. Relevant data are shown in Supplementary Materials Figure S1. Coumermycins have been reported mostly from *Streptomyces* spp. [20].

Tetronate-containing polyketide natural products represent a large and diversified family of microbial metabolites with different bioactivities [21]. Halogenated spirotetronates designated NAI-414 A and B were previously described as the main products of *Actinoallomurus* sp. ID145414 [16]. During our screening, two strains (ID145260 and ID145814) were found to produce a molecule with *m/z* [M−H]<sup>−</sup> 839, an isotopic pattern compatible with the presence of two chlorines and a UV spectrum with maxima at 228 and 268 nm (see Supplementary Materials Figure S2). These properties closely resemble those reported for NAI-414 and actually match those reported for pyrrolosporin, a compound structurally related to NAI-414 but containing an additional unsaturation in the polyketide backbone. Pyrrolosporin was previously reported as a metabolite from a *Micromonospora* sp. [22,23].

Ribosomally synthesized and post-translationally modified peptides represent a rapidly expanding family of microbial metabolites, with lantibiotics as one of the better-known representatives [24]. *Actinoallomurus* sp. ID145699 was previously reported to produce the chlorinated lantibiotic NAI-107 and its brominated variant in Br-supplemented medium [19]. In the course of our screening, we identified strain ID145640 as an additional NAI-107 producer, with the known variations at Trp4 (hydrogen or chlorine) but just zero or one hydroxylation at Pro14 (Supplementary Materials Figures S3 and S4). Previously, NAI-107 was reported as the product of two independent *Microbispora* isolates [25,26].

Diketopiperazines represent a broad family of cyclized dipeptides produced by a large variety of microorganisms [27]. It is thus no surprise that one of the strains in the present work, *Actinoallomurus* sp. ID145219, produced two compounds with activity against *S. aureus* that were identified as cyclo-Phe-Leu and cyclo-Phe-Phe (see Supplementary Materials Figure S5). The structures of the metabolites mentioned in Section 2.2 are illustrated in Figure 1.

**Figure 1.** Chemical structures of molecules produced by *Actinoallomurus* and described in Section 2.2.

#### *2.3. Aromatic Polyketides*

During the course of our screening, we frequently encountered strains producing aromatic polyketides. Identified products included the allocyclinones, hyper-halogenated angucyclinones detected from twelve independent strains belonging to three different phylotypes [17]; the paramagnetoquinones, highly paramagnetic tetracenes produced by three independent strains belonging to three different phylotypes [18]; and the related dihydrobenzo-(alpha)-naphthacenequinones pradimicin (strains ID145114 and ID145318) and benanomicin (strain ID145226), found in three producers (see Supplementary Materials Figure S6). Pradimicin and benanomicin share a common polyketide core decorated with a disaccharide unit and differ in the presence or absence of an *N*-methyl on the aminated sugar. Both compounds were previously reported as products of *Actinomadura* spp., with pradimicin produced by a confirmed species of the genus, *Actinomadura hibisca* [28]. Benanomicin had already been reported as a metabolite of *Actinoallomurus* strain K15 [12].

Overall, this brief survey of aromatic polyketides indicates that *Actinoallomurus* spp. are capable of producing decaketides (i.e., paramagnetoquinones), undecaketides (i.e., allocyclinones, presumably undergoing oxidative ring cleavage after polyketide formation) and dodecaketides (i.e., pradimicin and benanomicin). The structures of these metabolites are shown in Figure 2.

**Figure 2.** Chemical structures of aromatic polyketides produced by *Actinoallomurus* and described in Section 2.3.

#### *2.4. Polyethers*

The extracts from several strains presented large inhibition halos against *S. aureus* but little or no activity against the *E. coli* Δ*tolC* strain. Upon resolution by high performance liquid chromatography (HPLC), the active fractions showed a retention time of 5–11 min and, with one exception, had no ultraviolet (UV) absorption. Mass spectrometry (MS) analysis indicated the presence of *m/z* signals consistent with the formation of NH4<sup>+</sup> and Na<sup>+</sup> adducts but with no detectable H<sup>+</sup> adducts. It should be noted that neither the extraction procedure nor the liquid chromatography (LC)-MS eluent contains ammonium or sodium ions. Hence, the observation of NH4<sup>+</sup> and Na<sup>+</sup> adducts suggests a high cation-binding ability of the active molecules. Moreover, the fragmentation pattern showed losses of 44 amu (free carboxylic acid) and 62 amu (decarboxylation and dehydration). As demonstrated below, we identified three distinct polyether families within twelve strains: one new compound, designated α-823, produced by ten independent isolates; an additional new polyether, designated α-770, and the previously reported octacyclomycin, produced by one strain each. Table 2 lists the identified polyether-producing *Actinoallomurus* isolates, along with their origins, the accession numbers of the 16S rRNA gene sequences and the *m/z* value of the most abundant congener.


**Table 2.** Polyether-producing *Actinoallomurus* strains.

<sup>a</sup> On the basis of the 16S rRNA gene sequence.

Several strains produced a likely polyether with major *m/z* signals [M+NH4]<sup>+</sup> 932 and [M+Na]<sup>+</sup> 937 (see Figure 3a,b for a representative example). The metabolites produced by all these strains appeared identical (Table 2) and those from strain ID145823 were analyzed in detail. The strain produced a complex of related molecules (Figure 3a; Supplementary Materials Table S1) with similar HPLC retention times (they all eluted at ≥90% acetonitrile; see Supplementary Materials Figure S7), appearing as both [M+NH4]<sup>+</sup> and [M+Na]<sup>+</sup> adducts, and with similar fragmentation patterns (Supplementary Materials Figure S7). The deduced molecular formulae indicate that the congeners varied in methyl group(s) and oxygen(s) (Supplementary Materials Table S1). The structure of the major congener, designated α-823, was elucidated by a combination of NMR (Supplementary Materials Figures S8–S13), HR-ESI-MS (Figure S14) and MS/MS analyses. The molecular formula was defined as C48H82O16Na (calculated 937.5495 [M+Na]<sup>+</sup>, found 937.5510 [M+Na]<sup>+</sup>). Analysis of the mono-dimensional <sup>1</sup>H spectrum revealed the presence of four singlet and six doublet methyl signals, along with four methoxy groups. Moreover, several diastereotopic methylene signals were observed using 2D-HSQC (bi-dimensional Heteronuclear Single Quantum Coherence) experiments, indicating CH2 inserted into rigid structures or close to stereocenters. COSY (COrrelated SpectroscopY) and TOCSY (TOtal Correlated SpectroscopY) analyses, along with HMBC (Heteronuclear Multiple Bond Correlation), resulted in the structure shown in Figure 4. α-823 consists of a C30 chain with three substituted tetrahydrofuranes and three substituted tetrahydropyranes. Tetrahydrofurane C carries a deoxysugar (Figure 4). Structurally, α-823 closely resembles the polyether SF-2361, produced by an *Actinomadura* sp. [29,30]. Despite an identical molecular formula, α-823 carries a methyl at C-6, while a methyl group in SF-2361 has been assigned to C-2. Indeed, in the α-823 spectrum C-2 is a free methylene with δ<sub>H</sub> signals at 2.18–2.54 ppm and δ<sub>C</sub> at 44.7 ppm (due to the proximity to a hemiacetal and a carboxylic acid), while C-6 carries no proton (δ<sub>C</sub> at 78.6 ppm) and shows HMBC correlations with a methoxy at 3.38 ppm and a singlet methyl at 1.16 ppm.
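The calculated adduct masses quoted for α-823 can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' procedure: it assumes the neutral formula C48H82O16 (the reported sodiated formula minus Na), uses standard monoisotopic atomic masses, and all function and variable names are illustrative.

```python
# Monoisotopic masses (u) of the lightest stable isotopes; standard reference values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Na": 22.98976928,
}
ELECTRON_MASS = 0.00054858  # u


def monoisotopic_mass(composition):
    """Sum the monoisotopic masses for a composition such as {"C": 48, "H": 82}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def cation_adduct_mz(neutral, adduct):
    """m/z of a singly charged [M+adduct]+ ion: neutral plus adduct, minus one electron."""
    return monoisotopic_mass(neutral) + monoisotopic_mass(adduct) - ELECTRON_MASS


def ppm_error(found, calculated):
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


# Assumption: neutral formula inferred from the reported sodiated formula C48H82O16Na.
ALPHA_823 = {"C": 48, "H": 82, "O": 16}

mz_sodium = cation_adduct_mz(ALPHA_823, {"Na": 1})
mz_ammonium = cation_adduct_mz(ALPHA_823, {"N": 1, "H": 4})

print(f"[M+Na]+  calculated m/z: {mz_sodium:.4f}")
print(f"[M+NH4]+ calculated m/z: {mz_ammonium:.4f}")
print(f"error vs. found 937.5510: {ppm_error(937.5510, mz_sodium):.1f} ppm")
```

Under these assumptions the sodium adduct agrees with the calculated value of 937.5495 given above, and the measured value of 937.5510 lies within about 2 ppm of it.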

Strain ID145817 was found to produce a bioactive compound eluting at 9.0 min with no UV absorption. Upon MS analysis, it showed *m/z* signals [M+NH4]<sup>+</sup> 1034 and [M+Na]<sup>+</sup> 1039, corresponding to the NH4<sup>+</sup> and Na<sup>+</sup> adducts of a molecule of 1016 amu, with major fragments at 990–995 and 972–977 (Supplementary Materials Figure S15). Additionally, a Δ*m*/*z* of 129 suggested the elimination of a deoxysugar. All these properties are consistent with the metabolite produced by strain ID145817 being identical to octacyclomycin, a di-glycosylated polyether previously reported from a *Streptomyces* sp. [31]. NMR analysis of the purified compound confirmed this hypothesis, showing signals identical to those reported in the literature for octacyclomycin (data not shown). The structure of octacyclomycin is reported in Figure 4. Octacyclomycin, SF-2361 and α-823 derive from a C30 chain with an identical sequence of tetrahydrofuranes and tetrahydropyranes, but differ in the number and position of methyls, oxygens and glycosyl moieties.
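The adduct and fragment values quoted for octacyclomycin follow from the neutral mass and the 44 and 62 amu losses noted for the polyethers. The sketch below uses nominal (integer) masses only and rests on one assumption: that the fragment ranges pair the ion derived from the NH4<sup>+</sup> adduct with the one derived from the Na<sup>+</sup> adduct. Names are illustrative.

```python
# Nominal (integer) masses, matching the unit-resolution values quoted in the text.
ADDUCTS = {"[M+NH4]+": 18, "[M+Na]+": 23}
NEUTRAL_LOSSES = {"CO2": 44, "CO2 + H2O": 62}


def adduct_mz(neutral_mass):
    """Nominal m/z of each cation adduct of a neutral molecule."""
    return {name: neutral_mass + delta for name, delta in ADDUCTS.items()}


def fragments_after_loss(neutral_mass):
    """Nominal m/z of each adduct after each neutral loss."""
    ions = adduct_mz(neutral_mass)
    return {
        loss: {name: mz - delta for name, mz in ions.items()}
        for loss, delta in NEUTRAL_LOSSES.items()
    }


OCTACYCLOMYCIN = 1016  # neutral mass (amu) stated in the text

print(adduct_mz(OCTACYCLOMYCIN))
for loss, ions in fragments_after_loss(OCTACYCLOMYCIN).items():
    print(loss, ions)
```

With a neutral mass of 1016 amu this gives adducts at 1034 and 1039, fragments at 990 and 995 after loss of CO2, and fragments at 972 and 977 after loss of CO2 plus water.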

**Figure 3.** Analysis of α-823. (**a**) Base peak chromatogram of the 4.0–10.0 min portion with retention times and *m/z* [M+NH4]<sup>+</sup> values. Data obtained with a partially purified extract of *Actinoallomurus* sp. ID145823 (see Table S1 and Figure S7 for the congeners comparison); (**b**) mass spectrometry (MS) at 8.1 min in positive (above) and negative (below) ionization mode; (**c**) MS2 analysis of *m/z* [M+NH4]<sup>+</sup> 932; (**d**) putative fragmentation pathway for α-823.

Strain ID145770 was found to produce an active peak eluting at 6.9 min and showing the polyether-diagnostic *m/z* signals [M+NH4]<sup>+</sup> 852 and [M+Na]<sup>+</sup> 857, corresponding to a molecule of 834 amu (Figure 5a,b). Unlike the other polyethers, however, this peak showed a UV signal with a maximum at 314 nm (Figure 5a) and a Δ*m*/*z* of 135 upon MS fragmentation, consistent with the presence of a methylsalicylate moiety (Figure 5c), a chromophore previously found in the polyether cationomycin [32]. The active molecule was produced along with a related Δ*m*/*z* +14 species, consistent with the presence of an extra methyl group, as listed in Table S3 and shown in Figure S16. The structure of the major congener α-770 was elucidated by a combination of NMR (Figures S17–S22), HR-ESI-MS (Figure S23) and MS/MS analyses (Figure 5). The molecular formula was defined as C45H70O14Na (calculated 857.4658 [M+Na]<sup>+</sup>, found 857.4695 [M+Na]<sup>+</sup>). Analysis of the <sup>1</sup>H and HSQC spectra revealed the presence of 2 methoxy groups along with 10 methyl groups, four of them devoid of multiplicity. Moreover, several diastereotopic methylene signals were observed, indicating CH2 inserted into rigid structures or close to stereocenters. The 2D-NMR experiments allowed the carbons to be assigned to a structure consisting of four substituted tetrahydrofuranes and one substituted tetrahydropyrane, as shown in Figure 4. The overall structure of α-770 is similar to that of cationomycin, produced by an *Actinomadura* sp. [32]. Both polyethers consist of a C27 chain with identical positioning of the methyl groups, suggesting a common origin from incorporation of the same sequence of propionate and acetate precursors [33]; and both polyethers carry a 6-methylsalicylate moiety linked through an ester bond to the C-3 hydroxyl. However, despite these similarities and the small mass difference (16 amu), α-770 and cationomycin differ significantly in the hydroxyl and methoxy decorations. Indeed, 2D-HMBC correlations established that α-770 lacks the hydroxyls at positions 15 and 5 , which are present as methoxy groups in cationomycin, as well as the hydroxyl at position 3. In contrast, α-770 carries methoxy groups at positions 11 and 21, while cationomycin has no hydroxyls at those positions.

**Figure 4.** Chemical structure of polyethers produced by *Actinoallomurus* and described in Section 2.4.

**Figure 5.** Analysis of α-770. (**a**) UV chromatogram at 230 nm and UV spectrum of the 6.9-min peak. Data obtained with a partially purified extract of *Actinoallomurus* sp. ID145770; (**b**) MS at 6.9 min in positive (above) and negative (below) ionization mode; (**c**) MS2 of *m/z* [M+NH4]<sup>+</sup> 852; (**d**) putative fragmentation pathway for α-770.

The carbon chain of polyethers is assembled by type I PKSs, followed by ring formation by dedicated epoxidases [33]. The carbon chains of α-770 and α-823 are likely to derive from trideca- and pentadeca-ketide precursors, respectively. In addition, the 6-methylsalicylate unit of cationomycin has been shown to result from acetate incorporation [34], consistent with the involvement of a type III PKS system.

The antimicrobial activity of polyethers is strictly connected to their ability to insert into cellular membranes and alter transport of metal cations, which leads to changes in the osmotic pressure inside the cytoplasm and cell death [30,35]. However, polyethers generally lack cellular selectivity. The antibacterial activities of α-823 and α-770 are reported in Table 3. They show potent activities against most Gram-positive bacteria, with minimal inhibitory concentrations (MICs) well below 1 μg/mL for α-770, the most active of the three compounds, with 2–4 times lower MICs than salinomycin against most of the tested strains. The polyether α-823 was generally 4–16 times less active than α-770, except for an increased activity against *Mycobacterium smegmatis*. No activities were detected against Gram-negative strains (not shown), except for *Moraxella catarrhalis*.

**Table 3.** MICs (Minimal Inhibitory Concentrations) of α-770 and α-823 in comparison with salinomycin.


<sup>a</sup> abbreviations: MRSA, methicillin-resistant *Staphylococcus aureus*; MSSA, methicillin-sensitive *Staphylococcus aureus*; GISA, glycopeptide-intermediate *Staphylococcus aureus*.

Some polyethers (e.g., salinomycin, monensin) are commercially used as coccidiostatic agents and salinomycin has also been evaluated as an anticancer agent [36]. Recently, salinomycin and other ionophores have been shown to have transmission-blocking activity against the etiological agent of malaria [37]. When tested against one chloroquine-sensitive and one chloroquine-resistant strain of *Plasmodium falciparum*, α-770 and α-823 showed inhibitory activity in the 2–10 nM range, comparable to that of salinomycin [37]. It should be noted that, although there are over 120 reported polyethers in the literature, their mechanism of action has been studied on a limited number of molecules [30] and we are not aware of studies aimed at making polyethers selective towards a particular cell type.

The 16S rRNA gene sequences of the polyether-producing *Actinoallomurus* strains of Table 2 were determined and compared to those of all described *Actinoallomurus* species. The results are shown in Figure 6. All ten α-823 producers cluster together in a compact branch that includes the type strains *Actinoallomurus fulvus* and *A. amamiensis*: specifically, the identical 16S rRNA gene sequences from strains ID145265, -145554, -145603 and -145830 are 100% identical to that from *A. fulvus*; the 16S rRNA gene sequences from ID145802 and -145828 are 99.9% and 100% identical, respectively, to that from *A. amamiensis*; while the identical 16S rRNA gene sequences from strains ID145804, -145811 and -145816 and that from strain ID145823 are less related to those of described species (Figure 6). The octacyclomycin producer ID145817 is less closely related (≤99.1% identity) to *A. bryophytorum* and *A. yoronensis*, although all these strains belong to a related phylogenetic branch. The α-770 producer, instead, is distantly related to *A. spadix* (98.7% identity) and belongs to an unrelated branch that includes, among others, strain ID145113, the producer of the aromatic polyketide paramagnetoquinone [18]. Remarkably, all polyether-producing *Actinoallomurus* strains were isolated from soil samples of tropical origin (Table 2), notwithstanding that 64% of the screened strains were of non-tropical origin (mostly from Europe; see Table 1). These observations suggest that the branch including *A. fulvus*, *A. amamiensis* and the α-823 producers might consist of cosmopolitan strains and that polyether production might be mostly associated with *Actinoallomurus* strains from tropical environments. Previous studies have established a correlation between different classes of secondary metabolites and geographic origin [38,39], although we are unaware of previous reports on the biogeography of polyether production.
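For readers unfamiliar with how percent-identity figures such as those above are obtained, the following minimal sketch computes identity between two pre-aligned sequences. It is an illustration only: the phylogenetic analysis in this work was performed as described in [40], the gap-handling rule (columns with a gap in either sequence are skipped) is an assumption, and the short sequences are hypothetical toy data rather than real 16S rRNA gene sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # assumption: gap columns are excluded from the comparison
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment: one mismatch in ten compared columns.
toy_query = "ACGTACGTAC"
toy_reference = "ACGTACGTAT"
print(f"{percent_identity(toy_query, toy_reference):.1f}% identity")
```

Other conventions, such as counting gaps as mismatches, give slightly different values, so identity figures are only comparable when computed the same way.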

#### **3. Materials and Methods**

#### *3.1. Bacterial Strains and Media*

*Actinoallomurus* strains are from the NAICONS strain library. Each strain was propagated on S1-5.5 plates (60 g/L oatmeal, 18 g/L agar, 1 mL/L Trace Elements Solution) at 30 °C for two to three weeks. From these plates, the grown mycelium was used to inoculate AF-A medium (10 g/L dextrose monohydrate, 4 g/L soybean meal, 1 g/L yeast extract, 0.5 g/L NaCl, 1.5 g/L 2-(*N*-morpholino) ethanesulfonic acid, pH adjusted to 5.6) in shake-flasks. After 8 days in a rotatory shaker (200 rpm) at 30 °C, cultures were harvested and extracted (see Section 3.2).

PCR amplifications with the eubacterial primers F27 and R1492 and phylogenetic analyses of the 16S rRNA gene sequences were performed as previously described [40]. The 16S rRNA gene sequences have been deposited in GenBank, as listed in Table 2.

#### *3.2. Preparation of Extracts*

Three different extracts were prepared from each culture. The culture was centrifuged at 16,000 rcf for 5 min and the resulting pellet was resuspended in 0.4 vol ethanol, while the supernatant was used for ethyl acetate extraction (see below). After shaking for 1 h at 55 °C, the suspension was centrifuged once more (16,000 rcf, 5 min) and the supernatant transferred to a new tube, dried under vacuum and resuspended in 10% DMSO at 0.2× the original culture volume. This extract was designated as the mycelium extract.

The supernatant from the first centrifugation step above was extracted with 0.5 vol ethyl acetate. After mixing and phase separation, the organic phase was transferred to a new tube and the aqueous phase extracted again with further 0.5 vol ethyl acetate. The two organic phases were combined, dried and resuspended at 5× the original concentration in 10% DMSO. This extract was designated as EtAc extract. The exhausted aqueous phase was also retained and tested as such.

#### *3.3. Antibacterial Assays*

The screening was performed by agar diffusion, using plates containing 15 mL of Müller-Hinton Agar and inoculated with 5 × 10<sup>5</sup> CFU/mL of the indicator strain. Strains used in this assay were *Staphylococcus aureus* ATCC 6538P and *Escherichia coli* L4242, a Δ*tolC* derivative of MG1061. After spotting 20 μL of the resuspended extract, plates were incubated for 18–20 h at 37 °C. After HPLC fractionation, bioactive fractions were identified using the same methodology.

MIC determinations of purified compounds were performed by broth microdilution in sterile 96-well polystyrene microtiter plates according to CLSI guidelines, using Müller-Hinton broth (Difco Laboratories) containing 20 mg/L CaCl2 and 10 mg/L MgCl2 for all strains except for *Streptococcus* spp., which were grown in Todd Hewitt broth. Strains were inoculated at 5 × 10<sup>5</sup> CFU/mL and incubated at 37 °C for 20–24 h. Strains with an L prefix are from the NAICONS pathogens library.

#### *3.4. Analytical Procedures*

To monitor metabolite production, analytical HPLC was performed on a Shimadzu Series 10 spectrophotometer (Kyoto, Japan), equipped with a reverse-phase column, LiChrospher RP-18, 5 μm, 4.6 × 125 mm (Merck, Darmstadt, Germany). Phase A was 0.1% trifluoroacetic acid (TFA), phase B acetonitrile, and the flow rate was 1 mL/min. Resolution was achieved with a linear gradient from 10% to 36% phase B in 5 min; from 36% to 50% phase B in 7 min; and from 50% to 80% phase B in 1 min; followed by a 4-min isocratic step at 80% phase B and column re-equilibration. UV detection was at 230 and 270 nm. LC-MS analyses were performed on a Dionex UltiMate 3000 coupled with an LCQ Fleet (Thermo Scientific) mass spectrometer equipped with an electrospray interface (ESI) and a tridimensional ion trap. The column was an Atlantis T3 C18 5 μm × 4.6 mm × 50 mm maintained at 40 °C at a flow rate of 0.8 mL/min. Phases A and B were 0.05% TFA in water and acetonitrile, respectively. The elution was with a 14-min multistep program that consisted of 10, 10, 95, 95, 10 and 10% phase B at 0, 1, 7, 12, 12.5 and 14 min, respectively. UV-VIS signals (190–600 nm) were acquired using the diode array detector. The *m/z* range was 110–2000 and the ESI conditions were as follows: spray voltage of 3500 V, capillary temperature of 275 °C, sheath gas flow rate at 35 units and auxiliary gas flow rate at 15 units.
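The 14-min LC-MS multistep program can be expressed as a piecewise-linear function of time. The sketch below is illustrative: it assumes linear ramps between the listed breakpoints and returns the programmed composition at the pump, ignoring system dwell volume and column transit time, so the eluent actually reaching the detector lags behind these values. Names are hypothetical.

```python
from bisect import bisect_right

# Breakpoints of the 14-min LC-MS program (time in min, % phase B), as listed in the text.
TIMES = [0.0, 1.0, 7.0, 12.0, 12.5, 14.0]
PERCENT_B = [10.0, 10.0, 95.0, 95.0, 10.0, 10.0]


def programmed_percent_b(time_min):
    """Programmed % phase B at a given time, by linear interpolation between breakpoints."""
    if not TIMES[0] <= time_min <= TIMES[-1]:
        raise ValueError("time outside the 0-14 min program")
    index = bisect_right(TIMES, time_min) - 1
    if index == len(TIMES) - 1:
        return PERCENT_B[-1]  # exactly at the final breakpoint
    t0, t1 = TIMES[index], TIMES[index + 1]
    b0, b1 = PERCENT_B[index], PERCENT_B[index + 1]
    return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)


for t in (0.5, 4.0, 8.1, 12.25):
    print(f"{t:5.2f} min -> {programmed_percent_b(t):.1f}% B")
```

Under these assumptions the program sits on the 95% phase B plateau between 7 and 12 min, the window in which the 8.1-min α-823 peak of Figure 3 falls.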

High resolution MS spectra were recorded at Unitech OMICs (University of Milano, Italy) using a Triple TOF® 6600 (Sciex) equipped with an ESI source. The experiments were carried out by direct infusion in positive ionization mode. The ESI parameters were the following: curtain gas 25 units, ion spray voltage floating 5500 V, temperature 50 °C, ion source gas1 10 units, ion source gas2 0 units, declustering potential 80 V, syringe flow rate 10 μL/min, accumulation time 1 s.

Mono- and bi-dimensional NMR spectra were measured in CDCl3 at 298 K using an AMX 400 MHz spectrometer. Chemical shifts are reported relative to CDCl3 (δ 7.26 ppm).

#### *3.5. Purification of Polyethers*

α-823: Nine parallel 100-mL cultures of *Actinoallomurus* sp. ID145823 in AF-A medium were harvested at seven days and filtered through paper under reduced pressure to separate the mycelium from the clear broth. The latter (860 mL) was extracted three times with 450 mL ethyl acetate while the mycelium was treated with 100 mL acetone, kept on a rotary shaker 1 h and centrifuged. The combined organic phases were dried under reduced pressure and dissolved in 2 mL dichloromethane. The sample was resolved on a 12 g direct-phase Flash column RediSep RF (Teledyne Isco) by using a CombiFlash RF Teledyne Isco medium-pressure chromatography system. The column was previously conditioned at 100% dichloromethane and then eluted at 15 mL/min with a 20-min linear gradient from 0 to 10% methanol. Fractions were analyzed by LC-MS and those with the highest purity were pooled and dried, obtaining 14 mg of purified α-823. Five mg were dissolved in CDCl3 for NMR analysis.

Octacyclomycin: Two parallel 100-mL cultures of *Actinoallomurus* sp. ID145817 in AF-A medium were harvested at seven days and filtered through paper under reduced pressure to separate the mycelium from the clear broth. The latter (170 mL) was extracted three times with 80 mL ethyl acetate while the mycelium was treated with 100 mL ethanol, kept 1 h on a rotary shaker and centrifuged. The combined organic phases were dried under reduced pressure and dissolved in 2 mL dichloromethane. The sample was resolved by medium-pressure chromatography as described above for α-823. Fractions were analyzed by LC-MS and processed as above. Four mg of purified octacyclomycin were obtained.

α-770: Two parallel 100-mL cultures of *Actinoallomurus* sp. ID145770 in medium M8 [41] were harvested at seven days. Mycelium was harvested by centrifugation (10 min at 4000 rpm), treated with 20 mL ethanol, kept 1 h on a rotary shaker and centrifuged. The organic phase was recovered, dried under reduced pressure, dissolved in 2 mL dichloromethane and resolved by medium-pressure chromatography as described for α-823, except that the flow rate was set at 30 mL/min. Fractions 11–14, which showed activity against *S. aureus*, were analyzed by LC-MS and the ones containing similar signals were pooled, dried and dissolved in dichloromethane. A further purification step was performed by preparative thin layer chromatography on silica gel (Analtech Preparative Silica Gel GF with UV254 2000 μm; Sigma-Aldrich, St Louis, MO, USA) in dichloromethane:methanol 9:1. The spot at Rf 0.9 was dried and dissolved in CDCl3 for NMR analysis. Four mg of purified α-770 were obtained.

#### **4. Conclusions**

When grown under one routine condition in shake-flasks and only looking at metabolites with growth inhibitory activity towards *S. aureus*, we have been able to show that *Actinoallomurus* strains can express several types of biosynthetic pathways: type I (for making polyethers and spirotetronates), type II (for aromatic polyketides) and type III (for the 6-methylsalicylate moiety of the polyether α-770) polyketide synthases; ribosomally synthesized and post-translationally modified peptides (lantibiotic); aminocoumarins; and short non-ribosomal peptide synthase derived products (diketopiperazines).

Some of the observed metabolites, e.g., the previously reported aromatic polyketide allocyclinones [17] and the polyether α-823, seem to be relatively frequent metabolites in the screened *Actinoallomurus* strains. Other metabolites represent rarer discovery events, with identical or close matches in several *Actinobacteria* genera. Indeed, the genus *Actinoallomurus* resulted from a reclassification of *Actinomadura* spp. within the family *Thermomonosporaceae*, order *Streptosporangiales* [9]. Some of the compounds described here (e.g., pradimicin and benanomicin) and the α-770- and α-823-related polyethers cationomycin and SF-2361, respectively, were previously reported from *Actinomadura* spp. Others of the described metabolites are produced by distantly related taxa, such as NAI-107 by *Microbispora* spp. (family *Streptosporangiaceae*, order *Streptosporangiales*), pyrrolosporin by a *Micromonospora* sp. (order *Micromonosporales*), in addition to the *Streptomyces*-produced coumermycin and octacyclomycin.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/2/47/s1, Figure S1: LC-MS, UV analysis of ethyl acetate extract of *Actinoallomurus* sp. ID145519, Figure S2: LC-MS, UV analysis of ethyl acetate extract of *Actinoallomurus* sp. ID145814, Figure S3: LC-MS, UV analysis of mycelium extract of *Actinoallomurus* sp. ID145640, Figure S4: Comparison between mycelium extract of *Actinoallomurus* sp. ID145640 and NAI-107 standard, Figure S5: LC-MS, UV analysis of ethyl acetate extract of *Actinoallomurus* sp. ID145219, Figure S6: LC-MS, UV analysis of ethyl acetate extract of *Actinoallomurus* sp. ID145114, Figure S7: Fragmentation patterns of the eight different congeners of α-823, Figure S8: 1H-NMR spectrum of α-823, Figure S9: 1H-COSY NMR spectrum of α-823, Figure S10: 1H-TOCSY NMR spectrum of α-823, Figure S11: 1H-13C HSQC NMR spectrum of α-823, Figure S12: 1H-13C HMBC NMR spectrum of α-823, Figure S13: Major COSY and HMBC correlations of α-823, Figure S14: HRESI-MS spectrum of α-823, Figure S15: LC-MS, UV analysis of mycelium extract of *Actinoallomurus* sp. ID145817, Figure S16: Fragmentation patterns of the two different congeners of α-770, Figure S17: 1H NMR spectrum of α-770, Figure S18: 1H-COSY NMR spectrum of α-770, Figure S19: 1H-TOCSY NMR spectrum of α-770, Figure S20: 1H-13C HSQC NMR spectrum of α-770, Figure S21: 1H-13C HMBC NMR spectrum of α-770, Figure S22: Major COSY and HMBC correlations of α-770, Figure S23: HRESI-MS spectrum of α-770, Table S1: Different α-823 congeners detected in the extract from *Actinoallomurus* sp. ID145823, Table S2: 1H and 13C NMR data for α-823 in CDCl3, Table S3: α-770 congeners detected in the extract from *Actinoallomurus* sp. ID145770, Table S4: 1H and 13C NMR data for α-770 in CDCl3.

**Author Contributions:** M.S. and S.D. conceived and designed the experiments; M.I., A.T., J.C.S.C., G.D.G., C.B. and S.I.M. performed the experiments; M.I., A.T., J.C.S.C., G.D.G., C.B., S.I.M., M.S. and S.D. analyzed the data; M.I., A.T. and S.D. wrote the paper.

**Acknowledgments:** A portion of this work was part of a PhD dissertation of J.C.S.C. at the University of Warwick, UK. This work was partially supported by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement 289285 and by grants from Regione Lombardia. We are grateful to Carlo Mazzetti, Mirko Ornaghi, Roberta Pozzi and Matteo Simone for their early contributions to this project, and to Donatella Taramelli and Silvia Parapini for the anti-plasmodium activity tests. We also thank the Unitech OMICs platform at the University of Milano for HRMS analyses.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


nov. and *Catenulisporinae* subord. nov. in the order *Actinomycetales*. *Int. J. Syst. Evol. Microbiol.* **2006**, *56*, 1747–1753. [CrossRef] [PubMed]


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Specificity of Induction of Glycopeptide Antibiotic Resistance in the Producing Actinomycetes**

#### **Elisa Binda 1,2, Pamela Cappelletti 1,2, Flavia Marinelli 1,2 and Giorgia Letizia Marcone 1,2,\***


Received: 28 February 2018; Accepted: 20 April 2018; Published: 25 April 2018

**Abstract:** Glycopeptide antibiotics are drugs of last resort for treating severe infections caused by Gram-positive pathogens. It is widely believed that glycopeptide-resistance determinants (*van* genes) are ultimately derived from the producing actinomycetes. We hereby investigated the relationship between the antimicrobial activity of vancomycin and teicoplanins and their differential ability to induce *van* gene expression in *Actinoplanes teichomyceticus*—the producer of teicoplanin—and *Nonomuraea gerenzanensis*—the producer of the teicoplanin-like A40926. As a control, we used the well-characterized resistance model *Streptomyces coelicolor*. The enzyme activities of a cytoplasmic-soluble D,D-dipeptidase and of a membrane-associated D,D-carboxypeptidase (corresponding to VanX and VanY respectively) involved in resistant cell wall remodeling were measured in the actinomycetes grown in the presence or absence of subinhibitory concentrations of vancomycin, teicoplanin, and A40926. Results indicated that actinomycetes possess diverse self-resistance mechanisms, and that each of them responds differently to glycopeptide induction. Gene swapping among teicoplanins-producing actinomycetes indicated that cross-talking is possible and provides useful information for predicting the evolution of future resistance gene combinations emerging in pathogens.

**Keywords:** actinomycetes; glycopeptide antibiotics; teicoplanin; A40926; *van* resistance genes

#### **1. Introduction**

Glycopeptide antibiotics (GPAs) are drugs of last resort for treating severe infections caused by Gram-positive pathogens such as *Staphylococcus aureus* (SA), *Enterococcus* spp., and *Clostridioides difficile* [1]. Clinically important GPAs include first-generation vancomycin and teicoplanin, which, although discovered many decades ago, continue to be extensively used in clinical practice, and second-generation telavancin, dalbavancin, and oritavancin, which were recently approved for clinical use for their increased antimicrobial potency and superior pharmacokinetic properties [2–4]. Vancomycin and teicoplanin are natural product GPAs produced by soil-dwelling filamentous actinomycetes. Their common structural motif is a core heptapeptide scaffold containing aromatic amino acids that have undergone extensive oxidative cross-linking and decoration with different moieties, such as sugar residues, chlorine atoms and, in the case of teicoplanin, a lipid chain. GPAs inhibit peptidoglycan (PG) synthesis by binding to the D-alanyl-D-alanine (D-Ala-D-Ala) terminus of the peptide stem of PG-precursor lipid II. The binding of GPAs to lipid II, through the formation of five hydrogen bonds, locks PG precursors, impeding subsequent cross-linking reactions [5,6]. Second-generation GPAs are semisynthetic derivatives of vancomycin- and teicoplanin-like molecules, where the chemical modifications were introduced outside the D-Ala-D-Ala binding pocket, mainly involving the appendage of hydrophobic aryl or acyl groups that mimic the natural lipid chain of teicoplanin. In fact, the superior antimicrobial potency of teicoplanin-like molecules is due to membrane anchoring of the hydrophobic tail, which strengthens the bond to membrane-localized lipid II [7,8]. Additionally, Dong et al. [9] demonstrated that lipidation is the key functional difference between vancomycin and teicoplanin related to their differing abilities to induce a GPA resistance response in enterococci. More recently, Kwun and Hong [10], using the harmless actinomycete *Streptomyces coelicolor* as a model resistance system, confirmed that teicoplanin-like derivatives are poor inducers of GPA resistance, and that this lack of induction accounts for the susceptibility to these molecules.

Many different GPA-resistant phenotypes have been described in enterococci and staphylococci (for an extensive review, see Binda et al., 2014 [2]). In the two most prominent manifestations of resistance (VanA or VanB phenotypes), the GPA-induced expression of *van* genes remodels the bacterial cell wall. The replacement of the dipeptide (D-Ala-D-Ala) terminus of the PG precursor with the depsipeptide D-alanyl-D-lactate (D-Ala-D-Lac) reduces the affinity of GPAs for their molecular target by 1000-fold [11]. Strains displaying this type of resistance are either resistant to both vancomycin and teicoplanin (VanA phenotype), or they are resistant only to vancomycin and susceptible to teicoplanin (VanB phenotype) [12]. Although the proteins directly involved in conferring VanA resistance (i.e., VanH, which converts pyruvate into D-lactate; VanA, a D-Ala-D-Lac ligase; and VanX, a D-Ala-D-Ala dipeptidase) are highly homologous to their counterparts in the VanB phenotype (VanHB, VanB, VanXB), the two-component regulatory systems controlling *van* gene transcription (VanS/VanR in VanA phenotype and VanSB/VanRB in VanB phenotype) are only distantly related [13]. In particular, the membrane-associated sensor domains of VanS and VanSB are unrelated in amino acid sequence, and they respond to GPAs by different mechanisms, which account for the difference in induction specificity by vancomycin and teicoplanin and their semi-synthetic derivatives [13–15]. Many efforts [12,14,16,17] have been devoted to identifying the molecular species responsible for differently inducing VanS and VanSB, but their identity and mode of action (i.e., direct binding of the GPAs to the sensor domain or its activation by cell wall intermediates that accumulate as a result of antibiotic action) are still debated.
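To give a sense of scale for the 1000-fold loss of affinity caused by the D-Ala-D-Lac substitution, the corresponding change in binding free energy can be estimated from the ratio of dissociation constants. This is a back-of-the-envelope sketch, not a value taken from the cited work: it assumes simple 1:1 binding and a temperature of 298 K, and the symbols $K_d$ and $\Delta\Delta G$ are introduced here for illustration only.

$$
\Delta\Delta G = RT \ln\frac{K_d^{\text{D-Ala-D-Lac}}}{K_d^{\text{D-Ala-D-Ala}}} = RT \ln(1000) \approx (0.592\ \text{kcal/mol}) \times 6.91 \approx 4.1\ \text{kcal/mol}
$$

Under these assumptions, remodeling the PG precursor terminus costs the antibiotic roughly 4 kcal/mol of binding energy.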

Since it is widely believed that GPA resistance mechanisms are ultimately derived from GPA-producing actinomycetes, which use them to avoid suicide during antibiotic production [2,14], in this paper we investigated the specificity of induction of GPA resistance in teicoplanin- and A40926-producing actinomycetes. Teicoplanin is produced by *Actinoplanes teichomyceticus*, and the teicoplanin-like molecule A40926 [18], which is the natural precursor of the second-generation dalbavancin, is produced by the recently classified *Nonomuraea gerenzanensis* [19,20]. As a control, we used the well-characterized resistance model *S. coelicolor*, which does not synthesize any GPA but does possess a *van* gene cluster conferring inducible resistance to vancomycin but not to teicoplanin, showing the features of the VanB phenotype [10,16,21]. The purpose of this study was to elucidate the relationship between GPA activity and the ability to induce *van* gene expression in the producing actinomycetes, which are considered the evolutionary source of resistance determinants emerging in pathogens, shedding light on the possible evolution of the *van* gene cluster.

#### **2. Results and Discussion**

Table 1 reports the minimal inhibitory concentrations (MICs) of vancomycin, teicoplanin, and A40926 against *S. coelicolor*, *A. teichomyceticus,* and *N. gerenzanensis* on solid media. MICs were measured on solid media, since the standard broth dilution method used in unicellular bacteria to determine MICs by turbidity is compromised in mycelial actinomycetes by the formation of multicellular aggregates and by the coexistence of cells in different physiological states (e.g., vegetative mycelium, aerial mycelium, and spores). This is particularly true for difficult-to-cultivate nonstreptomyces actinomycetes such as *Actinoplanes* and *Nonomuraea* strains, which form compact and different-sized clumps when growing in liquid cultures.


**Table 1.** Minimal inhibitory concentrations (MICs) of glycopeptide antibiotics (GPAs), ramoplanin and bacitracin. The values represent the average of the data from three independent experiments.

As expected, *S. coelicolor* was resistant to vancomycin and susceptible to teicoplanin and A40926. Its Δ*vanRS* mutant did not respond to vancomycin and was consequently sensitive to it [10]. For these *Streptomyces* strains, MIC values were slightly higher than those previously reported in a liquid medium [10], which is likely due to the different cultivation method used. Interestingly, *S. coelicolor* and its Δ*vanRS* mutant were equally sensitive to bacitracin and ramoplanin, which are antibiotics structurally unrelated to GPAs that inhibit the late steps of PG synthesis by a different mode of action [22,23], evidently not mediated by VanS interaction.

*A. teichomyceticus* was resistant to all the GPAs tested (Table 1), although its chromosome harbours a canonical *vanHAX* gene cluster including the *vanRS* two component-regulatory system associated with the teicoplanin biosynthetic genes [24]. In this actinomycete, *van* genes are expressed constitutively, even in the absence of the antibiotic, making the cells intrinsically resistant to GPAs [24,25].

In contrast, the A40926-producing strain *N. gerenzanensis* was resistant to vancomycin (less than *A. teichomyceticus* and *S. coelicolor*) but sensitive to teicoplanin and, to a lesser extent, to its own product. As previously described [26], *N. gerenzanensis* does not possess a *vanHAX* gene cluster, and the only known mechanism of resistance relies on the action of a VanY metallo-D,D-carboxypeptidase (named VanYn) that hydrolyses the C-terminal D-Ala residue of PG pentapeptide precursors. Interestingly, the well-characterized VanA and VanB type enterococci possess, in addition to *vanHAX* genes, an extra *vanY* gene that plays an ancillary, nonessential role in conferring glycopeptide resistance [27]. Upon vancomycin induction, enterococcal VanX cleaves any residual cytoplasmic D-Ala-D-Ala dipeptide, ensuring that the newly formed PG precursors terminate mostly in D-Ala-D-Lac, whereas VanY acts only on PG precursors that have escaped VanX hydrolysis and converts them into GPA-resistant tetrapeptides [27]. The integration of the pST30 plasmid containing the complete *vanRSHAX* gene cluster from *A. teichomyceticus* [25,28] into the *N. gerenzanensis* chromosome consistently rendered the host strain more resistant to teicoplanin and A40926 in comparison to the parental strain harbouring only the *vanY* gene (Table 1), confirming the role of *vanHAX* genes in conferring high GPA resistance.

In contrast to *S. coelicolor*, both *A. teichomyceticus* and *N. gerenzanensis* were intrinsically resistant to both bacitracin and ramoplanin. This last result merits further investigation, although it is well-known that PG structure and density, and consequently antibiotic resistance profile, might dramatically vary among different genera of actinomycetes [19,29,30].

To determine the correlation between the antimicrobial activity of the GPAs and their ability to induce the *van* resistance system in actinomycetes, we followed the enzyme activities corresponding to either VanX or VanY peptidases in cells growing in liquid culture in the presence or absence of subinhibitory concentrations of GPAs (Figures 1 and 2). The D,D-dipeptidase activity of VanX (Figure 1) and the D,D-carboxypeptidase activity of VanY (Figure 2) were measured by determining the amount of D-Ala released from the hydrolysis of the D-Ala-D-Ala dipeptide and of the Nε-acetyl-L-Lys-D-Ala-D-Ala tripeptide, respectively. D-Ala release was measured by using a D-amino acid oxidase coupled to a peroxidase [25,31]. As in the case of enterococci [27], VanX activity was detectable in the cytoplasmic fractions of *S. coelicolor*, *S. coelicolor* Δ*vanRS*, *A. teichomyceticus*, and *N. gerenzanensis* pST30 (Figure 1), and it was specific for the hydrolysis of D,D-dipeptides, being inactive on the ester D-Ala-D-Lac and on dipeptides substituted at the C or N terminus (data not shown). No VanX activity was detectable in *N. gerenzanensis* cytoplasmic fractions (see Table S1 Supplementary Material). Alternatively, VanY activity was detectable only in the membrane fractions of *N. gerenzanensis* and in its recombinant-derived *N. gerenzanensis* pST30 strain (Figure 2), consistent with the predicted VanY N-terminal structure, which contains a hydrophobic domain [31]. No VanY activity was detectable in *S. coelicolor*, *S. coelicolor* Δ*vanRS*, and *A. teichomyceticus* (see Table S1 Supplementary Material).

In *S. coelicolor* and *S. coelicolor* Δ*vanRS*, basal VanX activity (without any GPA addition) reached its maximum value within the first 24 h of growth. When the *S. coelicolor* parental strain was grown in the presence of subinhibitory concentrations of vancomycin (10 μg/mL), teicoplanin (0.75 μg/mL), and A40926 (0.75 μg/mL), only vancomycin induced an increase in VanX activity, and the level of enzyme activity doubled within the first 24 h after induction (Figure 1). The addition of ramoplanin and bacitracin (both added at 0.45 μg/mL) did not show any effect on VanX activity (data not shown), thus indicating the specificity of vancomycin induction. Moreover, the addition of subinhibitory concentrations of vancomycin (in this case 0.6 μg/mL), teicoplanin (0.75 μg/mL), A40926 (0.75 μg/mL), and ramoplanin and bacitracin (both added at 0.45 μg/mL) to *S. coelicolor* Δ*vanRS* did not induce any increase in VanX activity, confirming the role of VanS in responding to vancomycin induction (Figure 1).

**Figure 1.** VanX activity in *S. coelicolor*, *S. coelicolor* Δ*vanRS*, *A. teichomyceticus*, and *N. gerenzanensis* pST30 grown in the absence (dotted line) or in the presence (continuous line) of subinhibitory concentrations of GPAs added at the moment of inoculation (see Materials and Methods). VanX-specific activity is defined as the number of nanomoles of D-Ala released by the hydrolysis of D-Ala-D-Ala dipeptide at 37 °C per minute per milligram of protein contained in cytoplasmic fractions. Distinct plot symbols mark non-induction and induction with vancomycin, teicoplanin, or A40926. The values represent the averages from three independent experiments, with a standard deviation of <5%.

In *A. teichomyceticus*, basal VanX activity (without any GPA addition) reached the maximum value after 48 h of growth. No significant variation in VanX activity was detectable following the addition of 45 μg/mL of vancomycin, of 10 μg/mL of teicoplanin, or of 15 μg/mL of A40926 (Figure 1). As expected, the MICs of GPAs measured on *A. teichomyceticus* after 48 and 72 h of growth in the presence of teicoplanin were the same as reported in Table 1 (see Table S2 in Supplementary Material). These results are in agreement with the constitutive expression of *vanHAX* genes that, according to Beltrametti et al. [24], is due to an impaired phosphatase function of the mutated VanS, which is consequently locked in the "on" state in this organism [14]. The addition of ramoplanin and bacitracin did not exert any specific effect on the VanX activity in *A. teichomyceticus* (data not shown).

Of particular interest is the recombinant strain *N. gerenzanensis* pST30, which, in addition to *vanYn*, also harbours the heterologous *vanX* from *A. teichomyceticus*. In this strain, the basal D,D-dipeptidase activity of VanX reached its maximum value after 72 h of growth. VanX activity was induced by 10 μg/mL of vancomycin, by 1.75 μg/mL of teicoplanin, and by 10 μg/mL of A40926, albeit with different kinetics and intensity (Figure 1). When the *A. teichomyceticus vanRSHAX* genes were integrated into the host genome, VanX activity was no longer under the control of the parental *vanRS* but became inducible by GPAs, undergoing regulation by a still-unknown circuit present in *N. gerenzanensis*, which responds to subinhibitory concentrations of vancomycin (within 48 h of growth), A40926 (within 72 h), and, to a lesser extent, teicoplanin (within 48 h) (Figure 1).

To shed light on the specificity of induction of GPA resistance in *N. gerenzanensis*, we compared the VanY activity in *N. gerenzanensis* and in its recombinant strain *N. gerenzanensis* pST30 when grown in the absence or presence of GPAs (Figure 2). In both strains, basal VanY activity reached its maximum after 72 h of growth. Upon induction with 10 μg/mL of vancomycin, 0.45 μg/mL of teicoplanin, or 2 μg/mL of A40926, VanY activity in the parental strain increased within the first 24 h (Figure 2), suggesting that VanY-based resistance was induced by exposure to the three GPAs. As expected, MICs of GPAs measured on *N. gerenzanensis* after 48 and 72 h of growth in the presence of A40926 increased, as reported in Table S2 of Supplementary Material. Again, the addition of ramoplanin and bacitracin did not influence the activity profile (data not shown). Surprisingly, the membrane-associated VanY activity in *N. gerenzanensis* pST30 was significantly induced only by vancomycin, and its detection was delayed in comparison to the parental strain, reaching its maximum value after 96 h of growth (Figure 2). In the presence of A40926 and teicoplanin, the VanY activity was comparable to the basal level previously observed in the noninduced recombinant strain (Figure 2). Intriguingly, these data suggest that swapping heterologous *A. teichomyceticus* genes into *N. gerenzanensis* pST30 did alter the specificity of induction of the homologous VanY activity, although the heterologous VanX activity responded to induction by the three GPAs as the VanY activity did in the parental strain. It seems that in the presence of the heterologous VanX dipeptidase, the homologous VanY carboxypeptidase tends to play an auxiliary role, as occurs in VanA and VanB enterococci [21]. Irrespective of this finding, further investigations on the still-unknown regulatory genes controlling GPA resistance in *N. gerenzanensis* are needed to better explain these diverse responses to GPA induction. In a previous *van* gene swapping experiment conducted by Hutchings et al. (2006), introducing the *Streptomyces toyocaensis* VanRS signal transduction system into *S. coelicolor* Δ*vanRS* switched inducer specificity to that of *S. toyocaensis*, whose resistance genes are induced by A47934 but not by vancomycin [21]. The authors concluded that the inducer specificity was determined by the origin of VanS/VanR [21]. More recently, Kilian et al. reported that the VanRS-homologous two-component system VnlRS of *Amycolatopsis balhimycina* (which produces the vancomycin-type balhimycin) activated the transcription of the *vanHAX* genes in *S. coelicolor*, but not in *A. balhimycina* [32]. Surprisingly, the introduction of VnlRS from *A. balhimycina* into *S. coelicolor* induced teicoplanin resistance, most likely by activating further unknown genes required for teicoplanin resistance [32]. Although two members of a putative VanRS-like two-component signal transduction system were previously identified in the A40926 biosynthetic cluster in *N. gerenzanensis* [33], their role in controlling the VanY-based self-resistance mechanism and in responding to GPA induction remains to be unveiled and merits further investigation.

**Figure 2.** VanY activity in *N. gerenzanensis* and *N. gerenzanensis* pST30 grown in the absence (dotted line) or in the presence (continuous line) of subinhibitory concentrations of GPAs added at the moment of inoculation (see Materials and Methods). VanY-specific activity is defined as the number of nanomoles of D-Ala released by the hydrolysis of Nε-acetyl-L-Lys-D-Ala-D-Ala tripeptide at 37 °C per minute per milligram of protein contained in membrane extracts. Distinct plot symbols mark non-induction and induction with vancomycin, teicoplanin, or A40926. The values represent the averages from three independent experiments, with a standard deviation of <5%.

#### **3. Materials and Methods**

#### *3.1. Strains and Media*

*S. coelicolor* A3(2) was a gift from Mervyn Bibb, John Innes Institute, Norwich, UK [34]. The Δ*vanRS* mutant of *S. coelicolor* was generously donated by Hee-Jeon Hong, University of Cambridge, UK [21]. *Streptomyces* spp. strains were maintained as spores in 10% (*v/v*) glycerol and propagated in MS agar media [34]. Liquid media for streptomycetes were YEME [34] and BTSB (Difco). Colonies were picked from agar plates and inoculated into 300 mL Erlenmeyer flasks containing 50 mL of YEME. Flask cultures were incubated on a rotary shaker at 200 rpm and 28 °C.

*N. gerenzanensis* ATCC 39727 [19], its recombinant strain *N. gerenzanensis* pST30 [25], and *A. teichomyceticus* ATCC 31121 [35] were maintained as lyophilized master cell banks (MCBs). The mycelium from the MCBs was streaked on slants of salt medium (SM) [34] solidified with agar (15 g/L). After growth, the mycelium from a slant was homogenized in 10 mL of 0.9% (*w*/*v*) NaCl, inoculated into liquid SM, grown for 96 h at 28 °C with aeration, and stored as a working cell bank (WCB) in 1.5 mL cryovials at −80 °C. One vial was used to inoculate each 300 mL Erlenmeyer flask containing 50 mL VSP medium [26], and the flasks were incubated at 28 °C, with shaking at 200 rpm. Previous HPLC data showed that no A40926 or teicoplanin production occurred in this vegetative medium [26,35]. In the case of *N. gerenzanensis* pST30, the VSP medium was supplemented with 50 μg/mL apramycin to maintain plasmid selection. Surface cultures were grown on V0.1 agar [26]. All medium components were from Sigma-Aldrich (St. Louis, MO, USA) unless otherwise stated.

#### *3.2. Induction Experiments*

*S. coelicolor* A3(2), its Δ*vanRS* mutant, *A. teichomyceticus* ATCC 31121, *N. gerenzanensis* ATCC 39727, and *N. gerenzanensis* pST30 were grown as described above. Vancomycin, teicoplanin, A40926, bacitracin, and ramoplanin (all from Sigma-Aldrich) were dissolved in MilliQ water and sterilized by filtration using 0.22 μm filters. Antibiotics were added to the cultures at the moment of inoculation at concentrations set to half of the respective MICs (see below and Table 1), except for vancomycin (10 μg/mL) in *S. coelicolor* A3(2), as reported by Kwun et al. [10]. At these concentrations, the growth curves of noninduced and induced strains overlapped.

#### *3.3. MIC Determination*

Cryovials of WCBs were thawed at room temperature and used to inoculate VSP medium for *Nonomuraea* spp. and *Actinoplanes* spp. or YEME for *Streptomyces* spp. in the presence or the absence of GPAs. The strains were grown to exponential phase (approximately 72 h) at 28 °C with shaking. The mycelium was harvested by centrifugation, suspended in 0.9% (*w*/*v*) NaCl, and fragmented by sonication with a Vibracell Albra sonicator 400 W [26]. A suspension of sonicated hyphae (corresponding to 10<sup>7</sup> CFU) was seeded onto V0.1 (*Nonomuraea* spp. and *Actinoplanes* spp.) or MS (*Streptomyces* spp.) agar plates supplemented with increasing concentrations of the following antibiotics: 0 to 100 μg/mL vancomycin in 10-μg/mL increments; 0 to 2 μg/mL teicoplanin in 0.1-μg/mL increments or 0 to 40 μg/mL teicoplanin in 10-μg/mL increments depending on the strain; and 0 to 5 μg/mL or 5 to 50 μg/mL A40926, depending on the strain, in 0.5-μg/mL or 2.5-μg/mL increments. The plates were dried and then incubated at 28 °C. MIC values were determined as the lowest antibiotic concentrations that inhibited visible growth after 10 days of incubation.
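The plate-based MIC readout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `concentration_series` and `mic_from_plates` and the growth scores are hypothetical and do not come from the original study. The sketch builds the agar-plate concentration series, reports the lowest concentration at and above which no visible growth was scored, and derives the half-MIC dose that Section 3.2 uses for induction.

```python
from typing import Dict, List, Optional


def concentration_series(start: float, stop: float, step: float) -> List[float]:
    """Concentrations (ug/mL) from start to stop inclusive, in fixed increments."""
    n_steps = int(round((stop - start) / step))
    # Rounding avoids floating-point drift such as 0.30000000000000004.
    return [round(start + i * step, 6) for i in range(n_steps + 1)]


def mic_from_plates(visible_growth: Dict[float, bool]) -> Optional[float]:
    """Lowest concentration at and above which no visible growth was scored.

    Returns None if the highest tested concentration still allowed growth.
    """
    mic = None
    for conc in sorted(visible_growth, reverse=True):
        if visible_growth[conc]:
            break  # growth seen here, so the MIC lies above this plate
        mic = conc
    return mic


# Hypothetical readout: growth up to 70 ug/mL vancomycin, none above.
vancomycin = concentration_series(0, 100, 10)
scored = {conc: conc <= 70 for conc in vancomycin}
mic = mic_from_plates(scored)
half_mic = mic / 2  # subinhibitory dose, taken as half the MIC
print(mic, half_mic)
```

Scanning from the highest concentration downward means that an isolated growth-free plate below a plate with growth is not mistaken for the MIC.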

#### *3.4. D,D-dipeptidase and D,D-carboxypeptidase Assays*

Cells were harvested by centrifugation at 3600× *g* for 20 min at 4 °C, and then they were suspended in 2 mL of 0.9% (*w*/*v*) NaCl per gram of cells (wet weight). All of the following manipulations were conducted at 0 to 4 °C. The mycelium was fragmented by sonication with a Sonics Vibracell VCX 130. Sonication was carried out for 5 min on ice, with cycles of 30 s with an amplitude of 90% (90% of 60 Hz), with breaks of 10 s. The samples were then centrifuged at 39,000× *g* for 15 min, and the supernatants (cytoplasmic fractions) were collected. An alkaline extraction of the pellets (cell debris and membrane fractions) was carried out by adapting a protocol developed previously for extracting membrane-bound proteins [25,36]. The sedimented pellets were resuspended in ice-cold distilled water containing proteinase inhibitors (0.19 mg/mL phenylmethanesulfonyl fluoride and 0.7 μg/mL pepstatin, both purchased from Sigma-Aldrich), and then, immediately before centrifugation (28,000× *g* for 15 min at 4 °C), the pH was adjusted to 12 by adding an appropriate volume of 2.5 N NaOH. Immediately after centrifugation, the supernatants were neutralized to pH 7 by adding 0.5 M sodium acetate (pH 5.4). Enzymatic activities in the supernatant (cytoplasmic fractions) and in the resuspended insoluble fractions (membranes) were assayed as reported previously [36] by measuring the release of D-Ala from commercially available dipeptide (D-Ala-D-Ala, 10 mM; Sigma-Aldrich) and tripeptide (Nε-acetyl-L-Lys-D-Ala-D-Ala, 10 mM; Sigma-Aldrich). D,D-carboxypeptidase activity was confirmed using 10 mM UDP-MurNAc-L-Ala-D-Glu-*meso*-Dap-D-Ala-D-Ala (UK-BaCWAN, University of Warwick) as a substrate. The release of D-Ala was followed spectrophotometrically at 510 nm with a D-amino acid oxidase–peroxidase coupled reaction. Reaction mixtures contained 5 mM of the peroxidase colorimetric substrate 4-aminoantipyrine (Sigma-Aldrich), 3 U/mL D-amino acid oxidase (Sigma-Aldrich), 7.5 U/mL horseradish peroxidase (Sigma-Aldrich), and 6 mM phenol in 50 mM 1,3-bis[tris(hydroxymethyl)methylamino]propane (pH 7.4) in a final volume of 1 mL, as described in detail in [36]. To compare the D,D-dipeptidase and D,D-carboxypeptidase activities in the cytoplasmic and membrane extracts, the activity was expressed as the number of nanomoles of D-Ala released from the dipeptide or tripeptide per min per mg of protein in the extract.
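As a worked illustration of the unit used above (nmol of D-Ala released per min per mg of protein), the following Python sketch converts an absorbance change at 510 nm into a specific activity and a fold induction. The text does not state how absorbance was converted into nanomoles of D-Ala, so the calibration slope `a510_per_nmol` (from a hypothetical D-Ala standard curve), the function names, and all numerical values are assumptions made for illustration only.

```python
def specific_activity(delta_a510: float, a510_per_nmol: float,
                      minutes: float, protein_mg: float) -> float:
    """Return nmol of D-Ala released per minute per mg of protein.

    a510_per_nmol is the slope of an assumed D-Ala standard curve
    (absorbance units at 510 nm per nmol of D-Ala in the assay volume).
    """
    if a510_per_nmol <= 0 or minutes <= 0 or protein_mg <= 0:
        raise ValueError("calibration slope, time, and protein must be positive")
    nmol_d_ala = delta_a510 / a510_per_nmol
    return nmol_d_ala / (minutes * protein_mg)


def fold_induction(induced: float, basal: float) -> float:
    """Ratio of induced to basal specific activity."""
    if basal <= 0:
        raise ValueError("basal activity must be positive")
    return induced / basal


# Hypothetical readings: 10 min assay, 0.05 mg of protein per reaction.
basal = specific_activity(0.15, 0.006, 10, 0.05)
induced = specific_activity(0.30, 0.006, 10, 0.05)
print(round(basal, 1), round(induced, 1), round(fold_induction(induced, basal), 2))
```

Normalizing by both time and protein content is what allows activities from cytoplasmic and membrane extracts of different strains to be compared on the same scale.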

#### **4. Conclusions**

By comparing the specificity of induction of VanX and VanY peptidases in three different soil-dwelling actinomycetes, we confirm that *S. coelicolor* has a VanB phenotype, responding to vancomycin but not to teicoplanin or to the teicoplanin-like A40926. *S. coelicolor* does not produce any GPA, but possessing resistance genes confers a selective advantage on it, since it shares its ecological niche (soil) with many GPA-producing actinomycetes. Interestingly, a culture-independent approach using molecular probes recently estimated that the frequency of encountering a vancomycin-type producer in soil is 2.5 to 5 times higher than that of encountering a teicoplanin-like producing actinomycete [37]. In contrast, the teicoplanin producer *A. teichomyceticus* is highly and constitutively resistant to all the GPAs, including its own product. Conversely, the lower resistance in *N. gerenzanensis* is induced by either vancomycin or teicoplanins and is based on the action of a VanY carboxypeptidase, which in many enterococci and staphylococci plays a nonessential, ancillary role in the presence of the VanHAX system [27]. Evidently, although they produce structurally similar GPAs, *A. teichomyceticus* and *N. gerenzanensis* do not share the same self-resistance mechanism. Previous comparative analyses of their biosynthetic GPA clusters indicated that many A40926 biosynthetic genes are more related to vancomycin-type genes than to their teicoplanin homologs [38]. The production of A40926 and that of teicoplanin are most likely the result of convergent evolution rather than descent from a common ancestor. Although the identity and the role of the putative VanRS-like two-component signal transduction system in *N. gerenzanensis* need to be investigated, the results of gene swapping between *A. teichomyceticus* and *N. gerenzanensis* indicate that cross-talk between the two-component systems is possible, as previously demonstrated in other actinomycetes [21,32]. Determining the sequences and the protein structures of the putative VanRS two-component system in *N. gerenzanensis* may help to confirm that the inducer specificity is determined by the origin of the VanRS proteins and may provide additional evidence on the role of GPAs as VanS effector ligands. Additionally, since the genes involved in GPA resistance in pathogens have been recruited from different antibiotic-producing actinomycetes, gene swapping among different GPA-producing actinomycetes is proving useful for unveiling the specificity of regulation and predicting the evolution of future resistance gene combinations emerging in pathogens.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/2/36/s1, Table S1: Maximum VanX and VanY activities during the growth (in the absence of GPAs) of *S. coelicolor*, *S. coelicolor* Δ*vanRS*, *A. teichomyceticus*, *N. gerenzanensis* and *N. gerenzanensis* pST30. The values represent the average of the data from three independent experiments, Table S2: MICs of GPAs in noninduced and teicoplanins-induced actinomycetes. The values represent the average of the data from three independent experiments.

**Author Contributions:** E.B., F.M. and G.L.M. conceived and designed the experiments; E.B. and P.C. performed the experiments; E.B., P.C. and G.L.M. analyzed the data; E.B., F.M. and G.L.M. wrote the paper.

**Acknowledgments:** This work was supported by public grants "Fondo di Ateneo per la Ricerca" 2015, 2016, 2017 to F. Marinelli and G.L. Marcone and by Federation of European Microbiological Societies (FEMS) Research Fellowship 2015 to E. Binda.

**Conflicts of Interest:** The authors declare that they have no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Isolation, Characterization, and Antibacterial Activity of Hard-to-Culture Actinobacteria from Cave Moonmilk Deposits**

**Delphine Adam 1,†, Marta Maciejewska 1,†, Aymeric Naômé 1, Loïc Martinet 1, Wouter Coppieters 2, Latifa Karim 2, Denis Baurain <sup>3</sup> and Sébastien Rigali 1,\***


Received: 26 February 2018; Accepted: 19 March 2018; Published: 22 March 2018

**Abstract:** Cave moonmilk deposits host an abundant and diverse actinobacterial population that has a great potential for producing novel natural bioactive compounds. In our previous attempt to isolate culturable moonmilk-dwelling Actinobacteria, only *Streptomyces* species were recovered, whereas a metagenetic study of the same deposits revealed a complex actinobacterial community including 46 actinobacterial genera in addition to streptomycetes. In this work, we applied the rehydration-centrifugation method to lessen the occurrence of filamentous species and tested a series of strategies to achieve the isolation of hard-to-culture and rare Actinobacteria from the moonmilk deposits of the cave "Grotte des Collemboles". From the "tips and tricks" that were tested, separate autoclaving of the components of the International Streptomyces Project (ISP) medium number 5 (ISP5), prolonged incubation time, and dilution of the moonmilk suspension were found to most effectively increase the number of colony-forming units. Taxonomic analyses of the 40 isolates revealed new representatives of the *Agromyces*, *Amycolatopsis*, *Kocuria*, *Micrococcus*, *Micromonospora*, *Nocardia*, and *Rhodococcus* species, as well as additional new streptomycetes. The applied methodologies allowed the isolation of strains associated with both the least and most abundant moonmilk-dwelling actinobacterial operational taxonomic units. Finally, bioactivity screenings revealed that some isolates displayed high antibacterial activities, and genome mining uncovered a strong potential for the production of natural compounds.

**Keywords:** Cave microbiology; secondary metabolite; antibiotics; rare Actinobacteria; Streptomyces; Amycolatopsis; unculturability; siderophore

#### **1. Introduction**

Bioprospecting for natural compounds from microorganisms dwelling in poorly explored and extreme environments has gained renewed interest [1,2], boosted by the need to fight resistance to compounds currently used as antimicrobials, herbicides, antivirals, and anticancer agents [3]. Caves, despite being highly oligotrophic environmental niches, support rich and diverse microbial life, a phenomenon called "the Paradox of the Plankton", which suggests that, in spite of limited nutrient resources, an unexpectedly wide range of species coexist [4]. The extremely starved nature of subsurface habitats presumably stimulates unique strategies of indigenous microbiomes, among which the fine-tuning of the secondary metabolism (in terms of quantity and diversity of molecules) might be one of the key features enabling life in such challenging environments [5,6]. The observation that Actinobacteria, which are the most prolific producers of specialized (secondary) metabolites [7], are also abundant in limestone caves [8–10] is not likely a coincidence. Therefore, Actinobacteria isolated from oligotrophic environments, particularly the rare representatives of the phylum, are expected to possess a unique metabolome that could potentially be an important source of novel drugs.

In our first attempt, aimed at isolating antibiotic-producing Actinobacteria from cave carbonate deposits called moonmilk originating from the cave "Grotte des Collemboles", we successfully isolated representatives of a single genus, *Streptomyces* [11,12]. However, high-throughput sequencing (HTS) of DNA extracted from the same moonmilk deposits demonstrated that we failed to isolate representatives of at least 46 additional actinobacterial genera [13]. Such an outcome might be related not only to the fact that *Streptomyces* are adapted to use a wide range of nutrients and therefore grow effectively on many selective media, but also to the fact that their growth is faster in comparison to that of other actinobacterial genera. For this reason, non-streptomycetes, which are more challenging to isolate in pure cultures, are regarded as rare Actinobacteria. Nonetheless, the 78 culturable *Streptomyces* strains isolated in our previous screening [11] constitute a minor fraction of those dwelling in the studied moonmilk deposits, because we only recovered relatives of 5 out of the 19 operational taxonomic units (OTUs) affiliated with the *Streptomyces* genus [13], suggesting that this environment hosts hard-to-culture *Streptomyces* species that are potentially highly diverged from their soil-dwelling counterparts.

Bacterial unculturability is one of the major problems in microbiology, as many species are present in the environment in a so-called viable but non-cultivable state (VBNC) [14]. The phenomenon known as the "great plate count anomaly" clearly depicts this issue—no matter what kind of approach will be applied to the isolation of microorganisms, it is possible to cultivate only about 1% of what is present in a sample [15]. As demonstrated in the recently published new "Tree of Life", the large majority of "known" microorganisms have been identified exclusively via genome-based approaches and are not represented by any culturable strain [16]. For this reason, a series of innovations in cultivation techniques are being introduced to increase the recovery yield of microorganisms from their natural habitats, including: (i) dilution of the sample to minimize the competition of fast-growing organisms and reduce the effect of growth inhibitors [17]; (ii) addition of signaling compounds/growth factors [18,19]; (iii) extension of the incubation time for the recovery of slow-growing bacteria [20]; (iv) in situ cultivation [21]; (v) use of polymers as growth substrates to reduce osmotic shock [20]; and (vi) limitation of the production of reactive oxygen species, which prevent bacterial growth [22,23]. In addition, various pretreatment methods are being applied to selective isolation, which targets (to favor or instead to exclude) a bacterial group of interest [24]. In the case of Actinobacteria, a series of chemical and physical treatments are used [20], which take advantage of the high resistance of actinobacterial spores to many factors such as ultraviolet (UV), ultrasonic waves, desiccation, and many chemicals [25–27].

In light of the above findings, and taking into account the fact that moonmilk deposits host a broad diversity of Actinobacteria (243 OTUs) [13], we have made an attempt to improve their recovery, particularly of rare taxa, in pure cultures by combining several approaches. Here, we present the "tips and tricks" used to isolate rare moonmilk-dwelling Actinobacteria, evaluate their potential at producing antibacterial compounds, and describe their genetic capacity to produce other specialized metabolites.

#### **2. Results and Discussion**

#### *2.1. Assessment of Various Strategies for Isolating Moonmilk-Dwelling Rare Actinobacteria*

Different strategies were implemented in order to increase the odds of isolating hard-to-cultivate Actinobacteria from the moonmilk samples of the cave "Grotte des Collemboles". For this purpose, we applied the rehydration-centrifugation (RC) method to the preparation of the moonmilk suspension. Applying the RC methodology has been shown previously to lessen the occurrence of filamentous and long spore-chain forming species (such as most *Streptomyces*) and non-motile bacteria, thereby enriching the isolation of rare and zoosporic Actinobacteria [27]. The moonmilk suspension was inoculated in the two media that showed the most contrasting isolation efficiency in our previous screening [12]: (i) the International Streptomyces Project (ISP) medium number 5 (ISP5), in which no colonies could be observed; and (ii) the starch nitrate (SN) medium, which gave the highest number of colony forming units (CFUs). As the existence of unculturable bacteria in certain environments has been proposed to depend on siderophore-producing neighboring microorganisms [18,28], we facilitated iron acquisition in these two media by (i) increasing the concentration of iron (20 μM FeCl3); (ii) including 1 μM of the siderophore desferrioxamine B (DFB), which has been shown to be essential for growth in certain media [28]; and (iii) adding both FeCl3 and DFB. Moreover, in order to reduce the formation of highly toxic reactive oxygen species (ROS) that might be generated during autoclaving through the interaction of components of the media, particularly phosphate and agar [22,23], ISP5 and SN were prepared in two different manners: (i) either with all of the components autoclaved together or (ii) with agar and phosphate autoclaved separately from all of the other components. The possible toxic effect of ROS generated during autoclaving was further tested by adding a solution of catalase to the center of the plates, as described previously [23]. All media were additionally supplemented with selective agents (antibiotic and antifungal compounds), inoculated with serially diluted moonmilk suspension (up to 10<sup>−4</sup>), and incubated for up to two months.

The combination of strategies described above increased the number of isolated bacteria by about 15-fold in comparison to the colony counts obtained in our previous screening (~5 × 10<sup>5</sup> CFUs per gram of inoculum in this study (Figure 1), in contrast to ~3 × 10<sup>4</sup> CFUs in [12]). Dilution of the inoculum (from 100- to 1000-fold) increased CFUs (compare bar plots in blue and those in red in Figure 1). Considering that bacteria living in extremely oligotrophic environments are thought to have developed mutualistic interactions, this result is rather unexpected. Indeed, dilution increases the distance between colonies, which limits the exchange of growth factors between 'helpers' and unculturable organisms, and should eventually yield a lower number of colonies. The fact that diluting the inoculum instead resulted in higher CFUs suggests either that one or more of the strains inoculated at the lower dilution secreted highly active antibacterial agents or that the collected moonmilk deposits contain growth inhibitors that were attenuated by the sample dilution.

Furthermore, the putative effect of ROS on bacterial cultivability was clearly recorded on the ISP5 medium (Figures 1 and 2). Sterilization of phosphate and agar separate from the other components of the ISP5 medium resulted in abundant growth, whereas no colonies could be observed when the same medium was prepared according to the standard procedure (Figures 1 and 2). The likely inhibition of bacterial growth by ROS on the traditionally prepared ISP5 medium was supported by the observation that adding a solution containing a catalase (decomposing hydrogen peroxide to water and oxygen) restored colony formation (Figure 2b). In contrast, supplementation of the media with iron (at a final concentration of 20 μM), DFB (at a final concentration of 1 μM), or both did not significantly improve the bacterial count (Figure 1). However, regardless of the lack of improved CFUs, the growing/isolated bacteria might still be different from those grown in the same media lacking the additional iron/siderophore supply. Finally, increased incubation time (from 16 to 44 days) increased CFUs in only a limited number of conditions (see for instance Fe + DFB in starch nitrate medium, Figure 1).

**Figure 1.** Total number of colony forming units (CFUs) according to the different media and methods used. The different media were inoculated with a diluted moonmilk suspension (10<sup>−2</sup> in blue and 10<sup>−3</sup> in red) prepared according to the rehydration-centrifugation method and incubated for up to two months. Data obtained after 16 days (16 d) and 44 days (44 d) of incubation are displayed. The "separated" label indicates that the autoclaving of agar and phosphate was performed separately from the other components of the media. Abbreviations: Fe, 20 μM of FeCl3; DFB, 1 μM of desferrioxamine B.

**Figure 2.** Effect of reactive oxygen species on bacterial growth in International Streptomyces Project (ISP) medium number 5 (ISP5). Growth in (**a**) ISP5 medium prepared according to the standard protocol, in (**b**) the same medium but with catalase injected on the paper disc in the center of the plate, and in (**c**) the ISP5 medium prepared with autoclaving of agar and phosphate separate from the other media components. (**d**) CFUs on (**a**), (**b**) and (**c**).

#### *2.2. Characterization of Culturable Moonmilk-Derived Actinobacterial Isolates*

One hundred colonies were initially selected for further characterization, but only 42 survived after two rounds of inoculation/cultivation to obtain pure isolates (Table 1). This significant loss (58%) was likely caused by the lack of growth factors emanating from neighboring colonies in pure cultures. Hence, the strains that passed the purification steps might represent bacteria able to feed only on nutrients present in the synthetic medium, whereas species most adapted to cooperative growth and life in oligotrophic and inorganic environments would constitute a significant part of the lost isolates. The remaining 42 isolates were cultured in liquid LB, ISP1, or ISP2 media for DNA isolation and subsequent sequencing of the 16S (SSU) rRNA gene, either at full length (extracted from the genome sequence) or nearly full length (through PCR amplification).

Table 1 lists the closest hits for members of our collection deduced from BLAST searches, and Figure 3 shows their phenotype. Two of the isolates (MMun130 and MMun142), closely related to *Paenibacillus* sp. DSL09-3 and *Paracoccus* sp. clone 54, respectively (Table 1), are not Actinobacteria and were not investigated further in this study. The remaining 40 isolates represented eight actinobacterial genera, namely *Agromyces*, *Amycolatopsis*, *Kocuria*, *Micrococcus*, *Micromonospora*, *Nocardia*, *Streptomyces*, and *Rhodococcus* (Table 1 and Figure 3). Finding members of the *Rhodococcus* genus, which was the most abundant in terms of the absolute number of sequences (17% of actinobacterial 16S rRNA sequences), and of the *Streptomyces* genus, which displayed the largest diversity (19 OTUs) based on DNA extracted from the investigated moonmilk deposits [13], was rather expected. However, finding representatives of genera *Nocardia* (0.5%), *Agromyces* (0.2%), *Kocuria* (0.1%), and *Amycolatopsis* (0.03%), which correspond to less abundantly represented taxa, demonstrated the efficiency of the tested cultivation approaches to also recover rare moonmilk-dwelling Actinobacteria. Additionally, our work reports the first isolation of strains belonging to the *Micrococcus*, *Kocuria* (Micrococcaceae), and *Micromonospora* (Micromonosporaceae) genera from moonmilk deposits. While other members of Micrococcaceae (*Arthrobacter*) had been previously reported in moonmilk [10], representatives of the Micromonosporaceae family (MMun172) had not been identified so far in this type of speleothems.

The overrepresentation of *Streptomyces* isolates (27 out of 40 strains) despite using the RC protocol might be explained by specific morphological features of the isolated moonmilk-dwelling species. While *Streptomyces* filamentous pellets would be excluded from the final suspension by the centrifugation steps in the RC protocol, the inoculum could still contain single streptomycetes spores. *Streptomyces lunaelactis* [29], the most frequently observed *Streptomyces* species in the studied deposits, was found to form collapsed/fragmented packs of spores instead of long aerial spore chains. This morphological feature would explain why we isolated two strains of the *S. lunaelactis* species (MMun143 and MMun152) using the RC method and could be a common characteristic of the other *Streptomyces* isolates obtained in this study. In addition to these two novel *S. lunaelactis* strains, another 14 out of 27 *Streptomyces* strains obtained in this study have an identical 16S rRNA sequence compared to strains previously isolated from the same moonmilk deposits (MM18, MM44, MM63, MM82, MM110, MM126 [12], Figure 4 and Table 1). These 14 "MMun" strains were therefore assigned to phylotypes defined in our first screening (phylotypes marked with an asterisk in Table 1) [12], and the other 11 *Streptomyces* strains were distributed into 9 novel phylotypes (from XXXII to XL) based on their 16S rRNA sequence (Figure 4).

#### *Antibiotics* **2018**, *7*, 28

**Figure 3.** Phenotypes of actinobacterial strains isolated in this study. For each isolate, the front and back of the Petri dish-grown bacteria are presented together with the phenotype of a single colony. The phylotype number for *Streptomyces* isolates is indicated in parentheses.

**Figure 4.** 16S rRNA-based phylogenetic tree of moonmilk-dwelling *Streptomyces* strains. Strains isolated using the rehydration-centrifugation (RC) method (MMun) are shown in grey, and strains isolated in our previous study (MM) are in black. Phylotype affiliations are indicated in parentheses. The alignment of nearly complete 16S rRNA sequences had 1538 unambiguously aligned positions for 92 strains, but only the 1342 positions without missing nucleotides were used to infer the tree. The evolutionary model was GTR + Γ<sup>4</sup> and bootstrap values are based on 100 pseudoreplicates (bootstrap values <50% are not displayed). *Saccharopolyspora erythraea* was used as outgroup. The scale bar represents 0.02 substitutions per site.

An important question is to which of the 243 actinobacterial OTUs (Table S1) identified by HTS [13] in the studied moonmilk deposits the 40 new actinobacterial isolates are associated. In other words, did the isolation protocols used in this study enable the recovery of abundant and widespread strains, of Actinobacteria with much more limited abundance and distribution in these speleothems, or both? To answer this question, we compared the 16S rRNA sequences (trimmed to the V6–V7 variable region) of the new isolates with the 16S rRNA amplicons (V6–V7 region) of the 243 actinobacterial OTUs obtained in our metagenetic study on DNA extracted from the same moonmilk deposits [13]. The phylogenetic tree positioning the 40 MMun strains (Table 1) with the 243 OTUs [13] is displayed in Supplementary Figures S1 and S2 and the results are summarized in Figure 5.

**Figure 5.** Membership of MMun strains isolated in this study to the actinobacterial operational taxonomic units (OTUs) identified by the high-throughput sequencing (HTS) metagenetic study. OTUs in blue boxes are those for which we isolated a first representative strain in this study. Note that for MMun149 and MMun172, the search of their associated OTU was performed at the family level (Micrococcaceae and Micromonosporaceae) and not at the genus level, as the HTS study [13] did not allow us to unambiguously assign the sequences of the 16S rRNA V6–V7 amplicons to the *Micrococcus* and *Micromonospora* genera. Symbol: \*, due to the phylogenetic divergence of its V6–V7 region and based on the full sequence of its 16S rRNA gene (extracted from the genome sequence), MMun149 was manually affiliated to OTU23 even though its "closest" OTU in the tree was OTU141 (Figure S2).

Figure 5 reveals that our study enabled the isolation of a strain (MMun145) associated with the most abundant OTU of the studied carbonate deposits, OTU1, which was assigned to the *Rhodococcus* genus and accounted for 16.45% (75,143 sequences) of the total number of 16S rRNA amplicon sequences obtained by our HTS analysis [13]. BLAST analysis of the full-length 16S rRNA sequence of MMun145 found *Rhodococcus* sp. strain MAK1 [30] as the closest hit, a strain isolated from diesel oil-contaminated soil and therefore possibly adapted to life in extreme environments. At the opposite extreme, MMun171 (*Amycolatopsis*) was associated with one of the least abundant OTUs, OTU282, which corresponded to only five 16S rRNA amplicon sequences (0.001%) in our metagenetic study [13]. This extremely rare actinobacterium, only detected in collection point number 4 of the Grotte des Collemboles [13], was able to grow in the ISP5 medium with separate autoclaving of the phosphate and agar components, as well as in media supplemented with iron and DFB (Table 1).
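The relative abundances quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and assumes that both percentages share the same denominator (the total number of 16S rRNA amplicon sequences); the variable names are ours and do not come from the original analysis.

```python
# Back-of-the-envelope check of the OTU abundances quoted in the text.
# Assumption: OTU1 and OTU282 percentages use the same total read count.
otu1_reads = 75_143        # sequences assigned to OTU1 (from the text)
otu1_fraction = 0.1645     # 16.45% of all amplicon sequences (from the text)

# Total read count implied by the OTU1 figures.
total_reads = otu1_reads / otu1_fraction

otu282_reads = 5           # sequences assigned to OTU282 (from the text)
otu282_percent = 100 * otu282_reads / total_reads

print(f"implied total reads: {total_reads:,.0f}")
print(f"OTU282 abundance: {otu282_percent:.4f}%")
```

Under this assumption the implied total is roughly 4.6 × 10<sup>5</sup> sequences, and five sequences correspond to about 0.001% of the dataset, which agrees with the value given for OTU282.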

The other isolated MMun strains belong to OTUs that stand between these two extreme ends, but generally, they belong to some of the most abundant OTUs of their respective genus/family (Figure 5). It must be noted that eight of the newly identified *Streptomyces* strains are associated with OTUs for which culturable representatives had not been isolated in our previous screening [12] (Figure 5). However, no culturable *Streptomyces* strains have been isolated from twelve of the 19 *Streptomyces* OTUs, which highlights how many moonmilk-dwelling species remain to be discovered.

#### *2.3. Evaluation of the Antibacterial Activity of the New Moonmilk Isolates*

The potential to produce compounds with antimicrobial activity against Gram-positive and Gram-negative bacteria was evaluated for each strain via cross-streak assays (see the heatmap presented in Figure 6). Strains were streaked on ISP2, S-ISP5, ISP7, and TSA agar plates, and incubated for 14 days at 28 °C. Two types of antibacterial activities were recorded: (i) those that fully inhibited the growth (GI, growth inhibition) of the tested reference strains (see Figure 7 MMun156 against *B. subtilis* as a GI example) and (ii) those that allowed only partial growth (IG, impaired growth, see Figure 7 MMun160 against *E. coli* as an IG example). On several occasions, both GI and IG were observed (see Figure 7 MMun171 against *B. subtilis* as an example).

As expected, much stronger overall inhibitory activities (GI and IG) were recorded against Gram-positive bacteria (~87% of the MMun strains) compared to those against Gram-negative bacteria (~59% of the MMun strains) (Figure 8). Two isolates were particularly active against the tested Gram-negative bacteria: MMun160 (*Kocuria*) and MMun171 (*Amycolatopsis*) (Figures 6 and 7). However, these two strains exhibited very different activity patterns depending on the culture conditions: while MMun171 displayed strong inhibitory activity under all culture conditions, MMun160 only secreted its antibiotic(s) when cultured in the ISP2 medium (Figure 6).

Regarding the antibacterial activities against the Gram-positive strains, their growth was affected by more actinobacterial isolates and was overall more strongly inhibited (Figures 6 and 7). Three moonmilk *Streptomyces* strains (MMun141, MMun146, and MMun156) were found to be particularly active, as they exhibited strong inhibitory activities against all tested Gram-positive species and under each culture condition tested (Figure 6 and examples shown in Figure 7). Interestingly, the same isolates showed virtually no inhibitory effect against Gram-negative bacteria. In addition, the moonmilk isolates most active against Gram-negative pathogens (MMun160 and MMun171) also displayed quite strong inhibitory profiles against Gram-positive strains.

In contrast, a few strains did not display any (or extremely weak) antibacterial activity, such as isolates MMun145 and MMun155 that belong to the *Rhodococcus* genus, the two *Agromyces* strains (MMun159 and MMun167), and the *Micromonospora* isolate MMun172 (Figures 6–8). Similarly, the five *Nocardia* isolates did not secrete compounds with antibacterial activity under most conditions, except isolate MMun136, which displayed weak antibacterial activity against *M. luteus*. Therefore, while we managed to uncover conditions to isolate and grow these rare Actinobacteria, the conditions that enable production of their putative antibiotics remain undefined.

**Figure 6.** Heatmap illustrating antibacterial activities of the moonmilk Actinobacteria isolated in this study. The size of the inhibition zone (in cm) is given for each strain.

**Table 1.** Moonmilk strains isolated using the RC method and according to selective media.




OTU identified in our metagenetic study [13] to which the MMun strains are affiliated (according to the topology and patristic distances of the tree in Supplementary Figures S1 and S2). Abbreviations: S- before SN and ISP5, indicates that media were prepared by autoclaving of agar and phosphate separate from other media components; DFB, desferrioxamine B; na, not Actinobacteria; \*, phylotypes described in our previous study [12]; lost, cultures from mycelium stocks of MMun153 are no longer possible, and this strain is now considered lost; and ♣, MMun isolates for which the genome has been sequenced.

**Figure 7.** Selected representative examples of the cross-streak results. MMun strain numbers, media, and tested cultures are mentioned.

**Figure 8.** Number of strains with antibacterial activity, broken down by actinobacterial genus and test conditions.

Interestingly, unusual growth/susceptibility of *Klebsiella pneumoniae* was frequently observed in the ISP7 medium, as reported in our first antibiotic activity screening assays [11]. Examples of different atypical growth responses are presented in Figure 9. In the depicted cases, we observed a non-linear response to the diffusion gradient of the secreted antibiotics, which has been described as the 'Eagle effect' (or paradoxical zone phenomenon) [31]. *K. pneumoniae* is indeed often able to grow near the actinobacterial strain, while its susceptibility to the secreted compounds is increased at a greater distance. Occasionally, the 'Eagle effect' response was also observed with *Citrobacter freundii*.

Finally, draft genome assemblies of 20 MMun isolates were mined for biosynthetic gene clusters (BGCs) in order to estimate their potential at producing natural compounds. More precisely, we examined the presence of BGCs containing polyketide synthases (PKSs, including PKS-I, PKS-II, and PKS-III) and NRPS domains, as described previously [12]. The evaluation of the genetic predisposition to produce natural compounds of isolates for which we obtained genome assemblies is summarized in Table 2.

All investigated isolates encode in their genomes at least four and up to 34 NRPS domains. All strains (except MMun154 and MMun145) also possess multiple PKS-I (up to 21, MMun146) and PKS-II (up to 6, MMun131, 137, and 141) domains. PKS-IIIs were expectedly found less frequently (maximum 2), albeit 70% contained at least one BGC including a PKS-III domain. Interestingly, the highest number of BGCs was found in all isolated *Nocardia* strains, with MMun133 harboring the largest amount of biosynthetic cluster domains (45 in total). The fact that the strains that harbor the highest number of BGCs (Table 2) did not display any—or extremely weak or only Eagle-effect—antibacterial activity (Figure 6) suggests that if we indeed found conditions to isolate and grow them in synthetic media, we still have to discover the culture conditions that will trigger their secondary metabolism.


**Table 2.** Number of biosynthetic gene clusters (BGCs) based on genome mining of 22 MMun strains.

#### **3. Materials and Methods**

#### *3.1. Preparation of Media and Moonmilk Suspension for Isolation of Rare Actinobacteria*

Starch nitrate (SN) [32] and International Streptomyces Project (ISP) medium number 5 (ISP5) [33] were prepared in two different manners: either according to the standard protocols with all of the components autoclaved together (SN/ISP5) or with the agar and phosphate solution autoclaved separately and mixed before media pouring (S-SN/S-ISP5) in order to avoid hydrogen peroxide (H2O2) formation [23]. Media were supplemented with nalidixic acid (75 μg/mL) and nystatin (50 μg/mL) to suppress the growth of Gram-negative bacteria and fungi, respectively. Each medium was additionally supplemented with factors known or predicted to increase microbial growth, including: (i) 20 μM FeCl3; (ii) 1 μM of desferrioxamine B (Sigma-Aldrich, St. Louis, MO, USA); and (iii) 10 μL of 10 mg/mL bovine liver catalase (Sigma-Aldrich, St. Louis, MO, USA), applied to a Whatman 3 MM paper disc (Sigma-Aldrich, St. Louis, MO, USA) in the center of the Petri dish [23].

Isolation of Actinobacteria was performed by preparing a moonmilk suspension via the rehydration-centrifugation method according to Hop et al. (2011) [25]. Briefly, 0.45 g of freeze-dried moonmilk (0.15 g from each collection site: COL1, COL3, and COL4) was suspended in 50 mL of 0.01 M phosphate buffer (pH 7.0), vortexed for 5 min, and incubated for 90 min at 30 °C. Then, 8 mL of the upper part of the supernatant was transferred to a 50 mL conical centrifuge tube and centrifuged at room temperature for 10 min at 3000 rpm. After a 30-min incubation at 30 °C, 100 μL of serial dilutions (up to 10<sup>−4</sup>) were inoculated on the different solid media. All plates were incubated at 15 °C for about 2 months. The number of CFUs was evaluated after 16, 44, and 69 days of incubation. Among the 100 colonies selected from the culturable population, 42 isolates (named MMun) were successfully subcultured twice to obtain pure strains and stored both as 25% glycerol mycelium stock at −20 °C and in solid media at 4 °C.
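For readers who wish to convert plate counts into CFUs per gram, the following minimal sketch illustrates one conventional back-calculation. It is not the procedure used to generate Figure 1, whose exact normalization is not detailed in the text: it assumes that the suspension still holds 0.45 g of moonmilk per 50 mL at plating (ignoring any loss during centrifugation), and the colony count of 45 is purely hypothetical. The function names are ours.

```python
def cfu_per_gram(colonies, dilution, plated_ml, grams_per_ml):
    """Back-calculate CFUs per gram of starting material from one plate.

    colonies     -- colonies counted on the plate
    dilution     -- dilution of the plated suspension (e.g. 1e-3)
    plated_ml    -- volume spread on the plate, in mL
    grams_per_ml -- grams of material per mL of undiluted suspension
    """
    if dilution <= 0 or plated_ml <= 0 or grams_per_ml <= 0:
        raise ValueError("dilution, volume and concentration must be positive")
    cfu_per_ml = colonies / (dilution * plated_ml)  # CFUs per mL, undiluted
    return cfu_per_ml / grams_per_ml


def fold_change(new, old):
    """Ratio between two CFU estimates."""
    return new / old


# Values from the protocol: 0.45 g in 50 mL buffer, 100 uL plated.
GRAMS_PER_ML = 0.45 / 50
PLATED_ML = 0.1

# Hypothetical plate: 45 colonies at the 10^-3 dilution.
estimate = cfu_per_gram(45, 1e-3, PLATED_ML, GRAMS_PER_ML)
print(f"{estimate:.2e} CFU per gram (illustrative count only)")

# Ratio of the two rounded CFU values reported in Section 2.1.
print(f"fold change: {fold_change(5e5, 3e4):.1f}")
```

The ratio of the two rounded values reported in Section 2.1 (~5 × 10<sup>5</sup> versus ~3 × 10<sup>4</sup> CFUs) comes out close to 17, which is of the same order as the approximately 15-fold increase stated there.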

#### *3.2. Isolation and Sequencing of Genomic DNA*

Genomic DNA from MMun strains was extracted with the GenElute Bacterial Genomic DNA Kit (Sigma-Aldrich, St. Louis, MO, USA) according to the manufacturer's instructions from the mycelium grown in liquid LB, ISP1, or ISP2 media at 28 °C. Genomic libraries of moonmilk isolates were constructed using the Nextera XT kit (Illumina, Inc., San Diego, CA, USA). Sequencing was carried out on the Illumina MiSeq platform with 2 × 300-bp read configuration. Complete genomes were assembled de novo from raw sequence data with SPAdes v3.6.2 [34] using the "careful" option, and the quality of the assemblies was subsequently assessed with QUAST v2.3 [35].

#### *3.3. Phylogenetic Analyses of Actinobacterial Strains*

The phylogenetic analysis of the moonmilk-derived strains shown in Figure 4 was based on either the full-length or the nearly full-length sequences of their 16S rRNA gene, which were either (i) recovered from the sequenced genomes (Table 2) or (ii) amplified using bacterial universal primers 8F and 1541R [36], as reported previously [27]. A multiple sequence alignment (1538 sites) was built with MUSCLE v3.8.31 [37] (default parameters) and filtered using the script ali2phylip.pl (from the Bio-MUST-Core software package; D. Baurain; https://metacpan.org/release/Bio-MUST-Core), so as to only keep sites with no missing character states. The filtered alignment (92 sequences × 1342 sites) was then submitted to phylogenetic inference using the rapid bootstrap analysis of RAxML v8.1.17 ([38]; 100 pseudoreplicates) under the model GTR + Γ4.
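The site-filtering step (keeping only alignment columns with no missing character states) can be illustrated with a short sketch. This is not the ali2phylip.pl script itself; it assumes that gaps (`-`) and `?` are the only symbols treated as missing, and the toy sequence names are hypothetical.

```python
def keep_complete_sites(alignment, missing="-?"):
    """Return the alignment restricted to columns without missing states.

    alignment -- dict mapping sequence name to aligned sequence (equal lengths)
    missing   -- characters counted as missing (an assumption of this sketch)
    """
    if not alignment:
        return {}
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    # Indices of columns in which every sequence has a defined character.
    kept = [i for i in range(length)
            if all(s[i] not in missing for s in seqs)]
    return {name: "".join(seq[i] for i in kept)
            for name, seq in alignment.items()}


# Toy alignment: columns 3 and 4 each contain one missing state.
toy = {"strain_a": "ACG-TA", "strain_b": "AC?TTA", "strain_c": "ACGTTA"}
filtered = keep_complete_sites(toy)
print(filtered)
```

Applied to the real data, a filter of this kind would reduce the 1538 aligned positions to the 1342 complete positions used for tree inference.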

The V6–V7 regions of the 40 MMun strains were extracted from their 16S rRNA full-length sequences and combined with the corresponding regions of the 243 metagenetic OTUs to yield the tree shown in Supplementary Figure S1. Briefly, the final alignment was carefully crafted using a combination of automatic and manual approaches, including MAFFT v7.273 [39], Clustal Omega v1.1.0 [40], and the alignment editor (ed) of the MUST software package [41]. OTU403 was discarded because of its identity to OTU1, except for an aberrant 5′-terminal stretch. The optimized alignment (296 sites) was then filtered using ali2phylip.pl to remove sites present in <5% of the sequences, resulting in a final dataset of 282 sequences × 261 sites (0.12% of missing character states). Phylogenetic inference was carried out as for the full-length 16S rRNA genes above. The tree was first formatted in FigTree v1.4.21 (http://tree.bio.ed.ac.uk/software/figtree/) and then further arranged using Inkscape v0.91 (https://inkscape.org/). Patristic distances were then obtained using SeaView v4.5.4 [42] and the OTU closest to each MMun V6–V7 region was automatically extracted using a custom Perl script. Both the tree topology and the patristic distances were considered to affiliate MMun strains to metagenetic OTUs (Supplementary Figures S1 and S2).
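The extraction of the closest OTU for each isolate can be sketched as a nearest-neighbor lookup over pairwise patristic distances. The code below is a hypothetical Python stand-in for the custom Perl script, which is not reproduced here; the names and distance values are invented for illustration only.

```python
def closest_otu(distances, isolates, otus):
    """Map each isolate to its nearest OTU and the corresponding distance.

    distances -- dict mapping frozenset({name_a, name_b}) to patristic distance
    isolates  -- iterable of isolate names
    otus      -- list of OTU names
    """
    result = {}
    for isolate in isolates:
        best = min(otus, key=lambda otu: distances[frozenset((isolate, otu))])
        result[isolate] = (best, distances[frozenset((isolate, best))])
    return result


# Invented distances (substitutions per site) between two isolates and two OTUs.
toy = {
    frozenset(("isolate_A", "OTU_x")): 0.010,
    frozenset(("isolate_A", "OTU_y")): 0.042,
    frozenset(("isolate_B", "OTU_x")): 0.075,
    frozenset(("isolate_B", "OTU_y")): 0.003,
}
assignments = closest_otu(toy, ["isolate_A", "isolate_B"], ["OTU_x", "OTU_y"])
print(assignments)
```

A nearest-distance rule alone is not sufficient in every case: as noted in the legend of Figure 5, MMun149 was manually affiliated to OTU23 although its closest OTU in the tree was OTU141, which is why the tree topology was also taken into account.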

The accession numbers of the 16S rRNA deposited in the GenBank database are listed in the Supplementary Table S2.

#### *3.4. Antimicrobial Activities of Rare Moonmilk Actinobacterial Strains*

Antibacterial activities of "MMun" strains of our collection were evaluated via cross-streak assays as described previously [11]. Each strain was inoculated from a mycelium stock as a single line in the center of the square plate and incubated for 14 days at 28 °C before being cross-streaked with bacterial reference strains: *Escherichia coli* (ATCC 25922), *Pseudomonas aeruginosa* (ATCC 27853), *Citrobacter freundii* (ATCC 43864), *Klebsiella pneumoniae* (ATCC 13883), *Bacillus subtilis* (ATCC 19659), *Staphylococcus aureus* (ATCC 25923), and *Micrococcus luteus* (ATCC 9341).

#### *3.5. Genome Mining for Gene Clusters Involved in Secondary Metabolite Production*

Genomes of 20 MMun strains were screened for NRPS and PKS genes, as described previously [11]. Only contigs ≥10 kb were considered and genes were identified as NRPS and PKS-I only when they displayed adenylation and acyltransferase domains, respectively.
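The contig length cutoff can be illustrated as follows. This minimal sketch only covers the size filter (contigs of at least 10 kb) applied before domain screening; it is not the pipeline of [11], the domain detection itself is not shown, and the FASTA records are toy examples.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of name -> sequence."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def long_contigs(records, min_len=10_000):
    """Keep only contigs whose length is at least min_len bases."""
    return {n: s for n, s in records.items() if len(s) >= min_len}


# Toy assembly: one 12-kb contig and one 4-kb contig.
toy = [">contig_1 example", "ACGT" * 3000, ">contig_2", "ACGT" * 1000]
kept = long_contigs(read_fasta(toy))
print({name: len(seq) for name, seq in kept.items()})
```

Only the 12-kb contig passes the cutoff in this toy case; in the actual analysis, the retained contigs would then be screened for adenylation and acyltransferase domains.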

#### **4. Conclusions**

Motivated by the results of our metagenetic study, which had revealed a high diversity of the actinobacterial microbiome dwelling in the moonmilk deposits of the "Grotte des Collemboles" [13], we set up a series of protocols and culture conditions to isolate representatives of some of the most promising actinobacterial genera for the production of natural compounds. We isolated rare Actinobacteria belonging to the *Agromyces*, *Amycolatopsis*, *Kocuria*, *Micrococcus*, *Micromonospora*, *Nocardia*, and *Rhodococcus* genera, none of which had been recovered in our previous screening, as well as additional new members of the *Streptomyces* genus. Notably, strains belonging to the *Micrococcus*, *Micromonospora*, and *Kocuria* genera were isolated from moonmilk deposits for the first time. Another remarkable outcome of our research is the isolation of an *Amycolatopsis* strain that belongs to one of the least abundant OTUs identified in the studied moonmilk, accounting for about 0.001% of the actinobacterial microbiome. This strain also displayed the highest propensity for producing antibacterial agents, both in terms of intensity and spectrum of activity.

Despite our success at isolating strains belonging to different actinobacterial genera, we lost a significant fraction of the isolated colonies (58%) during the steps to obtain pure cultures. This most likely reflects the need of these strains for specific growth factors from their environmental niche, or an obligation to grow within a mutualistic population with other moonmilk-dwelling bacteria. This is particularly problematic in bioprospecting, as the lost strains were probably unable to grow as pure isolates because they are particularly (too) well adapted to their unique natural habitat, and this adaptation might be a consequence of a unique specialized (secondary) metabolism. Therefore, we might have lost some of the most interesting strains in terms of potential producers of novel bioactive compounds.

In line with the conclusion mentioned above, although we managed in this work to isolate novel strains (and most likely novel species) thanks to our adapted protocols, the comparison of the 16S rRNA sequences of the new isolates with the 16S rRNA amplicons obtained in our metagenetic study [13] revealed the weak representativeness of our collection compared to the Actinobacteria actually dwelling in the studied moonmilk deposits. Indeed, the identified 132 culturable strains are affiliated with 16 (6.5%) of the 243 actinobacterial OTUs identified from the DNA extracted from the moonmilk deposits. Considering that a single OTU can include many different species and strains (see OTU21 as an example, Figure 4), this further emphasizes how far our collection is from being representative of the actinobacterial community inhabiting these carbonate deposits.

Finally, finding novel species/strains is only the first step towards the discovery of novel natural compounds. The next step is to trigger their secondary metabolism, which is an equally challenging task. Indeed, even if some strains already display high antibacterial activity, many others remain metabolically silent under the tested culture conditions. This is the case for all *Nocardia* strains isolated in this study, which displayed no antibacterial activity but were found to possess the largest number of BGCs (up to 45). As most of our new moonmilk strains have been isolated using a series of "tips and tricks" to cultivate the unculturable, they are unlikely to grow in many different media, which limits the culture conditions that can be tested to awaken their secondary metabolism. Mining their genomes to unveil the triggers and cues of their secondary metabolism is a priority of our ongoing research [43].

*Antibiotics* **2018**, *7*, 28

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/2/28/s1: Figure S1: Phylogenetic tree of newly isolated moonmilk-dwelling Actinobacteria strains based on the V6–V7 region of the 16S (SSU) rRNA gene; Figure S2: Closest OTUs (V6–V7 regions) for the 40 MMun strains; Table S1: OTUs and their abundance in moonmilk deposits of the "Grotte des Collemboles"; Table S2: Accession number of 16S rRNA sequences of MMun strains.

**Acknowledgments:** D.A. and M.M.'s work was supported by a Research Foundation for Industry and Agriculture (FRIA) grant. Computational resources ("durandal" grid computer) were funded by three grants from the University of Liège, "Fonds spéciaux pour la recherche", "Crédit de démarrage 2012" (SFRD-12/03 and SFRD-12/04), and "Crédit classique 2014" (C-14/73) and by a grant from the F.R.S.-FNRS "Crédit de recherche 2014" (CDR J.0080.15). This work is supported in part by the Belgian program of Interuniversity Attraction Poles initiated by the Federal Office for Scientific Technical and Cultural Affairs (PAI No. P7/44). S.R. is a Research Associate at the Belgian Fund for Scientific Research (F.R.S.-FNRS).

**Author Contributions:** D.A., M.M., A.N., L.M. and S.R. designed and performed experiments. Bioinformatic analyses were performed by A.N., D.B. and S.R. Data were analyzed by all authors. The manuscript was written and/or corrected by all authors.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **High-Throughput Sequencing Analysis of the Actinobacterial Spatial Diversity in Moonmilk Deposits**

**Marta Maciejewska 1, Magdalena Całusińska 2, Luc Cornet 3, Delphine Adam 1, Igor S. Pessi 1, Sandrine Malchair 4, Philippe Delfosse 2, Denis Baurain 3, Hazel A. Barton 5, Monique Carnol <sup>4</sup> and Sébastien Rigali 1,\***


Received: 12 February 2018; Accepted: 16 March 2018; Published: 21 March 2018

**Abstract:** Moonmilk are cave carbonate deposits that host a rich microbiome, including antibiotic-producing Actinobacteria, making these speleothems appealing for bioprospecting. Here, we investigated the taxonomic profile of the actinobacterial community of three moonmilk deposits of the cave "Grotte des Collemboles" via high-throughput sequencing of 16S rRNA amplicons. Actinobacteria was the most common phylum after Proteobacteria, ranging from 9% to 23% of the total bacterial population. Next to actinobacterial operational taxonomic units (OTUs) attributed to uncultured organisms at the genus level (~44%), we identified 47 actinobacterial genera with *Rhodococcus* (4 OTUs, 17%) and *Pseudonocardia* (9 OTUs, ~16%) as the most abundant in terms of the absolute number of sequences. Streptomycetes presented the highest diversity (19 OTUs, 3%), with most of the OTUs unlinked to the culturable *Streptomyces* strains that were previously isolated from the same deposits. Furthermore, 43% of the OTUs were shared between the three studied collection points, while 34% were exclusive to one deposit, indicating that distinct speleothems host their own population, despite their nearby localization. This important spatial diversity suggests that prospecting within different moonmilk deposits should result in the isolation of unique and novel Actinobacteria. These speleothems also host a wide range of non-streptomycetes antibiotic-producing genera, and should therefore be subjected to methodologies for isolating rare Actinobacteria.

**Keywords:** antibiotics; geomicrobiology; Illumina sequencing; microbiome diversity; *Streptomyces*; Actinobacteria

#### **1. Introduction**

Molecular approaches evaluating microbial communities in caves have revealed a level of diversity greater than initially expected [1]. Microorganisms have been found to inhabit virtually all subterranean niches, including cave walls, ceilings, speleothems, soils, sediments, pools, and aquifers [2]. Cave bacteria often represent novel taxonomic groups [3–7], which are frequently more closely related to other cave-derived bacterial lineages than to the microbiota of other environments [8–10].

Among cave speleothems, moonmilk draws particular scientific attention due to its distinctive crystal morphology. The origins of various moonmilk crystalline habits, including monocrystalline rods, polycrystalline chains, and nanofibers, are tentatively attributed to the moonmilk indigenous microbial population [11]. Among a moonmilk microbiome comprising Archaea, Fungi, and Bacteria [9,10,12–19], the indigenous filamentous Fungi [20] and Actinobacteria [11,21] are believed to mediate moonmilk genesis with cell surfaces promoting CaCO3 deposition [11,20,21]. Actinobacteria were additionally reported to be metabolically capable of inducing favorable conditions for CaCO3 precipitation, or even directly precipitating carbonate minerals [12,21]. Members of the phylum Actinobacteria are routinely found in this speleothem [9,10,12–14,18,19], as well as in other subterranean deposits within limestone caves [3,8,22,23], volcanic caves [24–26], and ice caves [27]. The broad distribution of Actinobacteria in subsurface systems stimulates investigation in order to understand the factors driving their existence in mainly inorganic and highly oligotrophic environments, and the processes that enable them to mediate speleogenesis. The successful adaptation of Actinobacteria to a wide range of environments is probably a consequence of their broad-spectrum metabolism, which includes prolific secreted hydrolytic systems that are capable of generating nutrient sources from various substrates, along with their extraordinary faculty to produce specialized metabolites (metal chelators, antimicrobials, hormones, etc.) [28].

As recently reported, moonmilk Actinobacteria represent novel microorganisms, which is a discovery that opens great avenues for the bioprospecting of novel drugs [6,10,18]. Rooney et al. (2010) [13] showed that spatially separated moonmilk speleothems in Ballynamintra Cave are inhabited by taxonomically distinct fungal and bacterial communities. Instead, in our attempt to isolate moonmilk-dwelling Actinobacteria for assessing their potential for participating in the genesis of these speleothems [21] and producing antimicrobial compounds [10], we only recovered members of the genus *Streptomyces*. Such a dominance of streptomycetes was rather unexpected, according to other moonmilk microbial diversity studies performed through culture-dependent [10,12,13,18] and culture-independent approaches using clone libraries [9], denaturing gradient gel electrophoresis (DGGE) fingerprinting [14,16,17], automated ribosomal intergenic spacer analysis (ARISA) [13], and, more recently, high-throughput sequencing (HTS) [19]. The actinobacterial genera identified in those studies included *Rhodococcus*, *Pseudonocardia*, *Propionibacterium*, *Nocardia*, *Amycolatopsis*, *Saccharothrix*, *Geodermatophilus*, *Mycobacterium*, *Aeromicrobium*, *Kribbella*, *Nocardioides*, *Actinomycetospora*, *Nonomuraea*, *Euzebya*, *Rubrobacter*, and *Arthrobacter*, in addition to *Streptomyces*. Nonetheless, the diversity of the moonmilk actinobacterial microbiome still remains largely unknown and, beyond evaluating "*what and how much have we missed in our culture-dependent bioprospecting approach*" [10], a major question that arises is: *to what extent are moonmilk-dwelling Actinobacteria different between the moonmilk deposits within a single cave, or in different caves?*

In this work, we carried out a comparative high-throughput sequencing (HTS) analysis of the 16S small subunit (SSU) rRNA gene from DNA extracted from spatially separated moonmilk deposits within the same cave, "Grotte des Collemboles" (Springtails' Cave) in Comblain-au-Pont, Belgium (Figure S1), in order to draw a detailed taxonomic picture of the intra-phylum diversity. Identifying the presence of rare Actinobacteria and unveiling to which degree they exhibit spatial variability would help determine whether it is worth prospecting from different moonmilk deposits to isolate unique and novel natural compound producers.

#### **2. Results**

#### *2.1. Actinobacterial Abundance within the Whole Moonmilk Bacterial Microbiome*

Libraries spanning the V4–V6 variable regions of the 16S rRNA gene using universal bacterial primers were used to assess the proportion of Actinobacteria in comparison to the whole bacterial community of three moonmilk deposits of the cave "Grotte des Collemboles" (Table S1a). The observed bacterial communities differed in species richness, evenness, and diversity between the three sampling points (Table 1). Phylotype richness (total number of operational taxonomic units (OTUs) per site) was the highest in COL4 (1863 OTUs), followed by COL1 and COL3, with 1332 and 1161 OTUs, respectively (Table 1, Figure 1a). Across the three sampling points, we found a total of 2301 different OTUs, amongst which 710 (31%) were common to all of the deposits (Figure 1a). Interestingly, pairwise comparison revealed highly similar percentages (~31.7 ± 0.53%) of shared bacterial OTUs between moonmilk deposits (Table 2, Figure 1a). A total of 956 OTUs (42%) were found to be exclusive to one sampling site, with COL4 having the highest number of unique bacterial phylotypes (584 OTUs), along with the most diverse bacterial population, as reflected by the highest diversity indices (Table 1, Figure 1a).
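To make the three quantities compared above concrete, the following Python sketch computes richness, a diversity index, and evenness from a vector of per-OTU sequence counts. It is an illustration only: Table 1 is not reproduced in this section, so the choice of the Shannon index and Pielou's evenness is an assumption about commonly used measures, and the example counts are invented.

```python
import math


def richness(counts):
    """Number of OTUs observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over observed OTUs."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S); defined as 0 when S <= 1."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


# Invented counts: a perfectly even and a strongly dominated community.
even_community = [10, 10, 10, 10]
skewed_community = [97, 1, 1, 1]
```

Both toy communities have the same richness (four OTUs), but the even one has the higher Shannon index and an evenness of 1, which is the sense in which a site can be "more diverse" without simply having more OTUs.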


**Table 1.** Richness, specificity, diversity, and evenness of the bacterial and actinobacterial communities in the three moonmilk deposits of the "Grotte des Collemboles". OTUs: operational taxonomic units.

**Figure 1.** Venn diagrams showing the numbers of shared and unique bacterial (**a**) and actinobacterial (**b**) OTUs between the three moonmilk sampling points (COL1, COL3, COL4).



Bacterial OTUs were grouped into 21 phyla and 18 candidate phyla (Table S2, Figure 2). Actinobacteria represented 9%, 23%, and 10% of the total bacterial population in COL1, COL3, and COL4, respectively (Table S2, Figure 2a). In terms of abundance, they were the most common phylum after Proteobacteria, which accounted for 52%, 34%, and 30% of the total community in COL1, COL3, and COL4, respectively (Table S2, Figure 2a). The other major phyla of the moonmilk microbiome included Acidobacteria, Nitrospirae, Chloroflexi, Gemmatimonadetes, Planctomycetes, Latescibacteria, Verrucomicrobia, Zixibacteria, Armatimonadetes, Bacteroidetes, and Parcubacteria (Table S2, Figure 2a). Together, these phyla constituted 93.4%, 94.7%, and 91.5% of the total community in COL1, COL3, and COL4, respectively (Table S2, Figure 2a). The remaining phyla (with a relative abundance of <1%) were pooled as 'other' (Figure 2a), and included most of the candidate divisions identified in this study (Figure 2b). Sequences that could not be affiliated to any bacterial phylum accounted for 4%, 3%, and 5% of the sequences in COL1, COL3, and COL4, respectively (Table S2). Some fraction of the moonmilk microbial diversity still remains to be discovered for all three sampling sites, as the rarefaction curves did not reach a plateau (Figure S2a).

**Figure 2.** Taxonomic profiles of the moonmilk-associated microbiome at the phylum level across the three moonmilk sampling points (COL1, COL3, COL4). The main phyla of the microbiome are presented on the left (**a**), while the pattern of low-abundance taxa, named as 'other' (with a relative abundance of <1%) is displayed on the right (**b**).

#### *2.2. Actinobacterial Diversity in Moonmilk Deposits*

Evaluation of the actinobacterial profile was performed with libraries spanning the V6–V7 variable regions of 16S rRNA gene, and using modified Actinobacteria-specific primers (Table S1a). The specificity of the primers was confirmed by the detection of only 1%, 0.2%, and 2% of non-actinobacterial sequences in COL1, COL3, and COL4, respectively (Figure S3). In contrast to the bacterial dataset, the diversity of Actinobacteria appeared to be exhaustively sampled with the phylum-specific primers (Figure S2b).

The diversity indices for Actinobacteria showed the same trends as the ones observed for the whole Bacteria domain, i.e., evenness and diversity were the highest in COL4, followed by COL3, and COL1 (Table 1). Phylotype richness was the highest in COL4 with 211 OTUs, followed by COL1 and COL3 with 150 OTUs and 147 OTUs, respectively (Figure 1b and Table 1). Among the 243 different OTUs, 105 OTUs (43%) were found in all three of the studied moonmilk deposits (Figure 1b). Hence, the moonmilk-associated actinobacterial community appeared to be more conserved than the moonmilk-associated bacterial population (31%, Figure 1a). If we also include OTUs shared between at least two sampling points, the level of conservation rises to 66% of OTUs for Actinobacteria, and 58% for Bacteria. Still, 34% of the 243 OTUs (14, 15, and 54 OTUs in COL1, COL3, and COL4, respectively) remained specific to a moonmilk deposit, despite the close localization of collection points within the studied cave (Figure 1b). COL4 was characterized not only by the highest number of unique phylotypes (54 OTUs) (Figure 1b), but also by the most diverse population, as revealed by diversity indices (Table 1). As observed for the bacterial dataset, pairwise comparisons showed highly similar percentages (~36.4% ± 0.41%) of shared actinobacterial OTUs between moonmilk deposits (Table 2).
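The OTU counts quoted for the bacterial and actinobacterial datasets can be cross-checked with inclusion-exclusion arithmetic. The Python sketch below uses only figures stated in the text (per-site richness, total number of OTUs, OTUs common to all three sites, and site-specific OTUs); the number of OTUs shared by exactly two sites is derived here rather than reported, and the function and variable names are ours.

```python
def venn_summary(site_richness, total, in_all_three, site_specific):
    """Summarise a three-set Venn diagram from the counts given in the text."""
    # OTUs that are neither site-specific nor ubiquitous occur in exactly two sites.
    exactly_two = total - in_all_three - site_specific
    # Summing per-site richness counts each OTU once per site where it occurs.
    implied_sum = site_specific + 2 * exactly_two + 3 * in_all_three
    return {
        "exactly_two": exactly_two,
        "consistent": implied_sum == sum(site_richness),
        "pct_all_three": round(100 * in_all_three / total),
        "pct_at_least_two": round(100 * (in_all_three + exactly_two) / total),
        "pct_site_specific": round(100 * site_specific / total),
    }


# Site order: COL1, COL3, COL4.
bacteria = venn_summary([1332, 1161, 1863], total=2301,
                        in_all_three=710, site_specific=956)
actinobacteria = venn_summary([150, 147, 211], total=243,
                              in_all_three=105, site_specific=14 + 15 + 54)
```

Under these inputs the per-site richness values sum to exactly what the Venn partition implies for both datasets, and the derived percentages (31%, 58%, and 42% for Bacteria; 43%, 66%, and 34% for Actinobacteria) match those quoted in the text.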

A taxonomic assignment of actinobacterial OTUs revealed the presence of two major classes—Acidimicrobiia and Actinobacteria, next to the low-abundant Thermoleophilia class (Table S3, Figure 3a). Acidimicrobiia was represented by one single order, the Acidimicrobiales, which dominated sample COL4, constituting 55.3% of the population (Table S3, Figure 3a). The Acidimicrobiales order consisted of two families, i.e., Acidimicrobiaceae and Iamiaceae (Table S3). The Actinobacteria class was represented by 15 orders, with Corynebacteriales dominating in COL1, and Pseudonocardiales in COL3 and COL4 (Table S3, Figure 3b). The most abundant families among the Actinobacteria class were Pseudonocardiaceae, Nocardiaceae, and Streptomycetaceae (Table S3). The proportion of unclassified and uncultured sequences at the family level ranged from 9% in COL3, to 25% in COL1, and 53% in COL4 (Table S3).

**Figure 3.** Taxonomic profiles of moonmilk-associated Actinobacteria at different taxonomic levels—(**a**) class; (**b**) order; (**c**) family—observed across the three moonmilk-sampling points (COL1, COL3, COL4). 'Other' includes orders and families with a relative abundance of <1%.

Among 28 families, a total of 47 genera were identified across the investigated samples (Table S3 and Table 3), with 35 genera identified for the first time in moonmilk (Table 3). COL1 was dominated by *Rhodococcus* (38.37%), while uncultured and unclassified Actinobacteria were the most abundant in COL3 and COL4 (Table 3). When only known genera were taken into account, *Pseudonocardia* prevailed in those samples, accounting for 20% and 18% of the population in COL3 and COL4, respectively (Table 3). Other genera, which constituted more than 1% of the population in at least one moonmilk deposit, included *Streptomyces*, *Arthrobacter*, *Sporichthya*, *Planotetraspora*, *Nocardia*, *Mycobacterium*, and *Frankia* (Table 3). While accounting on average for only 3% of the actinobacterial community, streptomycetes displayed the highest diversity, with 19 OTUs identified across the three moonmilk deposits (Table 3).

Some taxa showed important differences in their relative abundance between investigated samples, particularly *Rhodococcus*, which was approximately four and 14 times more abundant in COL1 than in COL3 and COL4, respectively (Table 3). The *Streptomyces* genus represented only 0.8% of the population in COL3, while it was detected at the level of 5.3% and 3% in COL1 and COL4, respectively (Table 3). An important discrepancy in the relative abundance between speleothems was also observed for the genera *Planotetraspora*, *Mycobacterium*, and *Frankia*, whereas some taxa (e.g., *Pseudoclavibacter*, *Lentzea*, *Propionibacterium*) were exclusively found in a single sampling site (Table 3).


**Table 3.** Actinobacterial genera pattern in moonmilk deposits of the "Grotte des Collemboles" based on 16S rRNA amplicon libraries.

For each taxon, the number of obtained sequences (Seq.) and their relative abundance (%), together with the number of OTUs, are given. The total number of sequences, average relative abundance, and total number of different OTUs obtained per genus are shown in the last three columns. Taxa marked with an asterisk (**\***) were reported for the first time in moonmilk deposits in this study. Taxa marked with a cross (†) were detected in moonmilk deposits in this work, and in the high-throughput sequencing (HTS)-based study of Dhami et al. [19]. Underlined taxa are the ones that were also detected in other moonmilk microbial diversity studies [12–15,18]. Cells filled in grey highlight the most abundant genera in each studied sampling point. Abbreviations: Seq., number of sequences identified.

#### *2.3. Analysis of the Most Abundant Actinobacterial OTUs*

In order to obtain more information about the most dominant moonmilk-dwelling Actinobacteria, a detailed analysis was conducted for the 41 most abundant OTUs (~17% of all of the OTUs) accounting together for 90% (413,739 out of 456,878) of the sequences obtained via our HTS approach (Table 4). Out of the subset of 41 OTUs, 16 phylotypes belonged to the class Acidimicrobiia, with most of them being uncultured at the family level, and the remaining 25 OTUs belonged to the class Actinobacteria (Table 4). In the latter case, all of the OTUs were associated with major families previously identified in moonmilk deposits, including Pseudonocardiaceae, Propionibacteriaceae, Micrococcaceae, Nocardiaceae, Streptomycetaceae, and Streptosporangiaceae (Table 4). Only 16 OTUs could be classified at the genus level and were affiliated to genera *Rhodococcus*, *Pseudonocardia*, *Arthrobacter*, *Sporichthya*, *Streptomyces*, *Planotetraspora*, and *Nocardia* (Table 4).
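Selecting a set of dominant OTUs amounts to ranking OTUs by sequence count and keeping the smallest top-ranked set that reaches a target share of all sequences. A minimal Python sketch of that idea follows. Whether the 41 OTUs were chosen by such a cumulative cut-off or by another criterion is not stated in this section, so the procedure, the function name, and the invented example counts are assumptions made for illustration.

```python
def dominant_otus(counts, target_share=0.90):
    """Return the fewest top-ranked OTU ids whose reads reach target_share.

    counts maps an OTU id to its number of sequences.
    """
    total = sum(counts.values())
    selected, running = [], 0
    # Walk through OTUs from most to least abundant.
    for otu, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        if running >= target_share * total:
            break
        selected.append(otu)
        running += n
    return selected


# Invented counts for five OTUs (1000 sequences in total).
toy_counts = {"OTU_a": 500, "OTU_b": 300, "OTU_c": 120, "OTU_d": 50, "OTU_e": 30}
top = dominant_otus(toy_counts)  # ['OTU_a', 'OTU_b', 'OTU_c']
```

In the toy data, three of five OTUs already cover 92% of the sequences, which mirrors the strongly skewed abundance distribution described above.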


**Table 4.** The relative abundance (%) and taxonomy assignment of the most abundant actinobacterial OTUs found across moonmilk samples within the "Grotte des Collemboles".

Taking into account the spatial differences in terms of the most abundant taxa across the cave, COL1 was highly dominated by OTU1, affiliated to the genus *Rhodococcus*, and accounting for 38% of the total population in this speleothem (Table 4). This phylotype greatly outnumbered the other two *Rhodococcus* OTUs detected in COL1 (Table 3), which together constituted only 0.1% (data not shown). The predominant phylotypes identified in speleothems COL3 and COL4 were OTU2, representing an unclassified Pseudonocardiaceae in COL3 (29%), and OTU4, representing an uncultured bacterium from the Acidimicrobiia class in COL4 (11%) (Table 4). Among the known genera, *Rhodococcus* (OTU1, 10%) was also prevailing in COL3, while *Pseudonocardia* (OTU262, 6%) was found to be the most abundant in COL4 (Table 4).

In total, 40 out of 41 OTUs were present in all three studied moonmilk deposits, often with an extreme variation in terms of their relative abundance across the different collection points. This is well demonstrated by OTU2 (Pseudonocardiaceae, unclassified at the genus level), which largely dominated the actinobacterial community in COL3 (29%), while only representing 0.3% of the actinobacterial microbiome in COL1 (Table 4).

#### *2.4. Comparison of Moonmilk Streptomyces OTUs and Streptomyces Strains Isolated via the Culture-Dependent Approach*

The true diversity of microbial communities is known to be strongly biased by cultivation-based methods in comparison to molecular techniques; therefore, we wanted to assess how much of the *Streptomyces* moonmilk-dwelling community we managed to isolate in our previous bioprospection work [10]. For this purpose, we compared the 16S rRNA sequences of the 19 *Streptomyces* OTUs retrieved from the HTS approach with the sequences of the 31 previously isolated *Streptomyces* phylotypes (MM strains), which were trimmed to the corresponding V6–V7 variable regions of HTS amplicons. Figure 4 presents the phylogenetic tree generated by maximum likelihood with all the 252 nt 16S rRNA sequences from the *Streptomyces* phylotypes (MM strains) and OTUs. The identity threshold for clustering sequences in the same branch of the tree was fixed to 97%, i.e., the same threshold as the one used to define OTUs in our HTS approach (see methods for details). As deduced from the generated phylogenetic tree, the 31 isolated *Streptomyces* strains matched with only five of the 19 *Streptomyces* OTUs, suggesting that the isolated strains represent a minor fraction of the *Streptomyces* species dwelling in the moonmilk deposits of the studied cave. Expectedly, Figure 4 further shows that we isolated *Streptomyces* species that are associated with the most abundant *Streptomyces* OTUs, e.g., OTU15, OTU21, OTU30, and OTU99 (Table 4), which together represent 79% of the *Streptomyces* sequences retrieved by our HTS approach. Moreover, 21 out of the 31 phylotype strains (68%) clustered together with OTU21 (Figure 4). Finally, two *Streptomyces* isolates, i.e., MM24 and MM106, did not cluster with any of the identified *Streptomyces* OTUs (Figure 4).
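The 97% criterion can be made concrete with a small Python sketch that computes the percent identity between two aligned sequences of equal length. This is a simplification for illustration only: the comparison in the study was read from a maximum-likelihood tree, gap handling is ignored here, and the function names are ours. For a 252-nt fragment, a 97% threshold tolerates at most seven mismatches.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def same_otu(seq_a, seq_b, threshold=97.0):
    """True if the two sequences meet the identity threshold (97% by default)."""
    return percent_identity(seq_a, seq_b) >= threshold


def max_mismatches(length, threshold=97.0):
    """Largest number of mismatches that still meets the threshold."""
    return int(length * (100.0 - threshold) / 100.0)


allowed = max_mismatches(252)  # 7
```

Because a handful of substitutions in such a short fragment is enough to stay within one cluster, distinct species can fall into the same OTU.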

**Figure 4.** Phylogenetic relationships between culturable and non-culturable *Streptomyces* originating from moonmilk of "Grotte des Collemboles". The tree was inferred by maximum likelihood. Scale bar is in substitutions per site. Numbers between brackets reflect the predicted mean abundance of *Streptomyces* OTUs in the studied deposits based on the percentage of sequences retrieved from the HTS analysis. *Streptomyces* phylotypes isolated in our previous bioprospection study (MM strains) are marked in blue.

#### **3. Discussion**

#### *3.1. New Insights into Moonmilk Bacterial Diversity Revealed by High-Throughput Sequencing*

Previous investigations on the moonmilk microbiome revealed a very diverse microbial community in these deposits [9,13–17,19]. The high-throughput sequencing approach used in this work complemented previous findings by providing an in-depth picture of the bacterial population, together with a detailed taxonomic fingerprint of the phylum Actinobacteria.

Comparison of the bacterial diversity in moonmilk between earlier investigations and the present work is limited to some extent by the differences in experimental procedures, such as DNA isolation and PCR-based approaches, and the sensitivity of the sequencing techniques. Nonetheless, the profile of the major taxonomic groups found in this work is consistent with that observed for the moonmilk communities in the caves "Grotta della Foos" and "Bus della Genziana" in Italy, which were obtained from 16S rRNA clone libraries [9]. All of the phyla detected in the above-mentioned caves, including Bacteroidetes, Acidobacteria, Chloroflexi, Planctomycetes, Verrucomicrobia, Actinobacteria, Firmicutes, Nitrospirae, Chlorobi, Proteobacteria, and WS3 (now Latescibacteria), were also identified in the cave "Grotte des Collemboles", although their relative abundance varied between the studies. While Proteobacteria were found to be the most abundant phylum in both cases, the second most abundant population identified in Italian caves was the phylum Bacteroidetes, which constituted a minor part of the bacterial community in the present study. The Actinobacteria population was found to be an important part of the moonmilk microbiome in the "Grotte des Collemboles" (from 9% to 23%), but instead represented only a minor fraction (<2%) of the bacterial population in the two Italian caves investigated by Engel et al. (2013). Very recently, a study by Dhami et al. reported the moonmilk microbiome profile in the Australian "Lake Cave" using an HTS approach [19], as in this work. Proteobacteria, Actinobacteria, Acidobacteria, Chloroflexi, Nitrospirae, Gemmatimonadetes, Firmicutes, and Bacteroidetes were detected in the moonmilk deposit of the "Lake Cave", as in the "Grotte des Collemboles". 
However, many of the low-abundance taxa identified in the Belgian cave were not reported, possibly because their phylogenetic profiles were based on different regions of 16S rRNA gene—V3/V4 for the "Lake Cave", and V6–V7 for the "Grotte des Collemboles". Interestingly, unlike in Italian and Belgian caves, the "Lake Cave" moonmilk deposit was strongly dominated by Actinobacteria, which were more than twice as abundant as the Proteobacteria [19]. The highly sensitive HTS amplicon sequencing approach employed in this work revealed the presence of 26 phyla within the moonmilk microbiome that had not been previously described in this speleothem. These included Zixibacteria (formerly RBG-1), Armatimonadetes (formerly OP10), and Parcubacteria (formerly OD1) among the main phyla of moonmilk microbiome (Figure 2a), which have been previously reported from other subterranean environments [24,29–33], and 23 low-abundant taxa that were found below the level of 1%, and included many candidate divisions (Figure 2b).

This new study uncovered a surprisingly diverse Actinobacteria taxonomic profile that demonstrates the limitations of our previous cultivation-based screening, in which only *Streptomyces* species could be isolated from the three moonmilk deposits [10]. Here, a total of 47 actinobacterial genera from 28 families were identified across the investigated samples. Beyond the previously reported members of the order Actinomycetales, namely *Nocardia* and *Rhodococcus* (Nocardiaceae) [15,18], *Pseudonocardia*, *Amycolatopsis* and *Saccharothrix* (not identified in our study) (Pseudonocardiaceae) [12,14,19], *Propionibacterium* (Propionibacteriaceae) [14], *Streptomyces* (Streptomycetaceae) [10,12,18,19], *Arthrobacter* (Micrococcaceae) [13], *Mycobacterium* (Mycobacteriaceae) [19], *Nocardioides*, *Aeromicrobium*, and *Kribbella* (Nocardioidaceae) [19], and *Geodermatophilus* (Geodermatophilaceae) [19], 35 other genera were identified in the moonmilk deposits of the "Grotte des Collemboles". The population of each investigated sample was also found to include representatives of the Acidimicrobiia class, which were not previously reported in moonmilk. Their presence in all three sampling sites, with an abundance up to 55% in COL4, and the dominance of the unclassified Acidimicrobiia phylotype (OTU4) within the community of COL4, suggest that the chemical composition of the investigated moonmilk is particularly suitable for the development of representatives of this class of Actinobacteria, whose ecology and metabolism are still largely unknown.

#### *3.2. Moonmilk Deposits as Appealing Source of Novel Producers of Bioactive Compounds*

Extreme environmental niches have recently become the main targets for intense bioprospecting, as they are expected to host diverse yet-unknown microorganisms, which could offer unexplored chemical diversity. While *Streptomyces* are reported as the most prolific "antibiotic makers", advances in the cultivation and characterization of rare Actinobacteria revealed similarly promising capabilities for the production of bioactive natural compounds [34–36]. The results obtained in this work suggest a significant biodiversity of the moonmilk-dwelling actinobacterial population, with a wide spectrum of rare genera. Next to *Streptomyces*, other members of Actinobacteria with valuable secondary metabolism were detected at a high proportion, such as *Pseudonocardia*, *Amycolatopsis*, *Streptosporangium*, *Nocardia*, *Nocardioides*, and *Rhodococcus*. Such findings clearly call for the application of appropriate selective cultivation methods to isolate rare Actinobacteria from moonmilk deposits.

Moreover, particular attention should also be paid to Acidimicrobiia, which constituted an important part of the community in the studied deposits. Members of this class are a recently identified taxonomic unit [37] that is considered to represent an early-branching lineage within the phylum [38]. Due to their phylogenetic isolation and novelty, they are likely to hide a yet-uncovered valuable bioactive arsenal.

The great potential of moonmilk as a source of diverse and metabolically beneficial Actinobacteria is illustrated by the comparison of *Streptomyces* isolated in our previous study and the *Streptomyces* OTUs identified in this work (Figure 4). Most *Streptomyces* OTUs are phylogenetically distinct from culturable representatives (Figure 4), indicating that a great number of species still remain to be isolated. On the other hand, our culture-dependent study identified *Streptomyces* strains (MM24 and MM106, Figure 4) that were not associated with OTUs deduced from the HTS approach, confirming that both strategies are complementary, and should be used in parallel for microbial diversity assessment [39,40]. In addition, next to the identification methods themselves, our data suggest that the diversity level can also be biased by the identity threshold that is used for OTU definition. The tree revealed that a single OTU (OTU21, Figure 4) clustered together with most of the phylotypes deduced from MLSA (multilocus sequence analysis), each most likely representing a distinct species [10]. This indicates that the 97% sequence identity threshold applied to the comparative analysis of the V6–V7 regions of the 16S rRNA gene largely underestimated the number of *Streptomyces* species dwelling in the studied environmental niche.

#### **4. Materials and Methods**

#### *4.1. Site description and Sampling*

The cave "Grotte des Collemboles" (Springtails' Cave), located in Comblain-au-Pont (GPS coordinates 50°28′41″ N, 5°36′35″ E), Belgium (Figure S1; see Maciejewska et al. 2017 for a full description), was formed in Visean limestone and has the shape of a 70-m long meander. White to brown–orange moonmilk deposits are found on the walls in the first narrow chamber located at the entrance of the cave, as well as in the narrow passages leading deeper into the cave (Figure S1). Moonmilk samples used for total DNA extractions were aseptically collected in January 2012 from three spatially separated locations along about a 20-m transect in the cave. Soft moonmilk speleothems were scratched with sterile scalpels into sterile Falcon tubes from the wall in the first chamber, adjacent to the cave entrance (COL4), and from the walls in a narrow passage after the first chamber (COL1, COL3) (Figure S1). COL4 was located approximately 6 m from COL1, and 20 m from COL3 (Figure S1). Samples were immediately transferred to the laboratory, freeze-dried on a VirTis Benchtop SLC Lyophilizer (SP Scientific, Warminster, PA, USA), and stored at −20 °C.

#### *4.2. Total DNA Extraction and 16S rRNA Gene Amplicon High-Throughput Sequencing*

The metagenetic approach applied in this work was performed on DNA extracted from three moonmilk deposits (COL1, COL3, and COL4) originating from the "Grotte des Collemboles". Environmental genomic DNA isolation was carried out from 200 mg of the freeze-dried moonmilk samples COL1, COL3, and COL4 (Figure S1), using the PowerClean Soil DNA kit (MoBio, Carlsbad, CA, USA), according to manufacturer's instructions. The integrity of purified DNA was assessed by agarose gel electrophoresis (1% *w*/*v*), and the dsDNA concentration was evaluated by Qubit fluorometer (Invitrogen, Carlsbad, CA, USA).

The 16S rRNA gene amplicon libraries were generated using bacterial (S-D-Bact-0517-a-S-17/S-D-Bact-1061-a-A-17 spanning V4–V6 region [41]) and actinobacterial (Com2xf/Ac1186r, spanning V6–V7 region [42]) specific primer pairs. The Illumina platform-compatible dual index paired-end approach was designed as previously described [43] (detailed description provided in Table S1a). Each forward and reverse primer consisted of an Illumina-compatible forward/reverse primer overhang attached to the 5′ end. Additionally, a heterogeneity spacer of four degenerate nucleotides (Ns) was added to the forward primer, between the primer overhang and the locus-specific sequence. The Illumina barcodes and sequencing adapters were added during the subsequent cycle-limited amplification step using Nextera XT Index kit (Illumina, San Diego, CA, USA). Triplicate PCR reactions were performed for each sample in 25 μL of volume containing 2.5 μL of total DNA, 5 μL of each primer (1 μM), and 12.5 μL of 2× Q5 High-Fidelity Master Mix (New England Biolabs, Ipswich, MA, USA). Amplification conditions for each set of primers are listed in Table S1b. The triplicate amplicons were visualized on 3% agarose gel, pooled, purified with Agencourt AMPure XP beads (Beckman Coulter, Brea, CA, USA), and quantified with the Qubit HS dsDNA assay kit (Invitrogen, Carlsbad, CA, USA) before being processed for index ligation, using the Nextera XT Index kit (Illumina, San Diego, CA, USA). The PCR amplifications were performed with the same enzyme and cycling conditions as described above [43], with the total number of cycles reduced to eight, and an annealing temperature of 55 °C. The resulting amplicons were purified with the Agencourt AMPure XP magnetic beads (Beckman Coulter, Brea, CA, USA), quantified, and pooled in equimolar concentrations.
The library concentration was quantified by qPCR using a KAPA SYBR FAST kit (Kapa Biosystems, Wilmington, MA, USA), and subsequently, the library was normalized to 4 nM, denatured, and diluted to the final concentration of 8 pM. The resulting pool was mixed with the PhiX control and subjected to 2 × 300 bp paired-end sequencing on an Illumina MiSeq platform (Illumina, San Diego, CA, USA). Raw sequences were deposited in the NCBI Sequence Read Archive (SRA) database under the Bioproject PRJNA428798 with accession numbers SRX3540524–SRX3540529.
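The normalization step above (4 nM stock diluted to 8 pM) is a 500-fold dilution. The arithmetic can be sketched as follows; the function name and the 1000 μL final volume are hypothetical choices for illustration only, and in practice the denaturation and dilution are usually carried out in several steps according to the sequencer manufacturer's protocol.

```python
def dilution_volumes(stock_pM, target_pM, final_volume_uL):
    """Return (stock volume, diluent volume) in uL from C1*V1 = C2*V2."""
    if target_pM > stock_pM:
        raise ValueError("target concentration exceeds stock concentration")
    stock_uL = final_volume_uL * target_pM / stock_pM
    return stock_uL, final_volume_uL - stock_uL


# 4 nM = 4000 pM stock, 8 pM target; the 1000 uL final volume is an assumption.
stock_uL, diluent_uL = dilution_volumes(4000.0, 8.0, 1000.0)
fold = 4000.0 / 8.0
print(f"{fold:.0f}-fold dilution: {stock_uL:.1f} uL library + {diluent_uL:.1f} uL diluent")
```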

#### *4.3. 16S rRNA Amplicon Analysis*

16S rRNA amplicon analysis was based on forward reads only for both Bacteria and Actinobacteria, owing to the poor quality of the reverse reads. Quality trimming (prohibiting mismatches and ambiguities, ensuring a minimum quality score of 20, and removing the four degenerate nucleotides from the 5′ end) was carried out using CLC Genomics Workbench (Qiagen, Hilden, Germany). USEARCH [44] was applied for length trimming (minimum length = 240 nt) and dereplication. Operational taxonomic units (OTUs) for both bacterial and actinobacterial datasets were defined using a 97% identity threshold on 16S rRNA sequences. OTUs were clustered using the UPARSE algorithm [45], and their taxonomic position was assigned by MOTHUR [46] with the SILVA v128 database [47]. OTUs were further classified using BLASTN [48] analyses against a local mirror of the NCBI nt database (downloaded on 9 August 2017), through manual and automatic analyses. For the automatic approach, a last common ancestor (LCA) classification was performed with a custom parser mimicking the MEGAN algorithm [49], which we developed for analyses of genome contamination (Cornet et al., 2017, under review). A maximum number of 100 hits per OTU was taken into account. To consider a BLASTN hit, the E-value threshold was set at 1e-15, the minimum identity threshold was set at 95.5%, the minimum bit score was set at 200, and the bit score percentage threshold was set at 99% of the best hit. These thresholds were defined through preliminary analyses (data not shown). When the BLASTN hits are too numerous, the MEGAN-like algorithm frequently yields high-ranking LCAs (e.g., Bacteria) that are not informative in practice. In order to minimize this effect, we decided to skip uncultured/unclassified hits whenever other, more informative, hits also passed the thresholds. Moreover, when computing LCAs, we only considered the most frequent taxa, provided that they represented ≥95% of the (up to 100) accumulated BLASTN hits, so as to avoid uninformative classifications due to a few (possibly aberrant) outliers.
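The hit filtering and majority-based LCA described above can be sketched as follows. This is not the authors' parser, which is not reproduced here: the `Hit` record, the function names, and the reading of the ≥95% rule (descend the taxonomy while the most frequent taxon still covers at least 95% of the retained hits) are assumptions made for illustration. Only the numeric thresholds come from the text.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    lineage: tuple   # taxonomy from root to leaf (hypothetical representation)
    evalue: float
    identity: float  # percent identity
    bitscore: float


UNINFORMATIVE = ("uncultured", "unclassified")


def filter_hits(hits, max_hits=100, max_evalue=1e-15, min_identity=95.5,
                min_bitscore=200.0, best_fraction=0.99):
    """Apply the thresholds quoted in the text to the BLASTN hits of one OTU."""
    hits = list(hits)[:max_hits]
    if not hits:
        return []
    best = max(h.bitscore for h in hits)
    passing = [h for h in hits
               if h.evalue <= max_evalue
               and h.identity >= min_identity
               and h.bitscore >= min_bitscore
               and h.bitscore >= best_fraction * best]
    # Skip uncultured/unclassified hits only if more informative hits remain.
    informative = [h for h in passing
                   if not any(tag in name.lower()
                              for name in h.lineage for tag in UNINFORMATIVE)]
    return informative or passing


def majority_lca(hits, min_share=0.95):
    """Descend rank by rank while the top taxon covers >= min_share of all hits."""
    total = len(hits)
    lineage, current, depth = [], list(hits), 0
    while current:
        names = Counter(h.lineage[depth] for h in current if len(h.lineage) > depth)
        if not names:
            break
        name, count = names.most_common(1)[0]
        if count / total < min_share:
            break
        lineage.append(name)
        current = [h for h in current
                   if len(h.lineage) > depth and h.lineage[depth] == name]
        depth += 1
    return tuple(lineage)


# Invented example data: 39 concordant hits, one outlier genus,
# one uncultured hit, and one hit failing the identity threshold.
strep = ("Bacteria", "Actinobacteria", "Streptomyces")
kita = ("Bacteria", "Actinobacteria", "Kitasatospora")
example_hits = (
    [Hit(strep, 1e-120, 99.0, 440.0)] * 39
    + [Hit(kita, 1e-119, 98.5, 438.0)]
    + [Hit(("Bacteria", "uncultured bacterium"), 1e-120, 99.0, 440.0)]
    + [Hit(strep, 1e-60, 94.0, 250.0)]
)
kept = filter_hits(example_hits)
print(len(kept), majority_lca(kept))
```

With a strict LCA, the single outlier genus would push the classification up to the class level; the majority rule keeps the genus-level assignment because the outlier accounts for less than 5% of the retained hits.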

Normalized OTU abundance data were used to calculate α-diversity and β-diversity estimators using MOTHUR [46]. Community richness, evenness, diversity, and differential OTU abundance between samples were calculated using sobs, the Simpson index, the inverse Simpson index and Venn diagrams, respectively.
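As a reminder of what these estimators measure, the following minimal sketch computes the observed richness (sobs), the Simpson index, and its inverse from a vector of OTU counts. It uses the common finite-sample form D = Σ nᵢ(nᵢ − 1) / (N(N − 1)); whether this matches the exact calculator settings used in the study is an assumption, and the counts are invented.

```python
def sobs(counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for n in counts if n > 0)


def simpson(counts):
    """Simpson index D = sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    total = sum(counts)
    if total < 2:
        raise ValueError("need at least two reads")
    return sum(n * (n - 1) for n in counts) / (total * (total - 1))


def inverse_simpson(counts):
    """Inverse Simpson index 1/D; larger values indicate higher diversity."""
    d = simpson(counts)
    if d == 0:
        raise ValueError("Simpson index is zero (all OTUs are singletons)")
    return 1.0 / d


# Invented OTU counts for one sample (not data from the study).
example_counts = [50, 30, 15, 5, 0]
print(sobs(example_counts),
      round(simpson(example_counts), 4),
      round(inverse_simpson(example_counts), 4))
```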

The 19 OTUs identified as *Streptomyces* were combined with 31 sequences (16S rRNA region V6–V7) from previously isolated *Streptomyces* phylotypes (MM strains) and dereplicated with the UCLUST algorithm [44] using an identity threshold of 97%. This yielded 21 clusters, to which we added the homologous region of *Corynebacterium diphtheriae* JCM-1310 as an outgroup. A multiple sequence alignment was built with MUSCLE [50] (default parameters), and then analyzed with PhyML [51] under a K80 + Γ<sub>4</sub> model. Due to the limited amount of phylogenetic signal (short sequences from closely related organisms), the resolution of the tree was low (bootstrap proportions <50 for nearly all nodes; data not shown).

#### **5. Conclusions**

Before the advent of metagenomics, bioprospecting was carried out blindly, with poor knowledge on the real potential of an ecological niche mined for novel organisms, enzymes, or bioactive compounds. The results of the metagenetic study presented here confirmed that different moonmilk deposits host their own indigenous microbial population, and thus each individual speleothem can be a source of a great biodiversity. Consequently, the observed important differences in the spatial diversity of Actinobacteria imply that bioprospecting within different moonmilk deposits—from different caves or within the same cave—could result in the isolation of unique and novel natural compound producers. Our study also revealed how many and which actinobacterial genera have been missed in our first attempt to isolate antibiotic producers. We now know that the *Streptomyces* strains of our collection isolated from the moonmilk deposits of the cave 'Grotte des Collemboles' [10] are just the tip of the iceberg. These results prompted us to apply a series of '*tips and tricks*' to isolate other *Streptomyces* and representatives of other antibiotic-producing Actinobacteria that are present in different proportions in each moonmilk deposit. The results of our adapted protocols for the isolation of rare Actinobacteria are presented in the article '*Isolation*, *Characterization*, *and Antibacterial Activity of Hard-to-Culture Actinobacteria from Cave Moonmilk Deposits*', which is published in the same special issue [52].

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/2/27/s1, Figure S1: Localization of the "Grotte des Collemboles" (Springtails' Cave) together with the cave map and visualization of the moonmilk deposit sampling points, Figure S2: Rarefaction curves of OTUs clustered at 97% sequence identity across the three moonmilk-sampling points for Bacteria (a) and Actinobacteria (b), Figure S3: Taxonomic profile of bacterial phyla generated with Actinobacteria-specific primers. Note the high specificity of Actinobacteria primers, Table S1: Details of the PCR primers used for community profiling of moonmilk samples (a) and PCR conditions used for 16S rRNA amplification from moonmilk samples (b), Table S2: Relative abundance (%) of bacterial phyla identified in the three moonmilk deposits in "Grotte des Collemboles". Low-abundant taxa with relative abundance <1% are marked in red, Table S3: Relative abundance (%) of the phylum Actinobacteria at different taxonomic levels identified in the three moonmilk deposits in the "Grotte des Collemboles".

**Acknowledgments:** The authors are grateful to Luc Willems for the introduction to the subject and help with sampling. The work of MM, DA, and LC was supported by a Research Foundation for Industry and Agriculture (FRIA) grant. MC and PD were supported by the Luxembourg National Research Fund (FNR CORE 2011 project GASPOP; C11/SR/1280949: Influence of the Reactor Design and the Operational Parameters on the Dynamics of the Microbial Consortia Involved in the Biomethanation Process). Computational resources ("durandal" grid computer) were funded by three grants from the University of Liège, "Fonds spéciaux pour la recherche", "Crédit de démarrage 2012" (SFRD-12/03 and SFRD-12/04) and "Crédit classique 2014" (C-14/73) and by a grant from the F.R.S.-FNRS "Crédit de recherche 2014" (CDR J.0080.15). This work is supported in part by the Belgian program of Interuniversity Attraction Poles initiated by the Federal Office for Scientific Technical and Cultural Affairs (PAI No. P7/44). SR is a Research Associate at Belgian Fund for Scientific Research (F.R.S-FNRS). The authors declare no conflict of interest. We dedicate the work to the memory of Leonard Maculewicz (1936–2017) who always supported our work with great enthusiasm.

**Author Contributions:** M.M., Ma.C., S.M., Mo.C., P.D., and S.R. designed and performed experiments. Bioinformatic analyses were performed by M.M., Mo.C., Ma.C., L.C., D.B., and S.R. Data were analyzed by all authors. The manuscript was written and/or corrected by all authors.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Biosynthesis of Rishirilide B**

#### **Philipp Schwarzer 1,†, Julia Wunsch-Palasis 1,†, Andreas Bechthold 1,\* and Thomas Paululat 2,\***


Received: 14 February 2018; Accepted: 1 March 2018; Published: 7 March 2018

**Abstract:** Rishirilide B was isolated from *Streptomyces rishiriensis* and *Streptomyces bottropensis* on the basis of its inhibitory activity towards alpha-2-macroglobulin. The biosynthesis of rishirilide B was investigated by feeding experiments with different 13C-labelled precursors using the heterologous host *Streptomyces albus* J1074::cos4, which contains a cosmid encoding the gene cluster responsible for rishirilide B production. NMR spectroscopic analysis of labelled compounds demonstrates that the tricyclic backbone of rishirilide B is a polyketide synthesized from nine acetate units. One of the acetate units is decarboxylated to give a methyl group. The origin of the starter unit was determined to be isobutyrate.

**Keywords:** streptomyces; rishirilide; biosynthesis; polyketides

#### **1. Introduction**

Polyketides represent a large and diverse group of natural products from different biological sources such as bacteria, plants and fungi. Numerous polyketides are pharmacologically useful and in clinical use in many areas of application [1].

Rishirilide B (Figure 1), a product of a type II polyketide synthase (PKS), was first isolated from *Streptomyces rishiriensis* OFR-1056 in 1984 [2]. Later, *Streptomyces bottropensis* (formerly *Streptomyces* sp. Gö C4/4) was described as another producer of rishirilide B [3]. Rishirilide B inhibits alpha-2-macroglobulin, a plasma protein that affects the blood coagulation system by inhibiting a large variety of proteinases. Thus alpha-2-macroglobulin inhibition is an effective mechanism to prevent and treat fibrinolytic accentuated thrombosis [3]. Structurally, rishirilide B is a tricyclic compound with an isopentyl side chain, and further substitutions including one methyl, one carboxylic acid, and three hydroxyl groups. The isopentyl side chain is uncommon for aromatic polyketides. This branched chain may originate from activated isobutyrate, which could be the starter unit of polyketide synthesis [4]. The origin of this starter unit might be valine-derived isobutyryl-CoA, as described in the literature for other natural compounds, such as tautomycin and virginiamycin M [5,6]. Isobutyryl-CoA derives from L-valine via transamination and decarboxylation [7].

**Figure 1.** Structure of rishirilide B.

Aromatic polyketide formation is a complex building process under the control of the PKS. CoA-activated carboxylic acids are assembled via decarboxylative Claisen condensation by an iteratively used set of proteins, leading to the formation of a highly reactive polyketide chain. Within this process, many variables determine the unique order of events and thereby the final structure of the emerging polyketide. The variables include the choice of starter units, the determination of chain length, and the cyclisation patterns. Post-PKS events, catalyzed by tailoring enzymes, increase structural variation [4,8]. A review by Staunton and Weissman provides a comprehensive overview of polyketides [9]. The gene cluster responsible for rishirilide B production contains 28 ORFs that encode a type II PKS, tailoring enzymes, regulatory proteins, and transporters [3].

Insights into the early formation process of polyketides can be obtained by incorporation studies with labelled precursors. These experiments provide substantial information about the starter unit and the number of incorporated extender units, indicate the order of events during polyketide formation, and give valuable hints about possible rearrangement steps that might occur throughout the biosynthesis. In this way, the carbon backbone of many known natural products like malonomicin, polyketomycin and clavulanic acid could be examined [10–12]. Recent incorporation studies with labelled [1-13C]acetate, [2-13C]acetate, [1,2-13C2]acetate and [13C6]-L-isoleucine on trioxacarcin, a natural compound produced by *Streptomyces bottropensis* DO-45, led to the identification of eight extender units and of 2-methylbutyryl-CoA, a previously unknown starter unit that derives from isoleucine [13]. Here we describe incorporation studies with [1-13C]acetate, [2-13C]acetate, and [1,2-13C2]acetate, as well as [13C5, 15N1]-L-valine, that clarify the origin of all carbon atoms in rishirilide B. These experiments also give insight into the rearrangement process that occurs during rishirilide biosynthesis.

#### **2. Results**

A cosmid encoding the entire rishirilide B gene cluster was transformed into *Streptomyces albus* J1074, yielding *S. albus* J1074::cos4. Stable heterologous production of rishirilide B was observed with *S. albus* J1074::cos4 and therefore all feeding experiments were carried out with this strain. The origin of each carbon atom in rishirilide B was determined by supplementing the production medium with [1-13C]-, [1,2-13C2]-, [2-13C]acetate, and [13C5, 15N1]-L-valine (Figure 2 and Table 1, assignment in Table S1).

**Figure 2.** Labelling positions from feeding experiments using different labelled acetates and L-valine.


**Table 1.** 13C-NMR signals of rishirilide B together with specific incorporations and coupling constants after feeding with [1-13C]acetate (I), [2-13C]acetate (II), [1,2-13C2]acetate (III) and [13C5, 15N1]-L-valine (IV).

(**a**) Relative enrichments were normalized to peak intensity of the C-14 signal; (**b**) Reference signal; (**c**) Supported by the INADEQUATE NMR spectrum; (**d**) Weak enrichment, pattern similar to [1,2-13C2]acetate incorporation through biosynthetic transformation of valine.

Incorporation studies with [1-13C]acetate show 13C isotopic labelling in positions C-2, C-4, C-5, C-7, C-8a, C-9a, C-10 and C-16 as detected from 13C NMR signal enhancements (Table S2, Figures S1 and S2). Feeding of doubly labelled [1,2-13C2]acetate reveals eight intact acetate units (Table S3, Figures S3–S5). When denoted according to the direction of the proposed biosynthetic pathway, these acetate units correspond to C-11/C-4, C-4a/C-10, C-10a/C-5, C-6/C-7, C-8/C-8a, C-9/C-9a, C-1/C-2 and C-3/C-16. C-17 was proposed to come from methionine, but a feeding experiment with L-[methyl-13C]methionine did not produce a labelled position in rishirilide B (Table S4, Figure S6). After feeding of [2-13C]acetate, enrichment in positions C-1, C-3, C-4a, C-6, C-8, C-9, C-10a, C-11 and C-17 was observed (Table S5, Figures S7 and S8). This implies that an acetate unit affords the C-17 methyl group after decarboxylation. This result also demonstrates that a nonaketide is the precursor to the tricyclic backbone of rishirilide B.
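The bookkeeping behind this conclusion can be checked with a few lines of code. The sketch below only re-uses the positions listed in the text; the variable names are illustrative. It confirms that every intact unit pairs a carbon labelled by [2-13C]acetate with one labelled by [1-13C]acetate, and that the only labelled carbon without a partner is C-17, which gives eight intact units plus one decarboxylated unit, i.e., nine acetates.

```python
# Positions enriched after feeding [1-13C]acetate and [2-13C]acetate (from the text).
from_c1 = {"C-2", "C-4", "C-5", "C-7", "C-8a", "C-9a", "C-10", "C-16"}
from_c2 = {"C-1", "C-3", "C-4a", "C-6", "C-8", "C-9", "C-10a", "C-11", "C-17"}

# Intact acetate units seen with [1,2-13C2]acetate, written as (C2-derived, C1-derived).
intact_units = [
    ("C-11", "C-4"), ("C-4a", "C-10"), ("C-10a", "C-5"), ("C-6", "C-7"),
    ("C-8", "C-8a"), ("C-9", "C-9a"), ("C-1", "C-2"), ("C-3", "C-16"),
]

# Each intact unit should combine one C2-derived and one C1-derived carbon.
units_consistent = all(a in from_c2 and b in from_c1 for a, b in intact_units)

# Labelled carbons that are not part of any intact unit.
paired = {carbon for unit in intact_units for carbon in unit}
orphans = (from_c1 | from_c2) - paired

# Eight intact units plus one unit that lost its C1 carbon (decarboxylation).
acetate_units = len(intact_units) + len(orphans)
print(units_consistent, sorted(orphans), acetate_units)
```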

The C-12–C-15 moiety, which represents the starter unit of the polyketide, was not affected by any feeding experiment with labelled acetate or methionine, indicating that it derives from a different precursor. However, feeding experiments with [13C5, 15N1]-L-valine led to a mass increase of +4 (*m*/*z*) compared to unlabelled rishirilide B (Figure S9). 13C-NMR spectroscopic analysis likewise showed enrichments at positions C-12, C-13, C-14, and C-15 (Table S6, Figures S10–S13). Both MS and NMR data are consistent with a starter unit that is a valine-derived isobutyryl-CoA.
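The +4 shift follows from simple carbon counting, assuming the route stated in the introduction (transamination removes the labelled nitrogen, and decarboxylation removes one of the five labelled carbons). A minimal sketch of this arithmetic, using the standard 13C and 12C isotope masses and illustrative variable names:

```python
# Mass difference between 13C and 12C in Da (standard isotope masses).
DELTA_13C = 13.0033548 - 12.0

valine_carbons = 5        # all five carbons are 13C in [13C5, 15N1]-L-valine
lost_on_decarboxylation = 1
nitrogen_retained = 0     # the 15N label is removed by transamination

retained_13c = valine_carbons - lost_on_decarboxylation
nominal_shift = retained_13c + nitrogen_retained
exact_shift = retained_13c * DELTA_13C

print(f"nominal shift: +{nominal_shift}, exact shift: +{exact_shift:.4f} Da")
```

The four retained carbons correspond to the four enriched positions C-12 to C-15.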

#### **3. Discussion**

The biosynthetic origin of all carbon atoms of rishirilide B has been determined through labelled precursor feeding experiments. The tricyclic skeleton shows the expected isotopic labelling pattern that arises from a nonaketide that adopts an S-mode folding prior to condensation and aromatization [14]. Decarboxylation of one acetate shortens the polyketide, resulting in a residual methyl group. The starter unit arises from isobutyryl-CoA, which in turn is derived from valine. Interestingly, when feeding with [13C5, 15N1]-L-valine was performed, we observed strong enrichment of C-12–C-15 but also a weak enrichment pattern in other carbons, similar to [1,2-13C2]acetate incorporation. Although the labelled precursor was fed in several pulses, some L-valine was evidently degraded to acetate.

The labelling pattern arising from acetate feeding indicates that a rearrangement occurs during rishirilide B biosynthesis; otherwise C-17 and C-3/C-16 could not show the detected isotopic enrichments. We propose that a Baeyer-Villiger type oxidation followed by hydrolytic ring opening and an aldol condensation would produce the labelling pattern detected in our feeding experiments (Figure 3). A similar sequence of Baeyer-Villiger oxidation and hydrolysis is proposed for the conversion of questin to desmethylsulochrin [15]. Likewise, Baeyer-Villiger oxidations play a role in the biosynthesis of premithramycin B-lactone and murayaquinone [16,17]. Enterocin, produced by *Streptomyces maritimus*, also has an unprecedented carbon skeleton that is derived from an aromatic polyketide biosynthetic pathway. Its caged tricyclic, nonaromatic core is derived from a linear poly-beta-ketide precursor that undergoes a Favorskii-like oxidative rearrangement [18,19].

Further examples at the molecular level show that Baeyer-Villiger rearrangements can be catalyzed by oxygenases, such as the monooxygenases GilOII in gilvocarcin biosynthesis and BexE in BE-7585A biosynthesis [20,21]. Several enzymes, such as the luciferase-like monooxygenase RslO1 and RslO4, a putative monooxygenase of the antibiotic biosynthesis monooxygenase (ABM) superfamily, are potential candidates for involvement in the rearrangement. Experimental confirmation is still pending.

**Figure 3.** Proposed pathway of rishirilide B biosynthesis. (**a**) cyclisation and decarboxylation; (**b**) aromatization, oxidation; (**c**) Baeyer-Villiger oxidation; (**d**) hydrolytic ring opening; (**e**) aldol condensation.

#### **4. Materials and Methods**

#### *4.1. Bacterial Strains and Cultivation*

Experiments were performed in *Streptomyces albus* J1074::cos4, a transformant that harbors a cosmid clone of the entire gene cluster of rishirilide B. *Streptomyces albus* J1074::cos4 was cultivated in TSB medium (CASO bouillon 30 g·L<sup>−1</sup>, Carl Roth, Karlsruhe, Germany) supplemented with appropriate antibiotics and incubated in shake flasks at 28 °C for 24 h.

#### *4.2. Feeding Experiments*

Sodium[1-13C]acetate, sodium[1,2-13C2]acetate and sodium[2-13C]acetate were purchased from Sigma-Aldrich and have 99% 13C atom purity, whereas [13C5, 15N1]-L-valine was purchased from CortecNet (98% 13C, 95% 15N enriched). Feeding experiments with labelled sodium[1-13C]acetate, sodium[1,2-13C2]acetate and sodium[2-13C]acetate were performed in HA media (glucose 4 g·L<sup>−1</sup>, yeast extract 4 g·L<sup>−1</sup>, malt extract 10 g·L<sup>−1</sup>, pH 7.4), whereas experiments with [13C5, 15N1]-L-valine were carried out in DNPM media (Bacto Soytone 7.5 g·L<sup>−1</sup>, dry yeast 5 g·L<sup>−1</sup>, MOPS 21 g·L<sup>−1</sup>, pH 6.8), both supplemented with appropriate antibiotics. The media were inoculated (1% *v*/*v*) with a 24 h old TSB culture of *Streptomyces albus* J1074::cos4. After 20 h of cultivation, determined as the starting point of rishirilide B production, the isotope-labelled precursors were added aseptically to the production media. The feeding was done in 10 pulses, given every six hours with a 12 h overnight break (i.e., three feedings within 24 h). The final concentration for [1-13C]acetate, [1,2-13C2]acetate and [2-13C]acetate was 6.1 mM and for feeding with [13C5, 15N1]-L-valine, 2 mM. Rishirilide B was isolated after 5–6 days, when production reaches its maximum.
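The pulse scheme described above can be written out as a timetable. The sketch below assumes that the first pulse is given at 20 h, that the three daily pulses are spaced 6 h apart, and that the stated final concentration is the cumulative total split equally over the ten pulses; the last two points are interpretations rather than statements from the protocol.

```python
def pulse_times(start_h=20, n_pulses=10, gap_h=6, pulses_per_day=3, day_h=24):
    """Return the culture age (h) at each pulse: three pulses 6 h apart per 24 h."""
    times = []
    for i in range(n_pulses):
        day, slot = divmod(i, pulses_per_day)
        times.append(start_h + day * day_h + slot * gap_h)
    return times


schedule = pulse_times()
print(schedule)  # [20, 26, 32, 44, 50, 56, 68, 74, 80, 92]

# Assumed equal split of the final concentration over all pulses (mM per pulse).
acetate_per_pulse_mM = 6.1 / len(schedule)
valine_per_pulse_mM = 2.0 / len(schedule)
print(round(acetate_per_pulse_mM, 2), round(valine_per_pulse_mM, 2))
```

Under these assumptions the last pulse falls at 92 h, well before the harvest at 5–6 days.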

#### *4.3. Analysis of Rishirilide B by HPLC/MS*

Rishirilide B production was monitored on an HPLC/MS system equipped with an XBridge C18 precolumn (3.5 μm; 20 mm × 4.6 mm) and an XBridge C18 main column (3.5 μm, 100 mm × 4.6 mm) coupled to a DAD UV detector and MS (Agilent, 1100 series, Agilent Technologies, Waldbronn, Germany). The column was run at a flow rate of 0.5 mL·min<sup>−1</sup> beginning with 80% of buffer A (H<sub>2</sub>O + 0.5% acetic acid) and 20% of buffer B (MeCN + 0.5% acetic acid). After 6 min, buffer B was raised to 30% within 1 min followed by an 18 min linear gradient from 30% to 95%. After a delay of 3 min at 95%, the conditions returned to the starting values within 2 min followed by 5 min equilibration. Rishirilide B was detected at 254 nm.
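The analytical gradient described above can be summarized as a timetable of breakpoints. The following sketch encodes it and interpolates the percentage of buffer B at any time point; it assumes that every change between breakpoints, including the 2 min return to starting conditions, is linear. The names `GRADIENT` and `percent_b` are illustrative.

```python
# (time in min, % buffer B) breakpoints read from the method description.
GRADIENT = [
    (0.0, 20.0),   # start: 80% A / 20% B
    (6.0, 20.0),   # isocratic hold
    (7.0, 30.0),   # raised to 30% B within 1 min
    (25.0, 95.0),  # 18 min linear gradient to 95% B
    (28.0, 95.0),  # 3 min hold at 95% B
    (30.0, 20.0),  # return to starting values within 2 min
    (35.0, 20.0),  # 5 min equilibration
]


def percent_b(t_min):
    """Linearly interpolate % buffer B at time t_min (assumes linear ramps)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the programmed run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


run_time_min = GRADIENT[-1][0]
print(run_time_min, percent_b(6.5), percent_b(16.0), percent_b(29.0))
```

The breakpoints add up to a total run time of 35 min per injection.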

#### *4.4. Isolation and Purification of Rishirilide B*

Rishirilide B was isolated from 2 L of culture broth for all feeding experiments. After adjustment to pH 4, the culture broth was centrifuged and extracted with EtOAc. The crude extract was fractionated with increasing methanol content (10% increments) by solid phase extraction (Oasis HLB 35 cc (6 g) LP Extraction Cartridge, Waters GmbH, Eschborn, Germany). The 70% and 80% methanol fractions containing rishirilide B were combined and further purified by semipreparative HPLC using a Zorbax SB-C18 precolumn (5 μm, 9.4 mm × 20 mm) and a Zorbax SB-C18 main column (5 μm, 9.4 mm × 150 mm) coupled to a PDA detector. The column was eluted with buffer A and buffer B at a flow rate of 0.5 mL·min<sup>−1</sup>.

The gradients for purification of rishirilide B from acetate feeding and L-valine feeding experiments were slightly different. The starting conditions for purification of rishirilide B from acetate feeding experiments were 50% buffer A and 50% buffer B held for 4 min followed by a 7 min linear gradient to 95% buffer B, where rishirilide B was collected. With a delay of 1 min at 95% buffer B, the conditions were changed to starting conditions and maintained for 15 min.

The starting conditions for purification of rishirilide B from L-valine feeding experiment were 55% buffer A and 45% buffer B. After 3 min the concentrations were changed to 80% buffer B over a 6 min linear gradient. After 9 min, buffer B was quickly increased to 95%. With a delay of 3 min under these conditions, buffer B was lowered to starting conditions at minute 12, and the column was washed for a further 3 min.

Finally, the samples were further purified on a column (39.5 cm × 1.4 cm) packed with Sephadex® LH20 (GE Healthcare GmbH, Solingen, Germany) with MeOH as solvent. Flow rate was determined by gravity. The fractions were analyzed by HPLC-MS and rishirilide B containing fractions were combined and evaporated to dryness.

#### *4.5. NMR Analysis of Rishirilide B*

NMR spectra were recorded using a Varian VNMR-S 600 equipped with a Nalorac 3 mm broadband probe. Spectra were recorded in 150 μL DMSO-d<sub>6</sub> at 35 °C. Solvent signals were used as internal standard (DMSO-d<sub>6</sub>: δ<sub>H</sub> = 2.50 ppm, δ<sub>C</sub> = 39.5 ppm). Calculation of enrichment and specific enrichment was done according to Scott et al. [22].

#### **5. Conclusions**

Rishirilide B biosynthesis was investigated using feeding experiments with different isotopically labelled acetates and L-valine. The carbon skeleton of rishirilide B is assembled from nine acetate units and one isobutyrate (derived from valine). One acetate undergoes decarboxylation to afford a methyl group. The pattern of the labelling positions from feeding experiments with labelled acetate reveals that a rearrangement of the carbon skeleton occurs during rishirilide B biosynthesis. Future studies involving gene knockout experiments and isolation of reaction intermediates will provide insight into the enzymes and mechanisms involved in this rearrangement.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/1/20/s1; Table S1: NMR data of rishirilide B (600/150 MHz, DMSO-d6, 35 °C); Table S2: Calculation of enrichment and specific enrichment for rishirilide B from feeding experiment with [1-13C]acetate; Figure S1: Labelling positions of rishirilide B after feeding experiment with [1-13C]acetate; Figure S2: 13C NMR spectra (150 MHz, DMSO-d6, 35 °C) of rishirilide B after feeding of [1-13C]acetate in comparison to rishirilide B at natural abundance; Table S3: Table of NMR data of rishirilide B from feeding experiment with [1,2-13C2]acetate; Figure S3: Labelling positions of rishirilide B after feeding experiment with [1,2-13C2]acetate; Figure S4: 13C NMR spectra (150 MHz, DMSO-d6, 35 °C) of rishirilide B after feeding of [1,2-13C2]acetate in comparison to rishirilide B at natural abundance; Figure S5: INADEQUATE spectrum (150 MHz, DMSO-d6, 35 °C) of rishirilide B after feeding of [1,2-13C2]acetate; Table S4: Results of feeding experiment with L-[methyl-13C]methionine; Figure S6: 13C NMR spectra (150 MHz, DMSO-d6, 35 °C) of rishirilide B from feeding experiment with L-[methyl-13C]methionine in comparison to rishirilide B at natural abundance; Table S5: Calculation of enrichment and specific enrichment for rishirilide B from feeding experiment with [2-13C]acetate; Figure S7: Labelling positions of rishirilide B after feeding experiment with [2-13C]acetate; Figure S8: 13C NMR spectra (150 MHz, DMSO-d6, 35 °C) of rishirilide B after feeding of [2-13C]acetate in comparison to rishirilide B at natural abundance; Figure S9: Mass spectrum of rishirilide B from feeding experiment with [13C5, 15N1]-L-valine; Table S6: Table of NMR data of rishirilide B from feeding experiment with [13C5, 15N1]-L-valine; Figure S10: Labelling positions of rishirilide B after feeding experiment with [13C5, 15N1]-L-valine; Figure S11: 13C NMR spectra (150 MHz, DMSO-d6, 35 °C) of rishirilide B from feeding experiment with [13C5, 15N1]-L-valine in comparison to rishirilide B at natural abundance; Figure S12: INADEQUATE spectrum (150 MHz, DMSO-d6, 35 °C) of rishirilide B from feeding experiment with [13C5, 15N1]-L-valine; Figure S13: Expansion of the INADEQUATE spectrum (150 MHz, DMSO-d6, 35 °C) of rishirilide B from feeding experiment with [13C5, 15N1]-L-valine.

**Acknowledgments:** This work was supported by the Deutsche Forschungsgemeinschaft (RTG 2202), granted to Andreas Bechthold. We thank David L. Zechel, Queen's University, Kingston, Ontario, for reading the manuscript.

**Author Contributions:** Andreas Bechthold and Thomas Paululat conceived and designed the experiments. Philipp Schwarzer performed the feeding experiment with labelled valine and isolation of the product. Julia Wunsch-Palasis performed the feeding experiment and product isolation with the other precursors. Thomas Paululat measured the NMR spectra and analysed the data. Philipp Schwarzer, Andreas Bechthold and Thomas Paululat wrote the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Diversification of Secondary Metabolite Biosynthetic Gene Clusters Coincides with Lineage Divergence in** *Streptomyces*

#### **Mallory J. Choudoir, Charles Pepe-Ranney and Daniel H. Buckley \***

School of Integrative Plant Science, Bradfield Hall 705, Cornell University, Ithaca, NY 14853, USA; mjchoudoir@gmail.com (M.J.C.); chuck.peperanney@gmail.com (C.P.-R.) **\*** Correspondence: dbuckley@cornell.edu; Tel.: +1-607-255-1716

Received: 11 January 2018; Accepted: 7 February 2018; Published: 13 February 2018

**Abstract:** We have identified *Streptomyces* sister-taxa which share a recent common ancestor and nearly identical small subunit (SSU) rRNA gene sequences, but inhabit distinct geographic ranges demarcated by latitude and have sufficient genomic divergence to represent distinct species. Here, we explore the evolutionary dynamics of secondary metabolite biosynthetic gene clusters (SMGCs) following lineage divergence of these sister-taxa. These sister-taxa strains contained 310 distinct SMGCs belonging to 22 different gene cluster classes. While there was broad conservation of these 22 gene cluster classes among the genomes analyzed, each individual genome harbored a different number of gene clusters within each class. A total of nine SMGCs were conserved across nearly all strains, but the majority (57%) of SMGCs were strain-specific. We show that while each individual genome has a unique combination of SMGCs, this diversity displays lineage-level modularity. Overall, the northern-derived (NDR) clade had more SMGCs than the southern-derived (SDR) clade (40.7 ± 3.9 and 33.8 ± 3.9, mean and S.D., respectively). This difference in SMGC content corresponded with differences in the number of predicted open reading frames (ORFs) per genome (7775 ± 196 and 7093 ± 205, mean and S.D., respectively) such that the ratio of SMGC:ORF did not differ between sister-taxa genomes. We show that changes in SMGC diversity between the sister-taxa were driven primarily by gene acquisition and deletion events, and these changes were associated with an overall change in genome size which accompanied lineage divergence.

**Keywords:** *Streptomyces*; biogeography; comparative genomics; diversification; secondary metabolite biosynthetic gene clusters; SMGC; natural products

#### **1. Introduction**

Microbial secondary metabolism encapsulates a remarkable diversity of natural products with an extensive range of biological activities. Secondary metabolites differ from primary metabolites in that they are not involved in essential catabolic and anabolic activities required for normal growth and reproduction, but may contribute significantly to an individual's fitness [1]. While primary metabolic pathways are often conserved deeply within a phylogeny, secondary metabolic pathways are more divergent, often being species or strain-specific, with conservation sometimes observed among closely related species and genera [2]. This phylogenetic pattern suggests an adaptive role for secondary metabolites, and if secondary metabolism pathways provide adaptive benefits, their evolution might drive or reinforce evolutionary processes that result in microbial diversification and speciation [3].

The values of natural products to humanity are widely recognized, yet because most research has focused on their discovery and human-centric relevance, we are still far from understanding their biological role in natural systems. The discovery and application of antibiotics revolutionized medicine in the 1940's, sparking the "golden age" of antibiotics between 1950 and 1960, during which time approximately half of the microbial-derived drugs we use today were discovered [4]. Presently, thousands of bioactive compounds with antibacterial, antifungal, and antitumor activities are cataloged [5,6], and yet these represent only a fraction of actual natural product diversity [7]. In addition, microbial populations in situ are exposed to natural products at concentrations far below the lethal clinical dose, and hence these compounds may serve different functions in the environment from those observed during therapeutic application. We know that secondary metabolites can mediate diverse biotic interactions including mutualistic interactions, competition for nutrients, metal scavenging, and plant-microbe and insect-microbe symbioses [8–10], which can all have profound impacts on microbial fitness. It is clear that natural products must have considerable impacts on microbial ecology and evolution and that understanding the biology and evolutionary history of natural products will enhance our ability to use these agents therapeutically.

Soil-dwelling actinomycetes are the predominant source of microbial-derived therapeutic natural products, and the majority of described bioactive compounds originate from the genus *Streptomyces* [6,7]. The *Streptomyces* life cycle resembles that of many fungi, consisting of filamentous growth, formation of mycelia, and production of aerial hyphae and spores. Indeed, *Streptomyces* were thought to be an intermediary between bacteria and fungi until as recently as the 1950's [11]. However, *Streptomyces* are Gram-positive *Actinobacteria* with long linear chromosomes that have a high G+C content [12]. Traditionally, *Streptomyces* species are often known to produce several secondary metabolites when grown in culture. Genome sequencing, however, reveals that *Streptomyces* contain an enormous reservoir of "cryptic secondary metabolites" which are not expressed under standard laboratory conditions [13]. For instance, while *Streptomyces coelicolor* A3(2) was known to produce several well characterized secondary metabolites, genome sequencing discovered that it actually contained >20 biosynthetic gene clusters not expressed when grown in culture [14]. Genes within secondary metabolite biosynthetic gene clusters (SMGCs) are co-localized as operons within discrete genomic regions. SMGCs have recognizable functional domains, so SMGCs are readily predicted using bioinformatics [15]. Phylogenetic conservation of SMGCs between closely related microbes suggests that these secondary metabolites may have ecological roles which facilitate microbial diversification [16–18].

The evolutionary and ecological processes that govern SMGC diversity remain largely unexplored. The richness of SMGCs within soils is linked to both edaphic and biotic factors [19,20]. For example, the production of antibiotics by *Streptomyces* isolated from prairie soils is highly variable between strains and correlates poorly with 16S rRNA gene phylogeny, suggesting a role of selection acting at small spatial scales [21,22]. Conversely, at larger spatial scales, *Streptomyces* SMGC composition varies in relation to both spatial distance and environmental dissimilarity [23]. Furthermore, evidence within *Streptomyces* for endemism at inter-continental and regional geographic scales [16,24,25] suggests limits to dispersal at large spatial scales. These data indicate that both adaptive and neutral processes contribute to patterns of SMGC biogeography.

Microbial biogeography is readily explored with geographically explicit microbial culture collections (reviewed in [26]), and the genus *Streptomyces* is an ideal model system to evaluate the influence of SMGC dynamics on patterns of diversification. We previously assembled a culture collection of *Streptomyces* from sites spanning the United States, and we observed evidence for dispersal limitation, as well as a latitudinal gradient of species richness and intraspecific nucleotide diversity [27,28]. From this culture collection, we have identified *Streptomyces* sister-taxa that have geographic ranges delimited by latitude and have patterns of gene flow and genomic diversity consistent with their diversification from a recent common ancestor [28,29]. Here, we evaluate changes in SMGC diversity between these *Streptomyces* sister-taxa to explore SMGC evolutionary dynamics during the divergence of *Streptomyces* species.

#### **2. Results and Discussion**

#### *2.1. Genomic Divergence between Streptomyces Sister-Taxa*

We used comparative genomics to analyze patterns of genomic diversity and SMGC content in 24 *Streptomyces* representing sister-taxa and related strains. These strains were identified through a phylogenetic analysis of a *Streptomyces* culture collection [28] generated from soils of ecologically similar grassland sites spanning 6000 km across the continental United States (Figure 1, Table S1). The sister-taxa, which we have designated the northern-derived (NDR) and southern-derived (SDR) clades, were defined by their geographic range and genomic similarity (Figure 1). Each clade contains ten isolates, and an additional four genomes represent intermediate (INT) taxa.

**Figure 1.** The northern-derived (NDR) and southern-derived (SDR) clades are closely related sister-taxa and yet were isolated from soils of different latitude. The un-rooted tree was constructed from multiple whole genome alignments with maximum likelihood and a GTRGAMMA model of evolution. Scale bar represents nucleotide substitutions per site. Colored branches depict the northern-derived (NDR) and southern-derived (SDR) clades. Strain names reflect the sample site they were isolated from (Table S1). Genome NBRC 13350 is the publicly available type strain *Streptomyces griseus* subsp. *griseus* NBRC 13350. Sample locations are shown in the right panel and labeled with the site code. Circles are colored to reflect the geographic distribution of clades. (Figure modified from [29]).

Assembled genomes are 7.5–9.1 Mb with a G+C content of 71.4–72.5% and 6776–8078 predicted open reading frames (ORFs) (Table S2). The core gene content across all 24 strains is comprised of 3234 orthologous genes (representing 2778 single-copy genes), with a total of 22,054 genes in the overall pan-genome. All isolates affiliate taxonomically with the *Streptomyces griseus* species cluster [30] and share >90% average nucleotide identity (ANI) with the type strain *Streptomyces griseus* subsp. *griseus* NBRC 13350 (Figure 1).

The NDR core genome is comprised of 4234 genes, and the SDR core genome is comprised of 4400 genes. The NDR and SDR clades share a recent phylogenetic ancestor and have nearly identical 16S rRNA genes (inter-lineage nucleotide dissimilarity of 0–0.21% between strains). Strains within each clade have a whole genome ANI value ranging from 95.6% to 99.9%, while the ANI between strains of NDR and SDR ranges from 92.6% to 93.3% (Figure 1). Distinct microbial species are typically distinguished by ANI in the range of 95–96% [31]. Comparative population genomics reveals signatures of genomic differentiation and gene flow limitation between NDR and SDR consistent with expectations of allopatric diversification [29]. Collectively, these results indicate that NDR and SDR clades represent distinct microbial species which have recently diverged from a common ancestor.
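The species-level reasoning above can be illustrated with a short sketch. It applies a 95% ANI cutoff (the lower bound of the 95–96% range cited in the text) to the ANI ranges reported for within-clade and between-clade comparisons. The function name, the dictionary labels, and the choice of 95% rather than 96% are assumptions made for illustration and are not part of the authors' workflow.

```python
# Illustrative sketch: classify genome comparisons by average nucleotide identity (ANI).
# The cutoff is an assumption taken from the lower bound of the 95-96% range in the text.
SPECIES_ANI_CUTOFF = 95.0  # percent


def same_species(ani_percent, cutoff=SPECIES_ANI_CUTOFF):
    """Return True when a pairwise ANI value meets the species-level cutoff."""
    return ani_percent >= cutoff


# ANI range endpoints reported in the text (percent); labels are hypothetical.
reported = {
    "within clade (min)": 95.6,
    "within clade (max)": 99.9,
    "NDR vs SDR (min)": 92.6,
    "NDR vs SDR (max)": 93.3,
}

calls = {label: same_species(ani) for label, ani in reported.items()}
for label, call in calls.items():
    verdict = "same species" if call else "distinct species"
    print(f"{label}: {reported[label]:.1f}% -> {verdict}")
```

Under this cutoff, every within-clade comparison falls on the same-species side and every NDR versus SDR comparison falls on the distinct-species side, which matches the interpretation given in the text.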

#### *2.2. Secondary Metabolite Biosynthetic Gene Cluster (SMGC) Identification and Classification*

We used antiSMASH [32] to identify SMGCs in the genomes of our *Streptomyces* sister-taxa. To assess the novelty of these SMGCs, we utilized antiSMASH's downstream annotation pipeline, which annotates SMGCs based on similarity to genes and pathways present within the Minimum Information about a Biosynthetic Gene cluster (MIBiG) database. The antiSMASH pipeline annotated 120 SMGCs across the 24 strains (Table S3). Each genome had between 28 and 47 SMGCs which ranged in size from 1 to 137 Kb (20.9 ± 15.7 Kb, mean ± S.D.) (Figure 2). This range in SMGC content is consistent with the results obtained from previous genomic surveys of *Streptomyces* [14,33–36]. The NDR clade has a greater number of SMGCs per genome than the SDR clade (40.7 ± 3.9 and 33.8 ± 3.9, mean ± S.D., respectively; *t*-test, *p* < 0.001; Figure 2a). The NDR clade also has a greater number of ORFs per genome than the SDR clade (7775 ± 196 and 7093 ± 205, mean ± S.D., respectively; *t*-test, *p* < 0.001; Table S2). Correspondingly, NDR strains also have larger genomes than SDR strains (8.7 ± 0.25 Mb and 7.9 ± 0.21 Mb, mean ± S.D., respectively; *t*-test, *p* < 0.001; Table S2). We observed a strong positive correlation between genome size and number of SMGCs across all genomes examined (Pearson's *r* = 0.66, *p* < 0.001).
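As a rough check on the clade comparisons above, the *t* statistics can be recomputed from the reported means and standard deviations. The sketch below assumes a pooled-variance (Student's) two-sample *t*-test with ten genomes per clade; the text does not state which *t*-test variant was used, so this is an assumption, as are the function and variable names. The critical value is taken from standard *t* tables.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, computed from summary statistics.

    Returns the t statistic and the degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


N_PER_CLADE = 10  # each clade contains ten isolates

# SMGCs per genome: NDR 40.7 +/- 3.9, SDR 33.8 +/- 3.9 (mean +/- S.D.)
t_smgc, df = pooled_t(40.7, 3.9, N_PER_CLADE, 33.8, 3.9, N_PER_CLADE)

# ORFs per genome: NDR 7775 +/- 196, SDR 7093 +/- 205 (mean +/- S.D.)
t_orf, _ = pooled_t(7775, 196, N_PER_CLADE, 7093, 205, N_PER_CLADE)

# Two-sided critical value for p = 0.001 at 18 degrees of freedom (standard t table).
T_CRIT_P001_DF18 = 3.922

print(f"SMGCs: t = {t_smgc:.2f}, df = {df}")
print(f"ORFs:  t = {t_orf:.2f}, df = {df}")
print("SMGC difference significant at p < 0.001:", abs(t_smgc) > T_CRIT_P001_DF18)
```

Both statistics exceed the tabulated critical value under these assumptions, which is in line with the *p* < 0.001 values reported in the text. With per-genome counts available (Table S2), the same test could be run on the raw data instead of the summary statistics.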

**Figure 2.** NDR strains have more secondary metabolite biosynthetic gene clusters (SMGCs) than SDR strains (*t*-test, *p* < 0.001). (**a**) Bars indicate the number of SMGCs identified in each genome and are colored according to clade affiliation, and genome names reflect the site of isolation as identified in Table S1; (**b**) Kernel density plot shows the distribution of SMGC length (bp).

Only 21% (*n* = 25) of the MIBiG-annotated SMGCs represent well-characterized biosynthetic gene clusters (in which ≥70% of the genes in a SMGC show similarity to genes within the most similar known cluster from the MIBiG database) (Table S3). In addition, each genome harbors five to 25 potentially novel SMGCs with low similarity to biosynthetic pathways within the MIBiG database. These findings indicate that the diversity of *Streptomyces* SMGCs found within public databases remains low and that a vast reservoir of *Streptomyces* SMGC diversity remains to be characterized within natural populations.

The SMGCs predicted by antiSMASH within our *Streptomyces* sister-taxa encompass 22 classes of natural products. Most of these classes, including bacteriocin, butyrolactones, ectoine, lantipeptide, melanin, non-ribosomal peptide synthetases (NRPS), siderophore, polyketide synthases (PKS), and terpene gene clusters, are widely conserved at the genus level [2]. The most abundant SMGC classes in our genomes are NRPS and terpene clusters (Figure 3, Table S3). Many of the predicted gene clusters are NRPS-PKS hybrids (Table S3). Given the similar structure and activity between NRPS and PKS [37], it is unsurprising that hybrid NRPS-PKS clusters are commonly detected in *Streptomyces* genomes [38,39]. Most SMGC classes are present in both NDR and SDR clades, but the relative abundance of each class differs between genomes, as well as between clades (Figure 3). We observe significant enrichment of melanin and ladderane gene clusters in NDR compared to SDR (*t*-test with Bonferroni correction, *p* < 0.002). Additionally, NDR genomes harbor linaridin gene clusters, which are entirely absent from SDR genomes (Figure 3) but are found in the type strain *Streptomyces griseus* subsp. *griseus* NBRC 13350 [40]. Interestingly, antiSMASH did not identify aminoglycoside biosynthetic clusters in our *Streptomyces* isolates, and all of these genomes presumably lack genes for streptomycin biosynthesis (Figure 3). Schatz and Waksman reported the isolation of streptomycin from *Streptomyces griseus* in 1944, and this was the first antibiotic used to successfully combat tuberculosis [41]. However, not all *Streptomyces griseus* isolates produce streptomycin [42,43].

**Figure 3.** A total of 22 SMGC classes were observed in NDR and SDR genomes by antiSMASH [32]. The tree reflects phylogenetic relationships between *Streptomyces* sister-taxa genomes and was constructed from multiple whole genome alignments (see Figure 1). Scale bar represents nucleotide substitutions per site. Tree branches are colored according to clade affiliation. Bars depict the number of gene clusters belonging to each class for each genome. Colors illustrate gene cluster class as provided by the legend. Asterisks note gene cluster classes that are significantly enriched between clades (*t*-test and Bonferroni correction for multiple comparisons, *p* < 0.002).

#### *2.3. Core and Accessory SMGCs of Streptomyces Sister-Taxa*

Comparative population genomics and pan-genome analyses can offer powerful insights into the processes underlying species divergence [44,45]. Given that many of our SMGCs have low similarity to biosynthetic pathways in public databases, we determined shared orthologous SMGCs within our genomes using an annotation-independent approach that compares SMGCs based on similarity in nucleotide composition and gene content (see Materials and Methods). This approach identified 310 non-redundant SMGCs within the pan-genome of all 24 strains (Figures 4 and 5); this number is greater than the number of MIBiG-annotated SMGCs because it classified both known and unknown pathways into distinct non-redundant gene clusters. Only two SMGCs are conserved in all 24 genomes, an ectoine gene cluster and the siderophore desferrioxamine B (Figure 6). Desferrioxamine siderophores are commonly observed in other species of *Streptomyces* and actinomycetes [46,47].

We observed that core SMGC content increased with phylogenetic similarity, but that more than half of the SMGCs were strain-specific (Figures 4 and 5). NDR and SDR shared nine core SMGCs (present in ≥80% of genomes), while NDR strains shared 11 core SMGCs (nine in the conserved core and two in the NDR-specific core), and SDR strains shared 15 core SMGCs (nine in the conserved core and six in the SDR-specific core) (Figure 6). In addition, there were 158 accessory SMGCs (present in <80% genomes) in NDR and 114 accessory SMGCs in SDR (Figure 4). Most SMGCs were observed at low to intermediate frequencies (Figures 4 and 5), and 177 SMGCs were strain-specific, with each *Streptomyces* genome harboring one to 19 exclusive SMGCs. These estimates are generally consistent with previous observations that indicate each different *Streptomyces* species will harbor a distinct repertoire of natural product pathways [17]. For example, Seipke [36] estimated 18 core SMGCs for six *Streptomyces albus* isolates. However, despite the phylogenetic conservation of core SMGC content, even *Streptomyces* with identical 16S rRNA gene sequences can have distinct secondary metabolite profiles [48], indicating that SMGC content exhibits significant strain to strain variability within a species. Thus, we propose that core SMGCs reflect the shared evolutionary history of *Streptomyces* genomes, while patterns of the accessory SMGC carriage suggest lineage and strain-specific processes across more recent evolutionary time scales.
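The core and accessory definitions used above (core: present in ≥80% of genomes; accessory: present in <80%; strain-specific: found in a single genome) can be expressed as a small classification routine. The sketch below is illustrative only: the function name, the toy genome and cluster identifiers, and the presence/absence data are hypothetical, and only the 80% threshold and the 177-of-310 count come from the text.

```python
from collections import Counter

CORE_FRACTION = 0.8  # "core" = present in >= 80% of genomes, as defined in the text


def classify_clusters(presence, n_genomes, core_fraction=CORE_FRACTION):
    """Label each SMGC as 'core', 'accessory', or 'strain-specific'.

    `presence` maps a cluster id to the set of genomes that carry it.
    Strain-specific clusters (found in exactly one genome) belong to the
    accessory pool but receive their own label here.
    """
    labels = {}
    for cluster, genomes in presence.items():
        frequency = len(genomes) / n_genomes
        if frequency >= core_fraction:
            labels[cluster] = "core"
        elif len(genomes) == 1:
            labels[cluster] = "strain-specific"
        else:
            labels[cluster] = "accessory"
    return labels


# Hypothetical toy data: five genomes (g1..g5) and four clusters.
toy_presence = {
    "ectoine": {"g1", "g2", "g3", "g4", "g5"},
    "melanin": {"g1", "g2", "g3", "g4"},
    "t1pks_x": {"g2", "g5"},
    "nrps_y": {"g3"},
}

labels = classify_clusters(toy_presence, n_genomes=5)
summary = Counter(labels.values())
print(labels)
print(dict(summary))

# Share of strain-specific SMGCs reported in the text: 177 of 310.
strain_specific_share = 177 / 310
print(f"strain-specific share: {strain_specific_share:.0%}")
```

Applied per clade (ten genomes each) rather than across all 24 genomes, the same routine would separate the conserved core from the NDR-specific and SDR-specific cores described in the text.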

**Figure 4.** The frequency distribution of SMGCs across strains shows that most SMGCs are strain-specific and fewer are species-specific. Results are shown both for NDR and SDR (**a**) and for all 24 genomes (**b**). Non-redundant orthologous SMGCs were defined using our annotation-independent approach (see Materials and Methods).

**Figure 5.** We identified 310 non-redundant distinct SMGCs using our annotation-independent gene clustering approach (see Materials and Methods). Each point represents a unique SMGC from a single genome, and colors correspond to clade affiliation. SMGCs with a similar gene composition are clustered spatially, and cluster membership is depicted with polygons. The same data is presented in a different network diagram in Figure S1.

#### *2.4. Evolutionary Dynamics of Core and Accessory SMGCs*

To address potential lineage-specific mechanisms of divergence, we next evaluated the evolutionary dynamics of SMGCs. Most shared SMGCs occur within rather than between clades (Figures 5 and S1). A total of 78 SMGCs are shared among two or more NDR genomes, and 55 are shared among SDR genomes, but only 37 SMGCs are shared across clade boundaries (i.e., found in both NDR and SDR genomes). Furthermore, network analysis reveals unique patterns of SMGC sharing that manifest as nodes of connectivity within clades (Figure S1). This network indicates that there is a core set of SMGC content which links NDR and SDR and which must be ancestral, that there is a clade-specific core set of SMGCs which link the strains of each clade together based on shared SMGC content, and that there are a large number of strain-specific SMGCs (Figures 5 and S1).

**Figure 6.** A total of nine core (i.e., conserved in ≥80% of genomes) SMGCs were found in both NDR and SDR. The NDR clade had 11 core SMGCs and the SDR clade had 15 core SMGCs. The tree reflects phylogenetic relationships between *Streptomyces* sister-taxa genomes and was constructed from multiple whole genome alignments (see Figure 1). Scale bar represents nucleotide substitutions per site. Tree branches are colored according to clade affiliation. Core orthologous SMGCs (depicted by colored circles) were determined using the antiSMASH [32] MIBiG annotation pipeline or were defined using our annotation-independent approach (see Materials and Methods). Colors correspond to SMGC class (see legend), and natural product annotations are labeled if available.

Differences in gene content between closely related microbes ultimately result from gene gain and loss events [49–51]. Although deletion bias is strong in bacterial genomes [52], gene acquisitions can drive rapid genome innovation and evolution [53]. Gene clusters are often acquired through horizontal gene transfer leading to the formation of new operons in bacterial genomes [54], and many SMGCs in actinomycetes are believed to be the result of horizontal gene transfer [16,18,33,55]. Parsimony predicts that low frequency and strain-specific genes are likely the result of a recent acquisition, while high frequency "near core" genes are the likely result of recent deletion events [56]. Hence, we are able to infer SMGC gain and loss dynamics in our *Streptomyces* sister-taxa from SMGC frequency distributions (Figure 4).

The majority of SMGCs observed within the sister-clades occurred in only one or a few strains, and this suggests that gene acquisition is a major force that drives the diversity of SMGC pathways in *Streptomyces*. However, each clade has a distinct set of core and accessory SMGCs (Figures 3, 5, 6 and S1), and this suggests that SMGC composition (Figure 7) may underlie ecological traits that promote or reinforce lineage divergence. For example, nearly all genomes within the NDR clade (with the exception of rh34) harbor a melanin gene cluster which is absent from both the intermediate (INT) and SDR genomes, suggesting that horizontal gene transfer of the melanin gene cluster into the immediate ancestor of NDR accompanied lineage divergence (Figures 6 and 7). Overall, NDR has more low frequency SMGCs (present in one to three strains) than SDR (139 and 96, respectively) (Figure 4). This result suggests a greater rate of gene acquisition in NDR than in SDR and is consistent with the observation that NDR has more SMGCs (Figure 2) and larger genomes overall than SDR. While this difference in gene content is potentially adaptive, it could also be explained as a consequence of neutral demographic processes such as genome surfing (reviewed in [57]). However, the distribution of SMGC frequencies does not differ significantly between clades (Kolmogorov-Smirnov test, *p* = 0.4). Hence, while it seems clear that gene acquisition is a major driver of SMGC biodiversity, the role of gene acquisition in driving lineage divergence remains unclear.

**Figure 7.** Gene content of core SMGCs varies within and between clades as a result of gene acquisition and deletion events. Panels depict the gene content (i.e., genetic architecture) of core SMGCs (i.e., conserved in ≥80% of genomes), the NDR-specific SMGC core, and the SDR-specific SMGC core. Black bars within the panels represent orthologous genes. The tree reflects phylogenetic relationships between *Streptomyces* sister-taxa genomes and was constructed from multiple whole genome alignments (see Figure 1). Scale bar represents nucleotide substitutions per site. Panel colors correspond to SMGC class (see legend). Panels are labeled with the SMGC cluster membership (see Table S3) defined using our annotation-independent approach (see Materials and Methods).

We also see evidence that NDR has undergone the deletion of SMGC-associated genes inherited from the common ancestor of NDR and SDR. For example, the SRO15-2005 lassopeptide gene cluster is conserved in SDR and found in INT but absent from NDR, suggesting that deletion of this lassopeptide accompanied NDR divergence (Figures 6 and 7). We also find that core SMGC gene loss is more common in NDR than SDR (strain-level deletions occur in six out of nine core gene clusters within NDR and two out of nine core gene clusters in SDR) (Figure 6). Similarly, we can observe SDR species-specific core gene clusters (AmfS, coelichelin, a T1PKS, and a terpene) that are found in only 70% (i.e., near core) of NDR strains (Figure 6). This pattern suggests that these SMGCs were present in the common ancestor of the two clades and subsequently deleted from NDR isolates. In addition, the butyrolactone operon (cluster 3) is comprised of more genes in SDR than in NDR, and this likely indicates active gene loss within this pathway for NDR strains (Figure 7).

Taken together, these results suggest that the sister-clades are under different evolutionary pressures which drive dissimilarity in SMGC composition. NDR genomes have increased in size relative to their ancestors suggesting an overall increase in the rate of gene acquisition via horizontal gene exchange, and this increase in gene acquisition has resulted in an increase in strain-specific SMGC content in NDR. In addition, the presence of NDR-specific core SMGCs (e.g., melanin gene cluster) indicates that some horizontally acquired SMGCs have gone to fixation within NDR. At the same time, deletion events in NDR have pruned away SMGCs inherited from ancestral lineages (i.e., those clusters present in SDR and INT). We hypothesize that these changes in SMGC content are likely to have effects on fitness which should act to reinforce lineage divergence either as a result of antagonism or niche differentiation.

#### **3. Materials and Methods**

#### *3.1. Streptomyces Isolation and DNA Extraction*

We built a culture collection of >1000 *Streptomyces* isolated from grassland soils (pH 3.9–7.3) sampled at 0–5 cm from sites across the United States [27]. Pure *Streptomyces* cultures were obtained from air-dried soils on glycerol-arginine agar (pH 8.7) containing antifungals as previously described [58]. Genomic DNA was extracted using a standard phenol/chloroform/isoamyl alcohol protocol from liquid cultures grown in yeast extract-malt extract medium (YEME) with 0.5% glycine [5] for 72 h shaking at 30 °C.

#### *3.2. Whole Genome Sequencing, Assembly, and Annotation*

*Streptomyces* genomic sequencing libraries were prepped with the Nextera DNA Library Preparation Kit (Illumina, San Diego, CA, USA), and draft genomes were generated using the Illumina HiSeq2500 platform (Illumina, San Diego, CA, USA) and paired-end 2 × 100 bp reads at the Cornell University Biotechnology Resource Center (BRC). Quality control and assembly were performed with the A5 pipeline [59], and genomes were annotated using the online RAST Server [60]. Multiple whole genome alignments were obtained with Mugsy [61], and trimAl v1.2 removed poorly aligned regions [62]. Orthologous genes were identified using ITEP [63] with MCL clustering parameters as follows: inflation value = 2.0, cutoff = 0.04, maxbit score. Average nucleotide identity (ANI) was determined using mothur [64]. Genome sequences are available at NCBI under BioProject ID PRJNA401484 accession numbers SAMN07606143–SAMN07606166.

#### *3.3. Phylogenetic Reconstruction*

The phylogenetic relationship between genomes was reconstructed from DNA sequences of multiple whole genome alignments using maximum likelihood (ML) with the generalized time reversible nucleotide substitution model [65] with gamma distributed rate heterogeneity among sites (GTRGAMMA) in RAxML v7.3.0 [66]. Bootstrap support was determined using the RAxML rapid bootstrapping algorithm [67].

#### *3.4. Secondary Metabolite Biosynthetic Gene Cluster (SMGC) Identification*

Secondary metabolite biosynthetic gene clusters (SMGC) were predicted and annotated using the online server antiSMASH 3.0 [32]. We also used an annotation-independent approach to identify SMGCs shared between genomes. For each SMGC identified by antiSMASH, we used Prodigal [68] to call open reading frames (ORFs) and Parasail with default parameters to identify orthologous genes and orthologous gene groups [69]. We used the R package igraph [70] to cluster similar SMGCs, define cluster membership, and thus determine which SMGCs are shared between genomes. Cluster membership was determined based on gene content using a binary (i.e., Jaccard) dissimilarity distance of ≤4.0 generated from an orthologous group presence/absence table. Dissimilarity distances of >4.0 did not result in an appreciable gain in the number of total clusters. The SMGC network was visualized and analyzed with Cytoscape 3.3.0 [71].
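The annotation-independent grouping described above can be sketched as follows: each SMGC is represented by the set of orthologous groups it contains, pairwise binary (Jaccard) dissimilarity is computed from that presence/absence information, and SMGCs closer than a threshold are linked into groups. This is a minimal illustration rather than the authors' pipeline: it uses a plain union-find over connected components in Python instead of the R package igraph, the SMGC and orthologous-group names are hypothetical, and the threshold of 0.5 is an arbitrary illustrative value on the 0 to 1 scale of Jaccard dissimilarity, not the setting used in the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Binary (Jaccard) dissimilarity between two sets of orthologous groups."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_smgcs(gene_content, max_distance):
    """Group SMGCs whose pairwise Jaccard distance is <= max_distance.

    Groups are the connected components of the resulting graph, found with
    a small union-find structure. Returns a list of sets of SMGC names.
    """
    parent = {name: name for name in gene_content}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(gene_content, 2):
        if jaccard_distance(gene_content[a], gene_content[b]) <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for name in gene_content:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda group: sorted(group))


# Hypothetical toy input: three SMGCs described by their orthologous groups.
toy_content = {
    "strainA_c1": {"og1", "og2", "og3", "og4"},
    "strainB_c7": {"og1", "og2", "og3", "og5"},
    "strainC_c2": {"og8", "og9"},
}

groups = cluster_smgcs(toy_content, max_distance=0.5)
for members in groups:
    print(sorted(members))
```

In the toy input, the first two SMGCs share three of five orthologous groups (dissimilarity 0.4) and are placed in one group, while the third shares none and remains on its own, which corresponds to a strain-specific SMGC in the terminology of the text.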

#### **4. Conclusions**

We used comparative genomics to examine SMGC diversity within strains of two closely related *Streptomyces* species that recently diverged from a common ancestor. Our objective was to observe and explore the evolutionary dynamics of SMGCs that accompany evolutionary diversification and to assess SMGC conservation within and between closely related species. It is clear that gene gain and loss events drive major differences in SMGC composition, both within and between species. While both species share conserved core SMGCs, each clade has its own species-specific SMGC core, and the majority of SMGCs were strain-specific. This pattern indicates that these SMGCs, not present in shared ancestors, were acquired recently due to horizontal gene exchange.

In addition, we observe that SMGCs that have been inherited from a shared ancestor can vary considerably in gene content, both due to the acquisition and deletion of individual genes within each gene cluster. We observe SMGC gain and loss dynamics that differ between clades and identify SMGC acquisition and deletion events that correspond to ancestral diversification events. These findings show that SMGC modification is associated with lineage divergence, though whether these changes cause or reinforce divergence directly or are an indirect product of evolutionary divergence remains to be seen. A limitation of the comparative genomics approach is that we cannot assess the ecological activity of a pathway from genome sequence data. It is possible that some (or all) of the strain-specific pathways, if acquired by recent horizontal exchange, may be non-functional. It is also possible that changes in SMGC architecture and gene content could alter pathway functionality and that pathways deemed orthologous on the basis of genetic similarity may have different functions in different strains.

Finally, we can conclude that, while strains within a species will share a core set of SMGCs, the number of accessory SMGCs within a given species can be quite large, with each strain having its own repertoire of strain-specific SMGCs. Furthermore, the majority of these strain-specific SMGCs remain uncharacterized and lack similarity to SMGCs documented in public databases.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/1/12/s1, Figure S1: Each clade has a distinct SMGC network. The network illustrates inter- and intra-clade sharing of SMGC content. Large circles represent the genomes of Streptomyces strains and are labeled with isolate names and colored according to clade affiliation. Smaller circles represent non-redundant distinct SMGCs identified using our annotation-independent approach (see Materials and Methods). Lines connect each SMGC to the strains in which they are found. Network nodes and edges are scaled in proportion to the number of connections and colored according to gene cluster class (see legend). Network is arranged in the organic layout using Cytoscape 3.3.0 [71]. Core SMGCs can be observed as larger central nodes while strain specific and low frequency SMGCs occur around the edges of the graph; Table S1: The 24 Streptomyces genomes were isolated from 11 sites. Isolate names begin with the code of the site from which they were isolated, followed by strain number; Table S2: Genome and assembly characteristics for 24 Streptomyces genomes. The clade affiliations include the northern-derived (NDR), southern-derived (SDR), and intermediate (INT). Sample site of each isolate can be found in Table S1. Values report assembled draft genome size, genome-wide G+C content, the number of predicted open reading frames (ORFs), and the number of predicted secondary metabolite biosynthetic gene clusters (SMGCs) per genome; Table S3: SMGCs are predicted by antiSMASH [32] in our 24 Streptomyces genomes. For each SMGC, columns report the affiliated genome, clade, gene cluster class (hybrids are indicated by hyphens), gene cluster length (bp), natural product annotation provided by antiSMASH, cluster membership (Clust Memb), MIBiG database identification, the portion of genes with similarity to genes within the most similar known cluster from the MIBiG database (% Genes w/Similarity). Cluster membership was determined using our annotation-independent approach (see Materials and Methods). NA indicates information is not available.

**Acknowledgments:** This material is based upon work supported by the National Science Foundation under Grant No. DEB-1456821 awarded to Daniel H. Buckley.

**Author Contributions:** Mallory J. Choudoir and Daniel H. Buckley conceived and designed the study; Mallory J. Choudoir performed the analyses and analyzed the data; Charles Pepe-Ranney contributed to the analyses; Mallory J. Choudoir and Daniel H. Buckley wrote the paper.

**Conflicts of Interest:** The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **Mining Actinomycetes for Novel Antibiotics in the Omics Era: Are We Ready to Exploit This New Paradigm?**

#### **Olga Genilloud**

Fundación MEDINA, Avda Conocimiento 34, 18016 Granada, Spain; olga.genilloud@medinaandalucia.es

Received: 6 August 2018; Accepted: 21 September 2018; Published: 25 September 2018

**Abstract:** The current spread of multi-drug resistance in a number of key pathogens and the lack of therapeutic solutions in development to address most of the emerging, difficult-to-treat infections in the clinic have become major concerns. Microbial natural products represent one of the most important sources for the discovery of potential new antibiotics, and actinomycetes have been one of the most relevant groups of prolific producers of these bioactive compounds. Advances in genome sequencing and bioinformatic tools have yielded a wealth of knowledge on the biosynthesis of these molecules. This has revealed the broad untapped biosynthetic diversity of actinomycetes, with large genomes and the capacity to produce more molecules than previously estimated, opening new opportunities to identify the novel classes of compounds that are waiting to be discovered. Comparative genomics, metabolomics and proteomics and the development of new analysis and genetic engineering tools provide access to the integration of new knowledge and a better understanding of the physiology of actinomycetes and their tight regulation of the production of natural product antibiotics. This new paradigm is fostering the development of new genomics-driven and culture-based strategies, which aim to deliver new chemical classes of antibiotics to be developed for the clinic and to replenish the exhausted pipeline of drugs for fighting the progression of infectious diseases in the near future.

**Keywords:** actinomycetes; antibiotics; secondary metabolism; culture-based approaches; omics

#### **1. Introduction**

Actinomycetes are accepted as one of the most relevant bacterial groups as prolific producers of secondary metabolites (SM). For decades, they were intensively exploited by industrial discovery programs that delivered most of the natural products, which were subsequently used as scaffolds to derive a large number of the antibiotics that are currently in the clinic [1,2]. Despite this past success, the current spread of multi-drug resistance in a number of key pathogens (*Enterococcus faecium*, *Staphylococcus aureus*, *Klebsiella pneumoniae*, *Acinetobacter baumannii*, *Pseudomonas aeruginosa* and *Enterobacter* spp.) and the lack of therapeutic solutions in development to address most of the emerging infections in the clinic that are difficult to treat have become major health concerns and require a new and sustained research effort to respond to the need for new antibiotics [3–5]. The progressive abandonment of the field by large pharmaceutical companies, moving away from the high costs of development and poor incentives as part of a broken economic model, has left academic groups and small biotech companies as the only players in the discovery field. These research teams are confronted with the major challenge of identifying novel antibiotic classes that they may not have the capacity to develop alone to reach the market [6,7]. The international initiatives launched in the last few years are supporting the development of new antibiotics and are revitalizing new preclinical and clinical developments. In contrast, basic research and early discovery efforts still remain poorly addressed by these programs
and the existing research gaps are not enabling the establishment of a sustainable early discovery pipeline for future new compounds from the innovative approaches that have emerged in the field in the last few decades [8,9]. This limited number of clinical development programs is in direct contrast to the dynamic and prolific activity that has developed in the field of microbial natural products during the last decade [10,11]. Research in natural products has evolved to represent one of the most important sources for the discovery of potential new antibiotics, compared to the lack of success of synthetic molecules. The poor suitability of synthetic libraries that were used to select molecules from target-based in vitro screens, or to develop rational drug designs based on ligand binding devoid of suitable physicochemical properties for penetrating bacterial membranes, has been extensively discussed in previous reports [12–15]. Despite the rediscovery of known compounds, which has been presented as one of the major burdens for continued investment in traditional NP discovery programs in the past, the advances in genome sequencing and bioinformatic tools have permitted us to collect a wealth of knowledge on the biosynthesis of natural products, which has revealed the broad untapped biosynthetic diversity of microorganisms, especially actinomycetes [16]. These talented filamentous bacteria have been shown to encode a previously unexpected diversity of biosynthetic gene clusters, opening new opportunities to identify novel classes of compounds that are waiting to be discovered [16–18]. New advances in cultivation techniques from unexploited microbial niches have provided new insights into the diversity of actinobacteria and the possibility to access a broader chemical space of bioactive compounds. 
Comparative genomics, metabolomics and proteomics and the development of new analysis tools are generating a wealth of integrated information, which is fostering the emergence of new strategies that are aimed at obtaining a better understanding of the physiology of actinomycetes and their production of natural products. The new methods developed to mine genomes and predict chemical structures, to activate silent clusters and enable pathway engineering and to directly express metagenome gene clusters in heterologous hosts are providing the necessary tools to set up the basis for a new antibiotic discovery paradigm. Many recent excellent reviews cover each of the individual aspects of these new developments in the field and their impact on the identification of new antibiotics [19–22]. This short review is mostly focused on revisiting the most recent contributions in the field of actinomycetes in terms of their natural products, which have an impact on the discovery of novel scaffolds. These new approaches could bring new opportunities for the production of novel classes of compounds to address the current therapeutic needs in the multi-resistance era.

#### **2. Exploiting Diversity of Cultured Actinomycetes**

The continued interest in exploring the environment to tap novel sources of microbial diversity has resulted in extensive prospective studies on a broad variety of sources, which range from specific terrestrial extreme environments to unique microbial assemblages in plant–host associations and marine ecosystems. The distribution of some microbial species presents biogeographic patterns that are mostly determined by micro-environmental conditions and the most recent efforts have been focused on exploiting these still poorly explored habitats to discover new chemical diversity. The exploration of the extreme conditions of desert and arid habitats, such as the Atacama desert with high salinity and high levels of UV radiation, has permitted the isolation of new species of actinomycetes that are well adapted to surviving under these conditions [23,24]. These actinomycete communities have been shown to produce a wide diversity of novel compounds from different natural product classes, such as the new ansamycins chaxamycins or the β-diketone polyketides asenjonamides A–C, to name only a few of the compounds with antibacterial activities that have been described recently [25,26] (Table 1) (Figure 1). Cave microbiomes are another pristine ecosystem that has frequently been studied in the search for novel producing strains and many reports have highlighted the isolation of novel species of actinomycetes that produce bioactive compounds [27]. The actinobacterial diversity in calcite moonmilk deposits is one of the most recent examples showing the potential of these new isolates to produce compounds with antibacterial activities [28]. Alternatively, the use of diffusion chamber methods to grow previously uncultured soil bacteria has permitted the isolation of new antibiotics, such as lassomycin [29].

The marine environment has traditionally been another source of new actinomycetes [42,43]. The broad diversity of marine ecosystems, ranging from mangroves and shallow waters to deep sea sediments and associated invertebrates, has continued to attract the interest of microbiologists. Their isolation programs have ensured a continued discovery of new strains that produce new compounds or new analogs with biological activity (Table 1 and Figure 1) [30–34,43–48]. Despite this prolific description of marine-derived strains and new antibiotic producers, this environment remains poorly studied in terms of microbial diversity and functional diversity. Marine sediments have been the focus of recent studies, which have shown that actinobacteria are in fact only a minor component of this microbial community. More interestingly, these findings have suggested that the production of many of the secondary metabolites has a deep impact on the microbial community composition [49]. There is no doubt that the parallel advances in the metagenomic assessment of microbial diversity have allowed us to explore the dynamics of the microbial populations of interest in the communities currently being studied. These technologies are opening new avenues to investigate the potential roles of these members and the effects of the antibacterial compounds on the microbiome composition. The results derived from these studies will guide the isolation and selection of the most promising members of these microbial communities and their screening for the production of novel bioactive compounds.



**Figure 1.** New antibiotics and analogs discovered from Actinomycetes.

#### **3. Genomics Driven Discovery**

In parallel, the exponential increase in the number of partial and complete genome sequencing projects on actinomycete species available in the public databases has not only confirmed their broad biosynthetic diversity across the different lineages, but has also enabled intensive genome mining approaches to tap new natural product scaffolds. The identification of new biosynthetic pathways of potentially interesting new molecules has fostered the search for new biosynthetic gene clusters (BGCs) in draft genomes, which has taken advantage of improved bioinformatic tools for cluster identification and gene annotation, such as antiSMASH [50–52]. Furthermore, the increasing number of almost complete genomes has permitted extensive comparative genomic analyses of members of this bacterial group, and the identification of the genomic components and the evolutionary history of many different species [53–55]. These comparative studies are revealing the existence of a core genome within some members of actinomycetes as well as a divergence of BGCs among the different lineages. These results are providing a basis for understanding the functional evolution of species as shown for *Streptomyces* [56,57]. Another relevant aspect of the impact of the increasing amount of BGC sequence information on antibiotic discovery is the possibility of developing specific targeted genome mining searches in genomic libraries based on specific genomic signatures related to the biosynthesis of privileged scaffolds or functionalizations that could drive the discovery of novel compounds and chemical spaces (Table 1 and Figure 1) [35,36]. The integration of genomics with transcriptomics, proteomics and metabolomics is providing unique information for assessing the functional evolution of actinomycete species. 
These data are encouraging the development of new approaches for the expression and engineering of these biosynthetic pathways and the identification of new bioactive compounds from cultured actinomycetes not only from underexplored habitats but also from large microbial collections that are still the untapped treasures of biosynthetic diversity (Figure 2).

**Figure 2.** Omics-driven discovery.

One of the major challenges that remains in this context is the efficient cloning and expression of BGCs that are silent or poorly expressed in their natural host, which involves refactoring by replacing the regulatory elements, followed by detection of the synthesized compounds [37,58–60]. Thus far, many different in vitro DNA assembly or direct capture methods have been described to clone BGCs in heterologous hosts [38,39,61–63]. Many BGCs cannot be detected by the rule-based bioinformatic tools due to the absence of signature genes, but the application of prediction tools based on the frequencies of Pfam domains occurring in BGCs has improved the identification of additional clusters [51]. New genomic bacterial artificial chromosome (BAC) libraries built from large 100-kb fragments of *Streptomyces* spp. genomic DNA are also being used in high throughput functional screening approaches to identify non-predicted BGCs by heterologous expression [64].

#### **4. Eliciting Production from Silent Pathways**

The activation of silent BGCs in well-characterized producing strains by direct gene manipulation is an alternative approach when the strains are amenable to genetic engineering. Many recent successful examples have been reported, which have described the use of the newest genetic engineering tools in modifying metabolic pathways, altering metabolic fluxes by blocking unrelated metabolic pathways, inactivating transcriptional repressors, over-expressing pathway-specific activator genes or even multiplying the BGC copies in the original producing strains to increase titers [63,65–67]. The recent development of optimal ribosomal binding sequences and strong terminators to be applied in the control of the metabolic pathways of actinomycetes has opened new possibilities for gene expression modulation and represents a promising set of tools for metabolic engineering (Table 2) [68].



Despite this success, a large proportion of wild-type and industrial strains remain recalcitrant to manipulation and many silent BGCs cannot be directly activated using genetic tools. Culture-based approaches based on multiple nutritional conditions have been one of the most common methods employed to explore the media components required for the production of new compounds. The concept of using small molecules as elicitors by perturbing biological systems and signaling pathways dates back several decades and has been extensively used to activate silent or poorly expressed pathways with a broad range of small molecules, including sub-inhibitory concentrations of antibiotics [69,70]. The first large scale elicitor screening reported was performed with more than 30,000 compounds on *S. coelicolor* and it identified a small number of compounds that are able to stimulate the production of some of the secondary metabolites several-fold [89]. Many examples of hormesis have been described with sub-inhibitory concentrations of antibiotics and other well-known bioactive natural products, which have elicited a response that is associated with the major activation of secondary metabolism, induction of cryptic gene clusters and production of novel compounds [70,90]. The activation effect cannot be predicted from the antibiotic modes of action and the lack of universal effectors to awaken all silent BGCs has added another level of complexity in the identification of new effector molecules [70,71]. The same effect is pursued by co-cultivation, which is an approach that has been used extensively with many different types of cultivation formats and strain combinations. The approach takes advantage
of the effect that small concentrations of signaling or antibiotic molecules from one of the strains can have on another strain. The difficulty of scaling up this approach as a general method is related to the impossibility of predicting which combinations will result in an effective response, which normally does not account for more than 15–20% of the cases studied [71]. Mycolic acids have been shown to play a role in the physical interaction and the activation of some silent pathways. New antibacterial compounds, such as alchivemycin A and B, arcyriaflavin E or ciromicins, were described after co-culturing different *Rhodococcus* strains with species of *Streptomyces*, *Tsukamurella* and *Nocardiopsis* (Table 1) (Figure 1) [40,41]. In other situations, the induction does not require cell-to-cell interaction and is only mediated by diffusible small effector molecules. One of the most recent examples is the production of the cryptic natural product keyicin from the co-cultivation of the producer *Micromonospora* strain with *Rhodococcus* [91] (Table 2).

The most recent studies on the differential metabolomic analysis of metabolites in response to cultivation conditions have shown that the chemical potential of actinomycetes is far from being fully characterized [92]. From a methodological perspective, modern LC-MS and NMR analytical tools and differential metabolomic analysis have been decisive for detecting the production of novel compounds in complex mixtures and mapping the response to external chemical signals. Comparative metabolomics has been efficiently used to identify the induced expression of secondary metabolites from *S. coelicolor* cryptic genes resulting from the exposure to multiplexed perturbations and to identify the subsets of primary and secondary metabolites that respond similarly across a large variety of stimuli [72]. The major challenge for these methods is related to the identification of novel bioactive molecules within the complexity of the metabolomic profiles. New dereplication and identification approaches are continuously being developed, which are based on the similarity of MS/MS patterns in natural product databases and NMR-based metabolomics [73–75]. Proteomining is another method developed to support this identification. The analysis links natural products to biosynthetic enzymes as it correlates protein expression profiles of biosynthetic enzymes to the metabolome of the producing strains based on the statistical analysis of strains cultured under different conditions [76,77].
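The MS/MS-based dereplication mentioned above rests on comparing a query fragmentation spectrum against library spectra. As a minimal sketch, assuming a simple binned cosine score (one common similarity measure, not necessarily the one used in the cited tools), the comparison could look as follows. All peak lists and compound names are hypothetical and serve only as illustration.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(spec_a, spec_b, bin_width=1.0):
    """Cosine score between two spectra given as lists of (m/z, intensity)."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_library_match(query, library, bin_width=1.0):
    """Return (score, name) of the most similar library entry, or None."""
    if not library:
        return None
    return max(
        (cosine_similarity(query, spec, bin_width), name)
        for name, spec in library.items()
    )


# Hypothetical spectra, for illustration only.
library = {
    "known_compound_X": [(100.1, 10.0), (200.2, 5.0), (300.3, 1.0)],
    "known_compound_Y": [(150.0, 8.0), (250.0, 8.0)],
}
query = [(100.2, 9.0), (200.1, 6.0)]
score, name = best_library_match(query, library)
```

A high score against a library entry suggests a known compound or a close analog, while a low score against every entry flags a candidate for further characterization. Real workflows typically add precursor-mass filtering and peak-matching tolerances, which this sketch omits.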

#### **5. Harnessing Regulation of Primary and Secondary Metabolisms**

The production of secondary metabolites in actinomycetes is tightly regulated and responds to external stimuli from the environment. This regulation is the result of different classes of pathway-specific regulatory elements involving two-component systems, extra-cytoplasmic sigma factors or pathway-specific regulators, such as the most recently described LmbU family [78–80]. Triggering the expression of BGCs frequently involves an additional transcriptional response through the master regulators involved in global regulatory metabolic networks that are not always pathway specific [81]. Understanding the right combinations of regulatory elements and transcription factors that regulate a BGC has been proposed as the "cracking the code" approach to be followed in identifying the key regulatory signals and the eliciting signals needed to set up culture conditions to activate a specific BGC [83]. The regulon predictor program PREDetector was developed to identify signaling cascades deduced from in silico searches of regulatory elements, which revealed that primary metabolism transcription factors were also involved in controlling pathway-specific regulators [84]. One of the best examples is that of the pleiotropic regulators DasR and CebR, which control the uptake of chitin and cellulose and for which responsive elements were identified upstream of many pathway-specific regulators [85–87]. The increasing number of available new genomes is enabling new in silico approaches derived from comparative genomics to detect specific binding motifs of transcription factor orthologues [82]. Regulon prediction has been proposed as a successful strategy to identify the regulatory networks involved in the control of BGC expression and the most promising strains to be explored for the induction of silent pathways. 
This finely tuned regulation of specialized metabolite production also requires the precise supply of chemical precursors from primary metabolism. Recent reports highlight the role of primary metabolism gene expansions in secondary metabolite producing strains with impact on metabolic adaptation and strain
fitness and how they may represent another target for future genetic engineering interventions to improve production and activate silent pathways [88] (Table 2).
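The in silico search for regulatory elements described in this section can be illustrated with a position weight matrix (PWM) scan, a standard way to look for transcription factor binding sites upstream of genes. The sketch below is a generic illustration, not the algorithm of PREDetector; the aligned example sites, the upstream sequence and the threshold are all hypothetical, and a uniform base composition is assumed as background.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Build a log-odds matrix from equally long, aligned binding sites.

    Assumes a uniform background frequency of 0.25 per base.
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold.

    The sequence is expected to contain only upper-case A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


# Hypothetical binding sites and upstream region, for illustration only.
example_sites = ["TGGTCT", "TGGTCT", "TGGACT", "TGGTCA"]
upstream_region = "AACCTGGTCTAAGG"
hits = scan(upstream_region, build_pwm(example_sites), threshold=5.0)
```

Applied to the upstream regions of pathway-specific regulators across a genome, hits of this kind would nominate candidate members of a regulon, which then require experimental confirmation.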

#### **6. Conclusions and Future Prospects**

Microbial prospection studies have continued to reveal that there is a huge and still poorly explored diversity of actinomycetes in the environment that is waiting to be mined for the isolation of new bioactive compounds. In addition, thousands of selected strains that are preserved in microbial collections and distributed across laboratories worldwide should also be revisited as they represent a unique reservoir of silent biosynthetic diversity that traditional approaches did not manage to unlock to its full extent. The continued increase in new genome sequences and the development of new genome-mining and genome-directed engineering tools to trigger the production of new natural products are paralleling the development of integrated analytical tools and open access databases. The application of comparative genomics, metabolomics and proteomics to culture-based studies is providing a wealth of knowledge on the physiology and the regulation systems, opening new avenues to approach the activation of silent or poorly expressed pathways. There is no doubt that all these advances are laying the foundations for a new paradigm in natural product discovery, especially from actinomycetes. A sustained and integrated multidisciplinary research effort should respond to the major challenge of discovering new chemical classes of antibiotics that will be required to replenish the preclinical development pipeline in the near future.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **Acyltransferases as Tools for Polyketide Synthase Engineering**

#### **Ewa Maria Musiol-Kroll \* and Wolfgang Wohlleben**

Interfakultäres Institut für Mikrobiologie und Infektionsmedizin, Eberhard Karls Universität Tübingen, Auf der Morgenstelle 28, 72076 Tübingen, Germany; wolfgang.wohlleben@biotech.uni-tuebingen.de

**\*** Correspondence: ewa.musiol@biotech.uni-tuebingen.de; Tel.: +49-7071-29-78839

Received: 2 June 2018; Accepted: 16 July 2018; Published: 18 July 2018

**Abstract:** Polyketides belong to the most valuable natural products, including diverse bioactive compounds, such as antibiotics, anticancer drugs, antifungal agents, immunosuppressants and others. Their structures are assembled by polyketide synthases (PKSs). Modular PKSs are composed of modules, which involve sets of domains catalysing the stepwise polyketide biosynthesis. The acyltransferase (AT) domains and their "partners", the acyl carrier proteins (ACPs), thereby play an essential role. The AT loads the building blocks onto the "substrate acceptor", the ACP. Thus, the AT dictates which building blocks are incorporated into the polyketide structure. The precursor- and occasionally the ACP-specificity of the ATs differ across the polyketide pathways and therefore, the ATs contribute to the structural diversity within this group of complex natural products. Those features make the AT enzymes one of the most promising tools for manipulation of polyketide assembly lines and generation of new polyketide compounds. However, the AT-based PKS engineering is still not straightforward and thus, rational design of functional PKSs requires detailed understanding of the complex machineries. This review summarizes the attempts of PKS engineering by exploiting the AT attributes for the modification of polyketide structures. The article includes 253 references and covers the most relevant literature published until May 2018.

**Keywords:** natural products; polyketides; polyketide synthases; acyltransferases; engineering; new bioactive compounds

#### **1. Introduction**

Polyketides are a large class of structurally complex compounds with interesting and valuable activities. Those agents are widely used as antibiotics, antifungals, and drugs for other clinical applications [1–3]. Erythromycin [4,5] and mupirocin [6,7] are examples of antimicrobial drugs, rapamycin [8,9] and FK506 [10,11] of immunosuppressants, and epothilone B [12,13] of anticancer drugs (Figure 1). In general, polyketides can be obtained from biological sources (e.g., actinomycetes) or chemically synthesized (semi- or total chemical synthesis). While the isolation of compounds from biological material is rather easy to implement, the chemical synthesis is quite challenging, often limited to certain types of reactions and resulting in low quantities [14]. Therefore, bioactive product discovery methods, which are based on traditional, bioactivity-guided screening and compound isolation, are still relevant. There is even a renewed interest in this approach as a strategy for the identification of new drugs [15–19]. In particular, the developments in molecular biology, synthetic biology, sequencing technology, chemistry, and bioinformatics provide new opportunities for the classic screening programs and compound recovery from the natural environment. For example, the combination of bioactivity-guided screening with sequencing and genome mining approaches supports and accelerates the downstream process of early drug development after a compound has been identified. Furthermore, the recently established methods provide new opportunities, not only for the optimization of the
production of low-yield natural products, but also for the in vivo/in vitro modification of the molecules, leading to advances in structural diversity. The new perspectives are particularly important for diversification of polyketides, as their structures are complex and thus difficult to access by chemical synthesis. It is the polyketide biosynthesis that offers many opportunities for implementation of the innovative technologies and tools to generate novel analogs. One such "hotspot" enabling structural modifications of polyketides is the core assembly line and its components (Figure 2).

**Figure 1.** Structures and bioactivity of clinically relevant polyketides.

**Figure 2.** The erythromycin assembly line (DEBS) [20,21]. The prototypical type I modular PKS machinery is composed of three subunits DEBS 1, DEBS 2 and DEBS 3, which are organized into modules. Each module contains a set of domains catalysing one elongation step. The biosynthesis of the chain begins with the loading of the starter unit (derived from propionyl-coenzyme A (propionyl-CoA)) onto the loading module (load). Subsequently, the polyketide chain is extended by one extender unit (derived from methylmalonyl-CoA) on each of the "downstream" extension modules. In modules which possess reductive domains (optional), the polyketide intermediate is modified. The synthesized chain is released from the last module (off-load) by the thioesterase domain. Finally, the generated erythromycin intermediate (6-deoxyerythronolide B) is further processed in the post-PKS steps (tailoring). PKS domains: AT, acyltransferase; ACP, acyl carrier protein; KS, ketosynthase; DH, dehydratase; KR, ketoreductase; ER, enoylreductase; TE, thioesterase.

The polyketide chain synthesizing enzymes, the polyketide synthases (PKSs), often differ in their composition and organization, and in whether they act in an iterative or non-iterative fashion. Based on those features, the PKSs were classified into "different" groups: modular type I PKSs (including the so-called *cis*-AT (AT-containing) type I PKSs and *trans*-AT (AT-less) type I PKSs), iterative type I PKSs, type II PKSs (iterative), and type III PKSs (iterative (chalcone synthase-like PKSs)) [22]. However, in the past two decades many PKS assembly lines were identified, which do not conform to this classification (non-canonical PKSs) [22–27]. For example, in aureothin or sceliphrolactam biosynthesis, the modular PKSs AurA [28] and SceQ [29], respectively, harbour an iterative module. Because of such discrepancies, the described classification and terminology are currently used with caution, especially for the newly identified, non-canonical PKS pathways.

Modular type I PKSs (e.g., the prototypical 6-deoxyerythronolide B synthase (DEBS) from erythromycin biosynthesis) are multifunctional megaenzymes, equipped with domains catalysing the successive linkage of simple building blocks to a polyketide chain (Figure 2). The PKS domains are organized into units termed modules, of which each harbours a set of domains required for one elongation (and modification) cycle. Thus, a minimal PKS consists of the essential domains: the acyltransferase (AT), the acyl carrier protein (ACP) and the ketosynthase (KS). The AT is either embedded into the PKS (*cis*-AT PKS) [30] or encoded as a separate gene, of which the gene product (discrete AT) complements the AT-less PKS (*trans*-AT PKS) [31,32]. In contrast to PKS-independent acyltransferase enzymes, such as dihydroxyacetonephosphate acyltransferases [33] or long-chain-alcohol O-fatty-acyltransferases (wax synthases) [34,35], which typically transfer acyl groups to a non-acyl carrier protein (non-ACP) acceptor molecule, the ATs of polyketide assembly lines load simple units (building blocks) derived from thioester-activated precursors onto the ACPs of a PKS.

After the ACP-loading step, the KS domain of the PKS catalyses the decarboxylative Claisen-like condensation of the newly loaded unit and the already existing polyketide chain. Additional optional domains, such as the ketoreductase (KR), dehydratase (DH), enoylreductase (ER) and methyltransferase (MT) process the generated β-ketoacyl thioester intermediate. Finally, the polyketide chain is released from the last module of the assembly line, which is usually catalysed by thioesterase (TE). Tailoring enzymes (if present) further modify the intermediate into the final product (Figure 2).
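Because each extender unit loses one carbon as CO2 in the decarboxylative condensation, the carbon count of a polyketide chain follows directly from its building blocks. The short sketch below works this out for the erythromycin precursor, using the starter and extender units named in Figure 2. It assumes the six extension modules of DEBS, which is general background knowledge rather than a figure stated in the text above, and the function and dictionary names are illustrative.

```python
# Carbons in the acyl group of each starter unit.
STARTER_CARBONS = {"acetyl-CoA": 2, "propionyl-CoA": 3}

# Carbons each extender unit contributes to the chain: the acyl group
# carbons minus the one carbon lost as CO2 during condensation.
EXTENDER_CARBONS = {
    "malonyl-CoA": 3 - 1,
    "methylmalonyl-CoA": 4 - 1,
}


def chain_carbons(starter, extenders):
    """Total carbons in the chain built from one starter and a list of extenders."""
    return STARTER_CARBONS[starter] + sum(EXTENDER_CARBONS[e] for e in extenders)


# DEBS: propionyl starter plus one methylmalonyl extender per extension module
# (six extension modules assumed).
debs_chain = chain_carbons("propionyl-CoA", ["methylmalonyl-CoA"] * 6)
print(debs_chain)  # 3 + 6 * 3 = 21
```

The result of 21 carbons matches the molecular formula of 6-deoxyerythronolide B (C21H38O6). Swapping a single methylmalonyl-CoA for malonyl-CoA in the list, as an AT exchange might do, reduces the count by one carbon (the missing methyl branch).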

The structural diversity of polyketides can be attributed to several factors. Those include the length of the polyketide chain, oxidation state and stereochemistry of the β-keto groups, mechanism of termination and chain release, as well as post-PKS tailoring steps [2]. Most of those variations result from the features of the PKS system (type of the PKS, PKS module and domain composition) and modification enzymes. However, the fact that the ATs provide the essential precursors for the polyketide biosynthesis and that they differ in their substrate specificities across the PKS pathways makes the ATs one of the most promising "targets" for engineering of the polyketide assembly lines. Although AT-based engineering was successful in some cases, this strategy is still limited by numerous challenges resulting from the complexity of the PKS systems. Therefore, AT-based engineering is not yet a well-established high-throughput technology. To eliminate or at the very least reduce the engineering bottlenecks and increase the effectiveness of this approach, further knowledge about those complex systems is required. This will contribute to a better understanding of the PKS machineries and enable a more predictable PKS engineering.

The goal of this review is to present an overview of the AT-targeted PKS engineering attempts and the recently gained knowledge, which renews the hope for programmable PKS modification and the production of new polyketide derivatives. The synopsis includes an introduction to the PKS assembly lines and their AT domains, a brief summary of the biosynthesis of polyketide precursors, mechanistic and structural insights into PKSs and finally a more detailed description of the AT-based engineering strategies applied to modular type I PKSs and their future perspectives.

#### **2. Polyketide Synthases and the Essential Acyltransferase Domains**

The PKSs are the core enzymes of the polyketide biosynthetic machineries. In modular PKSs, the PKS domains are combined into modules. Each module contains one set of domains required for one elongation and potential modification cycle (Figure 2).

In this review, we discuss the AT-based PKS engineering efforts applied to modular type I PKSs. The structure of a polyketide scaffold synthesized by a modular PKS assembly line reflects the order of the catalytic domains within the PKS modules, which usually corresponds to the chromosomal order of the underlying genes. This assembly line-like structure of the textbook modular PKSs (*cis*-AT type I PKSs) and the corresponding products was defined as the "collinearity rule" [21,36–38]. It was thereby generally assumed that the sequence encoding the assembly line and the collinear architecture of the modules/domains, including the conserved domain motifs, enable the prediction of the polyketide structure and its stereochemistry [39–41]. However, the previous PKS "module" definition, which was deduced from the organization and function of the domains observed in mammalian fatty acid synthase (KS → AT → DH → ER → KR → ACP → TE; from the KS to the ACP), was reconsidered very recently. In 2017, Zhang et al. reported that processing enzymes co-migrate during assembly line evolution with the KS domain downstream (not upstream) of the ACP [42]. Consequently, the term "module" for modular *cis*-AT PKSs had to be redefined and the polyketide assembly lines updated [43]. According to the newly proposed definition, a PKS module containing all processing domains has the following organization: AT+DH+ER+KR+ACP+KS. Subsequently, 526 ACPs from 33 characterized *trans*-AT PKS assembly lines were analysed using bioinformatic tools [44]. To group the ACPs, "module types" (*a*–*y*) were defined based on the chemistry they perform and the classes of enzymes present in the module. A cladogram of ACPs that belong to defined module types was generated and ACP families, which contain related module types, were specified [44]. The analysis uncovered that ACPs from the same module type generally clade together, reflective of the co-evolution of these domains with their cognate enzymes [44,45]. Moreover, cladograms of KSs upstream and downstream of ACPs revealed that in most of the analysed systems, the KSs downstream of ACPs from the same module type also clade together. However, this was not the case for the KSs upstream of ACPs, which was inconsistent with the traditional definition of a module. Therefore, the authors suggested updating the term "module" also for the *trans*-AT assembly lines [44]. We acknowledge that a new definition of the PKS module was recently introduced. Nonetheless, to avoid any confusion in this review, we rely on the original publications and use the previously reported module nomenclature whenever a PKS assembly line was characterized according to the old module terminology.
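The difference between the two module definitions can be illustrated by splitting one and the same linear domain order in both ways. The following Python sketch is only an illustration: the function `split_modules` and the example domain order are hypothetical and do not represent a specific assembly line. It assumes that a traditional module begins with a KS, whereas an updated module ends with the KS located downstream of the ACP.

```python
# Illustrative sketch: group a linear domain order into modules using
# either the traditional (KS ... ACP) or the updated (AT ... ACP + KS)
# module boundaries. The example domain order is hypothetical.

def split_modules(domains, definition="traditional"):
    """Split a list of domain abbreviations into modules."""
    if definition not in ("traditional", "updated"):
        raise ValueError("definition must be 'traditional' or 'updated'")
    modules, current = [], []
    for domain in domains:
        # Traditional view: every KS opens a new module.
        if definition == "traditional" and domain == "KS" and current:
            modules.append(current)
            current = []
        current.append(domain)
        # Updated view: every KS closes the current module.
        if definition == "updated" and domain == "KS":
            modules.append(current)
            current = []
    if current:
        modules.append(current)
    return modules


if __name__ == "__main__":
    line = ["AT", "ACP", "KS", "AT", "KR", "ACP", "KS", "AT", "ACP", "TE"]
    for definition in ("traditional", "updated"):
        print(definition, split_modules(line, definition))
```

In the traditional grouping the KR is assigned to the module that starts with the upstream KS, whereas in the updated grouping it travels with the KS downstream of its ACP, which is the co-migration pattern described above.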

The modularity, together with the structural and functional PKS characteristics, further strengthened the idea of exploiting the megaenzymes and their domains for the diversification of polyketide molecules. The ATs are particularly suitable for these purposes, as these domains are responsible for the selection of the precursors and the provision of the building blocks (units) to the PKS. More specifically, the precursor specificity and occasionally the ACP specificity of the ATs are features that essentially contribute to the structural diversity of polyketides and encourage AT-based PKS engineering. Although there are still gaps in the understanding of the complex polyketide assembly lines [46,47], various approaches of AT engineering, including AT domain swapping, AT site-directed mutagenesis, and AT knockout and complementation by an AT domain from other modules, were reported [48,49] (Table 1 and Figure 3). However, this does not necessarily mean that the potential for engineering PKSs by manipulating the AT features is exhausted. Recent findings such as updates on PKS module boundaries might boost the implementation of these strategies and lead to a new breakthrough in this field. Therefore, it seems to be the right time to summarize the available information on the ATs of PKSs, the AT-based engineering of polyketides and the potential new prospects.


**Table 1.** Examples of acyltransferase-based polyketide synthase engineering. Abbreviations: acyltransferase (AT); methylmalonyl-coenzyme A (MM-CoA); malonyl-CoA (M-CoA); methoxymalonyl-ACP (MeO-ACP).

**Figure 3.** General strategies for acyltransferase-based engineering of polyketide synthases. The acyltransferase (AT)-based engineering of polyketide synthases (PKSs) includes AT domain swapping, AT site-directed mutagenesis and AT cross-complementation. In the AT domain swapping approach, the AT (red-purple) of a "target" assembly line is replaced by an AT from another module of the native assembly line or an AT of a foreign pathway (heterologous ATs) (orange). AT site-directed mutagenesis usually involves the identification of conserved motifs, which contribute to the substrate specificity of the AT, and the mutation of those sites of the AT (red-purple AT with orange circles/black outline). This approach enables the alteration of the AT substrate specificity. In AT cross-complementation experiments, the AT (red-purple) of a "target" assembly line is inactivated and the PKS module is complemented by a foreign AT (orange or blue). The AT for complementation of the AT-inactivated PKS is either incorporated in place of the native AT (*cis*- to *cis*-AT complementation (orange)) or encoded as a separate gene, the product of which (discrete AT (blue)) complements the assembly line *in trans* (*cis*- to *trans*-AT complementation). Because the outcome of the *cis*- to *cis*-AT complementation is similar to AT-domain substitution or swapping, this strategy is often regarded as such.

#### *2.1. Acyltransferase Substrates and Loading of the Acyl Carrier Protein*

ATs are indispensable for polyketide assembly as they select the precursors and load the building blocks (units) for the polyketide biosynthesis onto the ACPs of PKSs. However, the ACPs have to be "prepared" for accepting the unit prior to the loading reaction. This is accomplished in a post-translational modification by the 4′-phosphopantetheinyl (Ppant) transferase (PPTase) [68–70]. The enzyme transfers the Ppant arm of coenzyme A (CoA) to the serine residue of the ACP (*apo*-ACP form), which results in the activated ACP (*holo*-ACP form). The ATs of PKSs load the building block onto the Ppant group of the *holo*-ACP. The units for loading of the *holo*-ACP originate from thioester-activated precursors (e.g., CoA- or ACP-linked units such as acyl starter units, malonyl units or their 2-substituted derivatives). The loading of the ACP in PKSs is accomplished in a two-step reaction [71]. In the first step, the AT recognizes the precursor and binds it to its active-site serine residue, which results in the acyl-*O*-AT intermediate (self-acylation reaction; step 1). Subsequently, the acyl-*O*-AT interacts with the activated ACP (*holo*-ACP) and transfers the unit onto the protein, which leads to the formation of an acyl-*S*-ACP (transacylation reaction; step 2).
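The two-step loading reaction described above can be written as a simple scheme. The notation is chosen here for illustration and assumes a CoA-linked precursor; AT–OH denotes the active-site serine hydroxyl of the AT and *holo*-ACP–SH the terminal thiol of the Ppant arm.

```latex
\begin{align*}
\text{Step 1 (self-acylation):}\quad
  & \text{acyl-}S\text{-CoA} + \text{AT--OH}
    \;\longrightarrow\; \text{acyl-}O\text{-AT} + \text{CoASH} \\
\text{Step 2 (transacylation):}\quad
  & \text{acyl-}O\text{-AT} + \textit{holo}\text{-ACP--SH}
    \;\longrightarrow\; \text{acyl-}S\text{-ACP} + \text{AT--OH}
\end{align*}
```

The AT is regenerated in step 2, so that the net reaction is the transfer of the acyl unit from CoA to the *holo*-ACP.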

The ATs of *cis*-AT PKSs usually load only one type of extender unit (e.g., malonate derived from malonyl-CoA or methylmalonate derived from methylmalonyl-CoA) onto their ACPs, which strongly indicates that the ATs are substrate-specific under the native polyketide production conditions. However, there are examples of assembly lines where an AT domain transfers different substrates to the same ACP, resulting in the production of polyketide mixtures. This is the case, for example, for the AT domain in module five of the monensin PKS (AT5mon). AT5mon accepts both ethylmalonyl- and methylmalonyl-CoA as substrates, leading to the production of monensins A and B [72,73] under standard fermentation conditions. The inherent promiscuity of AT5mon was exploited for loading synthetic non-natural extender units (allylmalonyl-*N*-acetylcysteamine (allylmalonyl-SNAC), propargylmalonyl-SNAC, propylmalonyl-SNAC, butylmalonyl-SNAC, and hexylmalonyl-SNAC) onto the cognate ACP in in vivo feeding experiments [74]. Except for 2-hexylmalonyl-SNAC, which was toxic to the producer strain, the synthetic building blocks were incorporated in significant amounts and new premonensin derivatives, including propargyl-premonensin, were produced. This suggests that some ATs might accept a wider range of substrates than originally assumed and that the limited availability of non-native precursors for the production of polyketide derivatives can be overcome by feeding synthetic building blocks. Moreover, it makes the idea of heterologous expression of genes or pathways that directly deliver non-native precursors for the production of polyketide derivatives very promising (examples of this strain engineering approach are also provided in Sections 3.1.1 and 3.1.2).

In the following, we briefly summarize the biosynthetic routes for those precursors that are frequently used by the ATs for the assembly of polyketide structures.

#### 2.1.1. Provision of Malonyl-CoA and Methylmalonyl-CoA Precursors

Malonyl-CoA and methylmalonyl-CoA are the most commonly used and metabolically available precursors for the biosynthesis of polyketides. In general, there are two pathways for the biosynthesis of malonyl- or methylmalonyl-CoA. One route is the carboxylation of acetyl- and propionyl-CoA, respectively [75,76]. The acetyl-CoA carboxylases (ACCs) carboxylate acetyl-CoA to malonyl-CoA and propionyl-CoA carboxylases (PCCs) convert propionyl-CoA to methylmalonyl-CoA [77,78]. The other route involves the direct conversion of malonate or methylmalonate to the CoA-activated forms by ATP-dependent malonyl-CoA synthetase and homologous enzymes. Malonyl-CoA synthetases, such as MatB, have been described for *Streptomyces coelicolor* [79] and *Rhizobium* [80–82]. Hughes and Keatinge-Clay studied the substrate flexibility of the *S. coelicolor* MatB and have shown that this adenylate-forming enzyme is capable of producing most CoA-linked polyketide extender units, as well as pantetheine- and *N*-acetylcysteamine-linked analogs, useful for in vitro PKS studies [79]. It was demonstrated that the methylmalonyl groups ligated by MatB to CoA, pantetheine or *N*-acetylcysteamine were utilized and incorporated into a triketide pyrone by the terminal module of the 6-deoxyerythronolide B synthase (Mod6TE) in vitro [79].

The MatB enzyme of *Rhizobium trifolii* has shown a tolerance for a variety of C2-substituted malonic acids [80]; however, in most cases the activity towards the non-native substrates was low. This encouraged the engineering of the malonyl-CoA synthetase and the generation of enzyme mutants, which deliver exotic precursors [65,83–85] for polyketide biosynthesis with improved efficiency.

The incorporation of some of the generated unnatural building blocks into polyketide scaffolds was confirmed in vivo [86–90], which makes the use of the engineered MatB variants attractive for non-chemical polyketide derivatization ("bio-derivatization").

#### 2.1.2. Provision of Ethylmalonyl-CoA and Exotic Alkylmalonyl-CoA Precursors

In addition to the typically used substrates malonyl-CoA and methylmalonyl-CoA, some ATs of PKSs utilize unusual alkylmalonyl-CoA precursors for polyketide assembly. Such exotic precursors are generated by crotonyl-CoA carboxylase/reductase (CCR) and homologues (CCRs were broadly defined as enoyl-CoA carboxylase/reductases (ECRs)) [91–94]. CCRs/ECRs are members of the medium-chain dehydrogenase/reductase (MDR) protein superfamily, which can be found in all three domains of life [95]. The MDR superfamily includes enzymes that reduce either C=O or C=C bonds in α,β-unsaturated carbonyl compounds.

Recent structure-based engineering of the active-site binding pocket of CCRs enabled significant alteration of their catalytic activity towards non-native substrates [96,97]. For example, the V350G substitution, introduced by site-directed mutagenesis into the CCR enzyme AntE, expanded its substrate scope to afford indolylmethylmalonyl-CoA [96].

An alternative strategy to the construction of modified biocatalysts for the provision of unusual polyketide precursors is the screening and identification of independent pathways or novel enzymes, directly delivering the exotic substrates. In 2016 Ray et al. described a CCRC (crotonyl-CoA reductase/carboxylase)-independent mechanism for assembly of unusual PKS precursors, where an acyl-CoA carboxylase (YCC, biotin-dependent enzyme) directly carboxylates medium chain acyl-CoA thioesters to alkylmalonyl-CoA [98].

The provision of diverse precursors is the prerequisite for the exploitation of the AT promiscuity and the production of polyketide derivatives. Malonyl-CoA synthetases (e.g., MatB-like synthetases) and CCR enzymes enable the generation of a variety of substrates, which can be utilized by the ATs. However, for their use in the directed production of AT substrates, the enantiomeric selectivity of the ATs needs to be considered [99]. For example, the AT domains of the model erythromycin assembly line (Figure 2) utilize solely the (2*S*)-isomer of methylmalonyl-CoA for each of the extension steps [100]. In such cases the application of CCRCs or biotin-dependent carboxylase enzymes, which generate (*S*)-enantiomers, might be the better route to obtain 2-substituted malonyl-CoAs, compared to malonyl-CoA synthetases providing (*R*)-enantiomers. Nevertheless, both enantiomeric forms are useful for testing and characterizing the stereoselectivity of the ATs, which is an important determinant for the efficient incorporation of unnatural moieties into polyketide structures.

#### *2.2. Mechanistic and Structural Insights into Acyltransferases and Polyketide Synthase Modules*

The challenges of polyketide engineering, resulting from the complex modularity and functionality of PKSs, became more obvious after some of the initial PKS engineering experiments, including skipping and swapping of domains/modules, failed [101–103]. Disruption of the modular architecture of the PKS, loss of the PKS polypeptide integrity and/or prevention of protein-protein interface interactions were common reasons for the poor performance of the modified PKSs. This demonstrated that not all paradigms derived from investigations of model assembly lines can be arbitrarily transferred to another PKS biosynthetic pathway for its engineering. Accordingly, a detailed understanding of these systems is indispensable for rational PKS modification. This encouraged more comprehensive mechanistic and structural analyses of the PKS assembly lines. Here, we focus on AT-substrate and AT-ACP-KS interactions, and present examples underlining the significance of the interdomain communication for polyketide biosynthesis.

#### 2.2.1. Substrate Recognition and Acyltransferase-Acyl Carrier Protein Interactions

Typically, the ATs accomplish the loading of the ACP by binding of the precursor and the release of the CoASH group prior to the interaction with the respective ACP domain [86,104]. During the loading process, the acyl-AT intermediate is exposed to a nucleophilic attack of the thiol residue of the ACP's phosphopantetheine moiety. This reaction scenario was termed the "ping-pong-bi-bi" mechanism [105]. However, exceptions to this mechanism were observed. For example, the *trans*-AT Dis/Dsz AT from the disorazole pathway exhibited different kinetics, compared to the classical "ping-pong" mechanism. Wong et al. demonstrated that the transacylation in the case of the *trans*-AT Dis/Dsz AT depends on the interaction with its ACP before CoASH is released from the building block [106].
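For orientation, the textbook steady-state rate law of a ping-pong bi-bi mechanism is given below. It is the general form from enzyme kinetics and is not taken from the cited studies; the assignment of A to the acyl-CoA precursor and B to the *holo*-ACP is an assumption made here for illustration.

```latex
v \;=\; \frac{V_{\max}\,[\mathrm{A}]\,[\mathrm{B}]}
             {K_{\mathrm{m}}^{\mathrm{B}}\,[\mathrm{A}]
              + K_{\mathrm{m}}^{\mathrm{A}}\,[\mathrm{B}]
              + [\mathrm{A}]\,[\mathrm{B}]}
```

Because the denominator contains no constant term, double-reciprocal plots of 1/v against 1/[A] at different fixed [B] yield parallel lines. A mechanism in which the ACP has to bind before CoASH is released involves a ternary complex, for which intersecting lines are expected instead.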

The significance of the AT-ACP interactions for the transacylation was also confirmed for other assembly lines (e.g., erythromycin, zwittermicin A or vicenistatin).

In the erythromycin PKS DEBS, protein-protein interactions between AT domains, ACP domains and the linkers that flank AT domains were systematically probed [48,107,108]. The ATs and their cognate ACPs exhibited at least 10-fold greater specificity for each other than for heterologous proteins [107]. Moreover, the flanking (N- and C-terminal) linkers of an AT domain contributed to the efficiency and specificity of transacylation, underlining the importance of the linker regions for proper protein-protein interaction [107].

The examples of AT-substrate interactions described above involve ATs which select free precursors and load the building blocks onto ACPs being integrated into the PKS. In addition, AT-substrate recognition and protein-protein "communication" were investigated for the less abundant ACP-linked units (standalone ACPs to which the building block is attached, such as hydroxymalonyl-ACP). In zwittermicin A biosynthesis, the acyltransferase domain of the PKS ZmaA (ZmaA-AT) recognizes the hydroxymalonyl-ACP as substrate and transfers the hydroxymalonyl unit to a downstream ACP via a transacylated AT domain intermediate [109]. The X-ray crystal structure of ZmaA-AT revealed that the ACP itself biases extender unit selection [110]. Furthermore, it indicated that the ACP interaction with the hydrophobic motif promotes secondary structure formation at the binding site and leads to opening of the adjacent substrate pocket lid to allow the binding of the substrate in the active site of the AT [110].

In the biosynthesis of vicenistatin, the AT VinK transfers a very early vicenistatin intermediate between two ACPs, the standalone ACP VinL and VinP1LdACP (loading module) [111,112]. Thus, it was strongly suggested that the AT VinK needs to distinguish between the ACPs VinL and VinP1LdACP. In a later study, a crystal structure of the AT–ACP (VinK-VinL) complex was obtained, which revealed that Arg-153, Met-206, and Arg-299 of VinK interact with the negatively charged helix II region of VinL [113]. The structure of the VinK-VinL complex visualized the interaction between an AT and ACP and provided the first detailed mechanistic insights into ACP recognition by an AT enzyme.

#### 2.2.2. Interactions between the Acyltransferase, Ketosynthase and the Acyl Carrier Protein

The condensation of ACP-bound extender units with the growing polyketide chain is essential for completing the biosynthesis and for the formation of the final product. Those steps (translocation and elongation) are dependent on protein-protein recognition between the AT, KS and the ACP domains [47,114–118]. Regions in the KS domain, KS to AT linkers, and the AT domains of DEBS were described as docking sites for the ACP during chain transfer and elongation [118,119]. The protein-substrate and protein-protein interactions cause conformational changes within PKS modules during the catalytic cycle of polyketide biosynthesis. Based on the cryo-EM data obtained for the full-length PikAIII module of pikromycin, Dutta et al. and Whicher et al. described the dynamics of PKS modules as concerted actions to mediate appropriate substrate processing [120,121]. Specifically, the ACP domain is differently positioned after polyketide chain substrate loading onto the active site of the ketosynthase, after extension to the β-keto intermediate, and after β-hydroxy product generation. The conformational rearrangements enable optimal positioning for reductive processing of the polyketide chain elongation intermediate, bound to the ACP [121].

More recently, the synchronized processing of intermediates on a PKS assembly line was reported as a "turnstile mechanism" (chain elongation induces module closure, while chain translocation to the next module reopens the turnstile) [122]. In this study, it was shown that modules in which the ACP is occupied by an acyl substrate are not able to load an ACP-borne diketide intermediate onto their KS. The KS accepted an acyl chain only after the substrate was released from the ACP. However, the exact mechanisms that "communicate" the ACP-loading status to the KS are still unknown. The presumption that the acylation of the KS itself initiates the transfer of the information about the ACP conformation and ACP-loading state to the KS remains speculative.

#### 2.2.3. Docking Domains: Intersubunit Communication in Polyketide Assembly Lines

The recent mechanistic and structural data on the PKS assembly lines and their modules provided additional details about these complex systems and uncovered some of the problems that led to the failure of the initial PKS engineering efforts. From the obtained data, it was possible to generate a more nuanced map of modular *cis*-AT PKS structure and function [49,123], including the determination of so-called "docking domains" (DDs) at the N and C termini of the PKSs [124]. The N- and C-terminal DDs ensure the correct PKS assembly into a functional multiprotein complex. For example, the polyketide core of erythromycin A is synthesized by three multienzyme subunits DEBS 1, DEBS 2, and DEBS 3, each harbouring two extension modules (Figure 2). This requires a proper inter-modular transfer between ACP and KS domains as well as inter-protein transfer between the *cis*-AT PKSs. A typical "docking domain", which enables the intersubunit communication, was identified in the erythromycin assembly line [124] (e.g., between the C terminus of DEBS 2 and N terminus of DEBS 3). The structure of the DD contains two separate four *α*-helix bundles, which together mediate not only specific docking interactions, but also promote dimerization of each homodimer [124].

Docking domains were also postulated and experimentally confirmed for the less well understood *trans*-AT PKSs [125–127]. In this section, we describe the DDs involved in intersubunit communication of the PKS systems, which must not be confused with the regions C-terminal to the KS that were originally proposed as AT-docking domains for the discrete AT in leinamycin biosynthesis (*trans*-AT PKS pathway) [128,129].

Unlike the *cis*-AT PKSs, in which the DDs for intersubunit communication occur between the PKS modules (the docking is C-terminal to the ACP of an upstream polypeptide and N-terminal to the KS of a downstream polypeptide), *trans*-AT PKSs are disconnected within and between modules. Specifically, the junctions between subunits in *trans*-AT PKSs occur within and between PKS modules [125,126]. Such C- and N-terminal DDs were observed for VirA and VirFG in the *trans*-AT PKS pathway of virginiamycin [125]. In this study, it was shown that the deletion of the C-terminal partner (VirA CDD) or the downstream catalytic domain significantly affects the N-terminal DD (VirFG NDD). The N-terminal DD (VirFG NDD) exhibited multiple characteristics of an intrinsically disordered protein [125]. The stability of the protein (VirFG NDD) was recovered after fusion of the docking domains. Similar results were provided for the *trans*-AT PKS system of macrolactin [125]. Two-helix, pseudosymmetric motifs of about 25 residues in length, which non-covalently connect domains between and within the PKS modules, were identified. The docking domains and their cognate domains were heterologously expressed and purified as stable complexes from *Escherichia coli* [126], indicating their importance, not only for the correct assembly of the polypeptides, but also for the overall protein stability of the PKS complex. Furthermore, the structure and portability of the four-helix bundle docking domains were demonstrated in deletion and swapping experiments [126].

Very recently, a fundamentally different mechanism of intersubunit communication at the KS/DH interface was reported for the *trans*-AT PKS gladiolin (gbn) [130]. In contrast to the virginiamycin and macrolactin PKSs, which use mutually compatible docking domains at both the N- and C-termini of the interacting subunits [125,126], the GbnD4 KS domain utilizes a single, largely unstructured docking domain at the C terminus for a direct interaction with the GbnD5 DH domain [130]. The data confirmed that the docking domain is required for the communication of the KS with the ACP, appended to the DH.

Successful engineering of PKS pathways, especially in case of non-canonical assembly lines, requires the identification of crucial interaction regions for the PKS subunits and the elucidation of the mechanism involved in the communication between their domains. The new insights are valuable for the maintenance of proper protein-protein interactions, dynamics, and finally, for the chemical outcome of the modified PKS assembly lines.

#### **3. Strategies of Acyltransferase-Based Polyketide Engineering**

The most significant and commonly applied strategies of AT-based engineering of PKS pathways are: the AT domain-swapping, site-specific mutagenesis of the ATs and the cross-complementation of an AT-inactivated PKS by another AT, either from the cognate assembly line or using heterologous ATs (Figure 3 and Table 1).

#### *3.1. Domain-Swapping*

In general, there are two strategies to achieve AT domain swapping in a PKS system. The AT domain can be exchanged directly for another AT or replaced indirectly by swapping an entire module. For both approaches, endogenous ATs/modules from the native assembly line or ATs/modules of a foreign pathway (heterologous parts) can be utilized. In cases where the original AT substrate specificity was altered and functional engineered PKSs were generated, new polyketide derivatives are produced (Figure 4).

#### 3.1.1. Acyltransferase Domain Substitution and the Provision of the Required Non-Native Precursors

One of the most prominent AT-based engineering approaches for PKSs and polyketide structures is the substitution of the AT domain with an AT of a different substrate specificity. The best-known example of an assembly line to which this strategy was applied is erythromycin. In the following, examples of AT swapping in the erythromycin PKS (DEBS) and in a modified variant of the DEBS (DEBS1-TE), as well as cases of AT substitution combined with the optimization of precursor supply, are presented. Moreover, the replacement of an AT in a non-erythromycin PKS is described.

The erythromycin PKS DEBS was successfully engineered by replacing the methylmalonyl-CoA specific ATs from modules 1 and 2. For example, malonyl-CoA specific ATs from *Streptomyces hygroscopicus* ATCC 29253 (Hyg-AT2 from module 2 of a type I PKS and RAPS-AT14 from module 14 of the rapamycin PKS) and from *Streptomyces venezuelae* ATCC 15439 (Ven-AT from a PKS-like pathway) were inserted into the DEBS of the wild type strain *Saccharopolyspora erythraea* ER720 [50,131]. The mutants of *Sacch. erythraea* ER720, harbouring the engineered DEBS1, produced novel, bioactive erythromycins (12-desmethyl-12-deoxyerythromycin A, 10-desmethylerythromycin A and 10-desmethyl-12-deoxyerythromycin A) [50].

The engineering of the DEBS-PKS was also successful in heterologous hosts, such as *S. coelicolor* CH999 [132] and *Streptomyces lividans* K4–114 (in both strains the actinorhodin gene cluster was deleted) [133,134], expressing the entire DEBS-PKS. Liu et al. have shown that swapping of the DEBS-AT from module 6 in the *S. coelicolor* CH999 strain, with the AT originating from module 6 of a rapamycin PKS, leads to the production of the antibacterial erythromycin analog 2-nor-6-deoxyerythronolide B [54]. McDaniel et al. used the two heterologous hosts, *S. coelicolor* CH999 and *S. lividans* K4–114 and applied AT-substitution, combined AT and KR swapping, and other engineering strategies to generate a library of >50 macrolides, including examples of analogs with one, two, and three altered carbon centers of the polyketide products [135].

To minimize the DEBS and achieve a premature release of the polyketide chain from the erythromycin PKS, the assembly line's TE domain was relocated to the C-terminus of various ACPs (e.g., fusion of the TE to the C-terminus of the DEBS1 resulted in DEBS1-TE) [136,137]. The minimized PKS DEBS1-TE, which consisted of the first two modules of the erythromycin PKS complex and the TE domain, immediately became a model system, not only for studying the mechanistic issues in polyketide biosynthesis [138,139], but also for the engineering of the DEBS-PKS (e.g., AT-domain substitution). Among other experiments, the DEBS1-TE was used to replace the entire AT domain from module 1 by a heterologous AT derived from module 2 of the rapamycin PKS [51]. Two new lactones (2-methyl-3,5-dihydroxy-*n*-hexanoic acid δ-lactone and 2-methyl-3,5-dihydroxy-*n*-heptanoic acid δ-lactone) were produced by *S. coelicolor* containing the modified DEBS1-TE [51]. Although functional engineered PKSs were obtained, the yields of the products generated by these chimeras often greatly depended on the position of the polyketide chain at which the new building block was incorporated and frequently resulted in significantly lower production of the compounds [50,51]. In contrast, Lau et al. generated a variant of the bimodular DEBS1-TE in which the AT2 domain was replaced with the malonyl-CoA-specific AT2 domain of the rapamycin PKS; this variant produced 10 mg/L of the expected 2-desmethyl triketide lactone [52]. The productivity of the recombinant *S. coelicolor* strain, containing the engineered DEBS1+TE, was thus 50% of the production level of the parent triketide lactone ((2*R*,3*S*,4*S*,5*R*)-2,4-dimethyl-3,5-dihydroxy-*n*-heptanoic acid δ-lactone), 20 mg/L [52].

**Figure 4.** Examples of AT swapping [50]. (**a**) Part of the erythromycin assembly line illustrating DEBS 1 and the final structure of erythromycin A, produced by the wild type strain *Saccharopolyspora erythraea* ER720. (**b**) Mutant in which the AT domain of module 1 was replaced by a malonyl-CoA specific AT (loading malonate onto the ACP). The mutant strain produced the derivative 12-desmethyl-12-deoxyerythromycin A. (**c**) Mutant in which the AT domain of module 2 was replaced by a malonyl-CoA specific AT (loading malonate onto the ACP). The mutant produced the derivatives 10-desmethyl-12-deoxyerythromycin A and 10-desmethylerythromycin A. Abbreviations: AT, acyltransferase; ACP, acyl carrier protein; KS, ketosynthase; KR, ketoreductase; mmAT, AT loading methylmalonate; mAT, AT loading malonate.
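The relationship between the DEBS module in which an AT is exchanged and the position of the altered side chain can be captured in a short Python sketch. It is an illustration inferred from the examples given in this section (module 1 affects C-12, module 2 affects C-10, module 4 affects C-6 and module 6 affects C-2); the function names and the side-chain table are hypothetical. The sketch predicts only the position and type of the side chain and no further changes such as a missing hydroxylation.

```python
# Illustrative sketch: which macrolactone carbon of the erythromycin
# scaffold carries the side chain introduced by a given DEBS extension
# module. The rule 14 - 2 * module is inferred from the examples in the
# text and is an assumption for modules not discussed there.

SIDE_CHAIN = {
    "malonyl": "H (desmethyl)",
    "methylmalonyl": "methyl (native)",
    "ethylmalonyl": "ethyl",
    "methoxymalonyl": "methoxy",
}


def affected_carbon(module):
    """Return the ring carbon whose side chain is set by the module's AT."""
    if module not in range(1, 7):
        raise ValueError("DEBS has six extension modules (1-6)")
    return 14 - 2 * module


def describe_swap(module, extender):
    """Describe the expected side chain after an AT swap in one module."""
    return f"C-{affected_carbon(module)}: {SIDE_CHAIN[extender]}"


if __name__ == "__main__":
    print(describe_swap(1, "malonyl"))       # Figure 4b
    print(describe_swap(2, "malonyl"))       # Figure 4c
    print(describe_swap(4, "ethylmalonyl"))  # niddamycin AT5 example
```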

Further engineering attempts of the DEBS1-TE fusion include the replacement of module 2 of DEBS1-TE with the module 12 from rapamycin or module 3 from the erythromycin assembly line [53,60]. In both cases, engineered PKSs were obtained and triketide lactones, in which for example, acetate extender units were incorporated instead of propionate, were identified in the recombinant *Sacch. erythraea* JC2 strain [140] (*Sacch. erythraea* JC2 is a genome reduced derivative of *Sacch. erythraea* NRRL2338 lacking almost all erythromycin PKS genes). In another study the strain *Sacch. erythraea* JC2 was exploited for the replacement of the AT1 (methylmalonyl-CoA specific) in the model PKS DEBS1-TE with the epothilone EpoAT2 (malonyl-CoA specific in epothilone biosynthesis) and EpoAT3 (utilizes malonyl- and methylmalonyl-CoA at native epothilone production conditions) [55]. Functional PKSs, producing the expected triketide lactone compounds, were generated. However, lower yields of total products were detected when compared to DEBS1-TE (2% and 11.5% respectively) [55].

In some cases, replacing the AT with an AT that incorporates a different extender unit is not sufficient, because the substrate is not available in the producer; external supplementation with the precursor or further genetic modification of the strain is then required to overcome this limitation. Such an observation was made for a substitution experiment in which the AT domain of module 4 in erythromycin (eryAT4/methylmalonate-specific) was exchanged with the AT domain from module 5 (nidAT5/ethylmalonate-specific) of the niddamycin PKS [56]. The strain containing the modified PKS produced erythromycin A, but did not synthesize the ethyl-substituted derivative. After supplementation of the culture with ethylmalonate, moderate production of 6-desmethyl-6-ethylerythromycin A (6-ethylErA) was detected in addition to erythromycin A. To eliminate the limitation and increase the intracellular ethylmalonyl-CoA concentration, the crotonyl-CoA reductase (see also Section 2.1.2) from *Streptomyces collinus* [141] was introduced into *Sacch. erythraea* EAT4, yielding *Sacch. erythraea* EAT4-crr. The *crr*-expressing strain synthesized the derivative 6-ethylErA [56].

Another example of a combined AT-substitution- and precursor supply optimization approach was provided by Kato and co-authors [57]. In their study, the previously designed *S. lividans* K4–114, harbouring a plasmid with genes encoding the erythromycin DEBS was modified: namely, the AT6 domain was replaced by the presumably hydroxymalonate-specifying *fkbA*-AT8 domain [54,58]. This DEBS construction was utilized for a substitution with an AT providing an exotic substrate (methoxymalonate) to the ascomycin assembly line [57]. The *fkbA*-AT8 (originally AT6 in DEBS) of the modified DEBS in *S. lividans* K4–114 was exchanged by the heterologous AT8 domain (loading methoxymalonate onto the PKS) from ascomycin (FK520, synthesized by *S. hygroscopicus*). In addition, a subcluster of five genes (*asm13-17*) from the ansamitocin biosynthetic gene cluster of *Actinosynnema pretiosum* was coexpressed in the modified *S. lividans* K4–114 strain to provide the methoxymalonyl extender unit for the engineered erythromycin PKS. Two novel analogs of erythronolide, 2-desmethyl-2-methoxy-DEB and 2-desmethyl-6-DEB, were produced by this strain [57].

Besides the AT of the erythromycin PKS, other assembly lines were modified by AT-swapping to produce polyketide derivatives. The AT domains in the synthesizing PKS-complex of geldanamycin, a potential anticancer drug, were substituted in six AT swaps (in modules 1, 2, 3, 4, 5, and 7) of the seven GdmPKS modules [59]. The AT-swapping using the RapAT2 domain and/or the RapAT14 domain, both from the rapamycin PKS, resulted in functional PKSs in four (modules 1, 4, 5, and 7) of the six modules. The geldanamycin analogs: 2-desmethyl, 6-desmethoxy, 8-desmethyl, and 14-desmethyl derivatives, including one analog with a four-fold enhanced affinity for its target (chaperone Hsp90, essential for growth of cancer cells), were produced in *S. hygroscopicus* [59].

3.1.2. Examples of Acyltransferase Domain Substitution by Exchanging the Entire Module and the Supply of Non-Native Precursors

The AT substitution resulting from the exchange of entire modules of a PKS assembly line is an indirect strategy and often a "side-effect" of an engineering experiment that primarily aims at altering a module and not necessarily at AT-swapping. Attempts at module swapping were recently discussed elsewhere [138,142–147]. Therefore, we provide only a few of the most relevant or recent examples of this engineering strategy.

The exchange of a whole module was successful for loading modules as well as for a chain extension module of a PKS [148–150]. For example, the wide-specificity loading module of the avermectin PKS was introduced in place of the first module of DEBS and the resulting hybrid PKS gene was expressed in *Sacch. erythraea* [149]. Novel erythromycins derived from endogenous branched-chain acid starter units were observed, confirming the functionality of the engineered PKS and the flexibility of the assembly line, which accepted alternative moieties and extended the polyketide chain from them [149].

Replacement of an entire loading module by the loading module of the avermectin PKS from *Streptomyces avermitilis* or the erythromycin PKS of *Sacch. erythraea* was also successful for the spinosyn PKS in *Saccharopolyspora spinosa* BIOT-1066 [151]. The resulting derivatives of *Sacch. spinosa* BIOT-1066 expressing the hybrid PKS pathways produced the anticipated spinosyn analogs [151]. In a previous study, the inherent promiscuity of the loading module of the avermectin PKS was exploited for the generation of a compound library in *S. avermitilis* mutants blocked in the biosynthesis of native starter units [152]. Structures including the most effective antiparasitic avermectin derivatives (e.g., doramectin) [153] were isolated. More recently, the avermectin biosynthetic pathway was engineered to provide alternative doramectin producers [154]. In this study, the loading module of the avermectin PKS in a *S. avermitilis* strain was replaced with the unique cyclohexanecarboxylic acid (CHC) loading module from phoslactomycin [154,155]. Furthermore, a CHC-CoA biosynthetic gene cassette was introduced into the engineered strain to ensure the production of the precursor for directed biosynthesis of doramectin. The obtained strain synthesized higher amounts of doramectin (53 mg/L) than the wild type strain supplemented externally with the substrate CHC (9 mg/L). However, the quantity was significantly lower than that of the parental avermectin producer (500 mg/L). Nevertheless, the valuable "target" compound (doramectin) could be obtained by the module swapping approach, and the potential limitations leading to reduced production yields might be eliminated by applying diverse strain engineering methods (mutagenesis, synthetic biology, etc.).
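The titers quoted above can be put into proportion with a few lines of arithmetic. The following is a minimal sketch; the helper name `relative_yield` is introduced here for illustration only, and the comparison with the parental strain relates doramectin to avermectin titers, so it should be read as a rough orientation rather than a like-for-like yield.

```python
def relative_yield(titer_mg_per_l: float, reference_mg_per_l: float) -> float:
    """Ratio of a titer to a reference titer (both in mg/L)."""
    if reference_mg_per_l <= 0:
        raise ValueError("reference titer must be positive")
    return titer_mg_per_l / reference_mg_per_l


# Values taken from the text (mg/L).
engineered = 53.0     # doramectin, engineered S. avermitilis strain
fed_wild_type = 9.0   # doramectin, wild type strain fed with CHC
parental = 500.0      # avermectin, parental producer

fold_vs_feeding = relative_yield(engineered, fed_wild_type)
fraction_of_parental = relative_yield(engineered, parental)

print(f"{fold_vs_feeding:.1f}-fold over CHC feeding")
print(f"{fraction_of_parental:.1%} of the parental avermectin titer")
```

Under these figures, the engineered strain reaches roughly six times the titer obtained by feeding, but only about one tenth of the parental avermectin titer.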

In addition to the loading modules, chain extension modules were replaced, as was the case for DEBS PKS variants. For example, the second module of the bimodular mini-PKS DEBS (loading + DEBS module 1 + module 2) was swapped with cognate modules of the erythromycin assembly line (DEBS modules 3 and 6) and the heterologous module 5 from rifamycin [150]. The engineered DEBS-PKSs produced the expected triketide lactone in the heterologous expression host *S. coelicolor* CH999 (see also Section 3.1.1). This demonstrates that both approaches (exchange of a whole loading module or of a chain extension module) can result in engineered PKSs that are capable of assembling new polyketide derivatives.

#### *3.2. Site-Specific Mutagenesis of Acyltransferases*

An alternative to whole-domain swapping for altering AT substrate specificity is site-directed mutagenesis (Figure 5). Recent advances in sequencing methodology, bioinformatics, and structural and synthetic biology have contributed to the identification of amino acid signatures of ATs and have a significant impact on the engineering of PKS assembly lines [49,61,74,106,113,138,143,144,148,156–169].

Analysis of the amino acid sequences of ATs resulted in the identification of a region of approximately 100 residues C-terminal of the active-site serine, which could be assigned to different AT substrate specificities. Indeed, in most cases, the substrate specificity of the ATs could be predicted based on this motif. This observation encouraged the modification of the AT substrate specificity by exchanging the specific sequence signature. Such a strategy was applied, for example, to the model PKS-complex DEBS and its ATs. The methylmalonyl-CoA specific YASH-motif of AT1, AT4 and AT6 of DEBS was altered to the malonyl-CoA specific HAFH (and HASH) [60–62], which resulted in AT variants incorporating both building blocks (methylmalonate and malonate) (Figure 5). Mutagenesis of sequences outside this motif led to the generation of ATs transferring and loading non-native units onto the PKS; however, reduced efficiency was observed [61,62].
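The motif-based reasoning described above can be illustrated with a short sketch. It assumes, as a deliberate simplification, that a four-residue motif alone indicates the extender unit; the mapping is taken from the text, whereas the function names and the toy sequence are hypothetical and do not represent a real AT domain or an established prediction tool. As noted above, variants carrying the exchanged motif incorporated both building blocks, so such a lookup can only suggest a tendency.

```python
# Motif-to-extender-unit mapping as described in the text (simplified).
MOTIF_SPECIFICITY = {
    "YASH": "methylmalonyl-CoA",
    "HAFH": "malonyl-CoA",
    "HASH": "malonyl-CoA",
}


def predict_at_specificity(sequence: str) -> list:
    """Return (position, motif, predicted extender unit) for each motif hit."""
    seq = sequence.upper()
    hits = []
    for i in range(len(seq) - 3):
        motif = seq[i:i + 4]
        if motif in MOTIF_SPECIFICITY:
            hits.append((i, motif, MOTIF_SPECIFICITY[motif]))
    return hits


def swap_motif(sequence: str, old: str = "YASH", new: str = "HAFH") -> str:
    """Mimic a motif exchange by replacing the first occurrence of `old`."""
    return sequence.upper().replace(old, new, 1)


# Hypothetical toy fragment, not a real AT sequence.
toy_at = "AAAGHSQGAAAYASHAAA"
native_hits = predict_at_specificity(toy_at)
mutant_hits = predict_at_specificity(swap_motif(toy_at))
print(native_hits)
print(mutant_hits)
```

The sketch only scans for exact motif matches; a realistic predictor would consider the wider sequence region and structural context.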

**Figure 5.** Examples of site-directed mutagenesis of ATs [60]. (**a**) The fusion of DEBS 1 and the TE domain (DEBS 1-TE) from the erythromycin assembly line. The strain containing the modified fusion DEBS 1-TE produced the respective triketide lactones (products A). (**b**) The fusion DEBS 1-TE in which the AT of module 1 was modified by exchanging the specificity motif YASH (native, methylmalonyl-CoA) for the HAFH motif (malonyl-CoA). The strain *Saccharopolyspora erythraea* containing the modified DEBS 1-TE produced triketide lactones A and B (40:60). (**c**) The fusion DEBS 1-TE in which the AT of module 1 was modified by exchanging the specificity motif YASH for the HASH motif (malonyl-CoA). The strain *Saccharopolyspora erythraea* containing the modified DEBS 1-TE produced triketide lactones A and B (60:40). Abbreviations: AT, acyltransferase; ACP, acyl carrier protein; KS, ketosynthase; KR, ketoreductase; TE, thioesterase; mmAT, AT loading methylmalonate; mAT, AT loading malonate.

Harvey et al. exploited the features of the loading AT by mutating one of the ATs of DEBS and obtaining a new class of alkynyl- and alkenyl-substituted macrolides (e.g., 15-propargyl erythromycin A) with activities comparable to that of the natural product [170].

In other studies, the prototypical PKS (DEBS) was subjected to mutagenesis and the generated active site-mutant library was screened for substrate selectivity [61,63–65]. Several single amino acid substitutions, having an impact on the selectivity of the PKS, were identified. The substitution Tyr189Arg in DEBS-AT6 inverted the selectivity of the DEBS from its natural substrate toward an alkynyl-modified unit [65].

Bravo-Rodriguez et al. used a combined approach, which involved molecular modelling and mutagenesis of the AT6 of DEBS and showed that the V295A mutation of this AT leads to a wider active site and improves the promiscuity for non-native substrates [61,63,64].

Based on AT amino acid sequences, loading ATs (ATs incorporated into the loading module) usually have relaxed substrate specificity and load the loading module with diverse starter units. For example, the loading AT from the modular PKS of avermectin accepts more than 40 carboxylic acids as substrate for the loading module leading to the biosynthesis of a series of congeners [165]. Using modelling tools and site-directed mutagenesis targeting the active site residues, altered specificity toward a panel of synthetic substrate mimics was achieved [165].

Very recently, high-resolution X-ray crystal structures of the broadly selective SpnD-AT of the splenocin *cis*-AT PKS, which accepts long aliphatic chains up to C7-malonyl-CoA as well as aromatic benzylmalonyl-CoAs, were solved [171]. To the best of the authors' knowledge, their work provided the first structures of an AT-substrate complex and enabled the understanding of both the stereoselectivity and the broad substrate specificity of the SpnD-AT [171]. Furthermore, using the structural data, it was possible to mutate key residues of the canonical Ery-AT6 from the erythromycin PKS and "shift" its substrate preference. The modified Ery-AT6 was able to accept diverse bulky extender units [171]. These results suggest that future efforts to expand AT substrate tolerance should target the three important residues, Q150, Y278, S280 in methylmalonyl-CoA specific ATs or Q150, H278, F280 in malonyl-CoA specific ATs [171].

In addition to the ATs of *cis*-AT PKSs, the discrete ATs from the *trans*-AT PKS pathways [31,32,172] were investigated to identify engineering opportunities. While the *cis*-AT PKSs occasionally harbour ATs providing non-malonate building blocks to the assembly line, the ATs of the *trans*-AT PKSs are mostly malonyl-CoA-specific under natural, non-manipulated production conditions. The KirCII-AT [173] from the kirromycin pathway [174] is an exception. This AT loads a branched precursor (ethylmalonate) onto one particular ACP (KirACP5) under native, non-manipulated conditions and is therefore among the most promiscuous discrete ATs for regiospecific PKS engineering. To map the interaction epitope KirCII:KirACP5, the ACP was subjected to modelling and alanine scanning mutagenesis, in which 61 surface residues were individually mutated to Ala [175]. Additionally, several KirCII mutants were constructed and tested in vitro. The regions involved in the KirCII:KirACP5 interaction and in the *trans*-acylation activity were identified. Further in vitro investigation of the substrate specificity of KirCII in ACP loading assays revealed that the AT is able to load KirACP5 with allyl- and propargyl-malonate and, to a lesser extent, with azidoethyl- and phenyl-malonate [176]. These precursors are not available in the producer cell, and the in vivo production of the respective kirromycin derivatives required external supplementation (feeding) with the substrate [90]. Similar in vitro studies were conducted for the discrete AT (malonyl-CoA specific under native conditions) from disorazole biosynthesis. Wong et al. used alanine-scanning mutagenesis of one of the ACPs (ACP1 from DSZS) of the discrete AT and identified a conserved Asp45 residue in ACP1, which is co-responsible for the recognition of this ACP as substrate by the AT enzyme [106].

While the *cis*-ATs, such as the ATs of the erythromycin assembly line, are relatively well studied and have been manipulated several times, not much is known about the substrate flexibility and the exact mechanisms of partner (precursor or ACP) recognition for ATs from other PKS pathways (e.g., those of the *trans*-AT PKSs). The knowledge obtained from manipulations conducted in the past years will contribute to the understanding of the less-explored systems and help to overcome the remaining limitations of AT-based engineering of PKS pathways.

#### *3.3. Cross-Acyltransferase Complementation*

In principle, an AT-inactivated *cis*-AT PKS can be complemented in two ways. One option is the insertion of an AT domain (synthetic, from another module of the native assembly line, or derived from heterologous assembly lines) in place of the original domain into the PKS module (*cis*-to-*cis* AT-complementation). Another approach is the complementation of the PKS by introduction of a discrete AT, which provides the building blocks in *trans* (*cis*-to-*trans* AT-complementation) (Figures 3 and 6 and Table 1).

**Figure 6.** Example of AT cross-complementation [66]. (**a**) Part of the erythromycin biosynthetic pathway illustrating the DEBS complex and the chain intermediates on modules 4, 5 and 6, and the product 6-deoxyerythronolide B (6dEB). (**b**) Part of the erythromycin biosynthetic pathway illustrating the DEBS complex in which the AT6 was inactivated (AT6null) and complemented by a discrete enzyme malonyl-CoA:ACP transacylase (MAT) from *Streptomyces coelicolor*. The strain produced 2-desmethyl-6dEB. Abbreviations: AT, acyltransferase; ACP, acyl carrier protein; KS, ketosynthase; KR, ketoreductase; TE, thioesterase; mmAT: AT loading methylmalonate.

Most of the past AT complementation studies were carried out using the PKSs of the model *cis*-AT PKS assembly line erythromycin from *Sacch. erythraea* and related polyketide pathways. Their modification by deletion of the native AT and integration of a foreign AT, often regarded as domain substitution or swapping, was described in Section 3.1, as well as in diverse reports and reviews [48–50,65,138,142–144,167,168,177–182]. Here, we provide examples of cross-AT complementation, where the functionality of an AT-inactivated PKS (PKS module) was restored by an external heterologous AT protein (e.g., introduction of a heterologous AT into a mutant strain containing an AT-inactivated PKS).

Although the *trans*-AT PKS pathways were overlooked for a long time and are less well characterized than the *cis*-AT PKSs, the number of biosynthetic assembly lines of this type is constantly increasing [32,183]. In most cases, discrete ATs (freestanding ATs or *trans*-ATs) exhibit malonyl-CoA specificity in the native producer strain (e.g., disorazole AT Dis/Dsz [106], bryostatin AT BryP [67], bacillaene AT PksC [184], rhizopodin ATs RizA and RizF [185], and kirromycin AT KirCI [186]), and the diversification of the polyketide chain takes place after the elongation, by reduction, methylation, and other modifications. However, only very few *trans*-AT PKS pathways that involve discrete ATs with non-malonyl-CoA specificity under native production conditions have been identified. The examples experimentally confirmed so far include the ethylmalonyl-CoA-specific KirCII [173] from kirromycin, the methoxymalonyl-ACP-specific OzmC [187] from oxazolomycins, and the (2*S*)-aminomalonyl-ACP-specific ZmaF [109] required for zwittermicin A biosynthesis. The discrete ATs might exhibit relaxed substrate specificity when the respective precursor is available. This was shown, for example, for KirCII [90,173,176]. Considering this observation and the fact that the freestanding ATs are encoded by separate genes and act *in trans* to complement the cognate PKSs, their development into a tool for PKS engineering might be easier than that of the ATs embedded in the PKS (*cis*-AT PKSs). In in vitro studies, it was demonstrated that *trans*-ATs from kirromycin (KirCII) and disorazole (Dis/Dsz AT) can complement the AT-null DEBS of the *cis*-AT type [172]. In the past, the AT-null-DEBS module 6 (DEBS, in which the AT domain of module 6 was deleted) was used for similar complementation experiments. For instance, the functionality of the AT-inactivated module was restored after supplementation with the malonyl-CoA:ACP transacylase from *S. coelicolor* (Figure 6), which led to the production of 2-desmethyl-6dEB [66]. The complementation of AT-null DEBS module 6 was also demonstrated in combination with the AT domains of the discrete tandem AT BryP from the bryostatin PKS [67] (from *Candidatus* Endobugula sertula, the bacterial symbiont of the marine bryozoan *Bugula neritina*). BryP was also able to catalyse the acyl-transfer onto ACPs of pikromycin and other bryostatin PKS modules [67].

In addition, an in vivo cross-species complementation, using ATs from two *trans*-AT PKSs was successful for the Δ*mmpC*-AT1 mutant from mupirocin [188] (*Pseudomonas fluorescens*). *P. fluorescens* produces a mixture of several pseudomonic acids, named mupirocin [189]. Constructs for BryP-AT1 and the didomain protein BryP-AT1AT2 complementation were introduced into the Δ*mmpC*-AT1 mutant [67]. The complementation of the Δ*mmpC*-AT1 mutant expressing the BryP-AT1 restored the biosynthesis of the main compound of mupirocin, the pseudomonic acid A (PA-A), to approximately 83% of WT production levels.

The efficiency of *trans*-AT complementation is often affected by the identity of the building block and the ACP [172,175]. Thus, the successful complementation and production of efficient modified PKSs still needs a better understanding of the discrete AT-substrate (AT-precursor and AT-ACP) recognition and interaction as well as experimentally verified data, which disclose if the gained engineering knowledge can be transferred to diverse assembly lines or which modifications are required to recover the systems.

#### **4. Advances in Natural Science and Future Perspectives of AT-Based PKS Engineering**

The recent progress and advances related to polyketides and their biosynthetic machineries provide valuable and essential know-how, which makes the AT-based engineering of PKS assembly lines even more promising than it was before. Nonetheless, this strategy is not yet a standard, high-throughput technology, which is applicable to any desired PKS enzyme. To enable an extensive and efficient construction of functional polyketide machineries, further analysis and experimentally confirmed insights into the complex systems, including non-canonical PKSs are indispensable. The understanding of the structural, mechanistic, catalytic, biochemical details not only for individual domains, but the whole assembly line would make the engineering and tuning more predictable and executable. Furthermore, directed optimization of the producer cell "hosting" the biosynthetic complex would lead to improvements in the productivity of those engineered systems. Today's sequencing technologies, bioinformatics, structural biology, chemistry, molecular biology, and other in vitro and in vivo tools facilitate the further investigation of the PKS systems and the PKS engineering. Researchers aiming at AT-based PKS engineering will most probably continue to use the established and approved strategies and methodology. Those include in general, the expanding of the pool of precursors [97,98,190] for the ATs, the modification of the AT enzymes under the consideration of the obtained structural and mechanistic insights, AT-domain swapping and cross-complementation of inactivated ATs with ATs from other modules/pathways, and site-directed mutagenesis to exchange the substrate specificity of the AT. Specifically, the information existing in databases [191,192] is incredibly helpful, not only for direct sequence alignments and phylogenetic analysis [193], but also for the development of advanced bioinformatics tools supporting the AT-targeted PKS engineering [194]. 
The application of such tools allows the identification of new PKS pathways [195] and/or novel types of ATs, the prediction of AT substrate specificity [61,62,158], the analysis of amino acid coevolution in protein sequences [196–198], and even protein interaction surfaces [41,199,200]. Recently, the computational online platform ClusterCAD, which facilitates the selection of natural *cis*-AT PKS parts to design novel chimeric PKSs for the biosynthesis of small PKS-derived compounds, was developed [144,201].

The protein structures of PKS domains [112,127,162,169,202–205], including the ATs [113,161,162,171,184] and whole modules [120,206] enable a more reliable modelling of the architecture of those enzymes and will improve the structure-based PKS engineering. Diverse protein modelling platforms are already available [207–212] and were successfully exploited. For example, the modelling of the erythromycin DEBS AT6 with (2*S*)-methylmalonyl-CoA confirmed the role of the proposed active-site residues and revealed residues important for the substrate binding [61]. The obtained information was used for simulations on mutants with the native methylmalonyl-CoA and non-native malonyl-CoA extender units, which led to the identification of residues prohibiting the binding of the desired substrate. Consequently, the respective sites were mutated and mutants, which were able to utilize the substrate 2-propargylmalonyl-SNAC were generated [61].

In another study, small-angle X-ray scattering (SAXS) was used to characterize the structure of a module (*apo* module 5 from the VirA subunit) and to identify the positions of domains flanking the ACP and KS in the virginiamycin PKS [213]. The outcome of this analysis enabled modelling of the complete intersubunit interface in the virginiamycin *trans*-AT PKS system [125].

These results encourage the implementation of computational tools, which contribute to a better understanding and more efficient engineering of the complex PKS assembly lines. It is very likely that bioinformatics software tools will become more important due to their intensive use for diverse simulations and analyses of structural details of PKSs and their AT domains, as well as for the design of engineering experiments in the future. Furthermore, the existing databases will be complemented with the knowledge (e.g., structural data on non-canonical ATs) gained from planned and ongoing research on PKSs. This will lead to the development of new computational tools (or tool features) and improve the predictability of PKS variation as well.

In addition, structural biology techniques themselves are constantly improving. A new method for the direct delivery of the sample into an X-ray free-electron laser was used for the *trans*-AT from the disorazole PKS [161]. The novel sample extractor efficiently delivered limited quantities of microcrystals directly from the native crystallization solution into the X-ray beam at room temperature. A crystal structure of the discrete AT with a resolution of 2.5 Å was obtained [161]. The crystallization of these difficult-to-handle enzymes and their complexes (e.g., AT-substrate) is highly desired, as it will provide useful information about the ATs and their interaction partners. We speculate that the progress in this field will continue and support the engineering efforts.

Directed evolution [214–218] was successfully applied to generate enzyme libraries. This approach usually uses saturation mutagenesis, error-prone polymerase chain reaction (PCR) or DNA shuffling to generate a library of mutated proteins. Subsequently, the protein variants are screened in an adequate assay to identify the most promising mutants. The selected mutants are subjected to another cycle of mutagenesis to "evolve" the enzyme to the favoured protein version. This artificial process of repeated cycles of mutagenesis and selection mimics evolution in nature. Although high-throughput directed evolution of PKSs and AT enzymes is mainly limited by the lack of rapid screening (assays and analytics for detection of the enzyme catalysis), progress can be observed [219–221]. The directed evolution screening/selection strategies that have been employed to improve or modify the functions of nonribosomal peptide synthetases (NRPSs) and PKSs were summarized in a recently published review by Rui and Zhang [222]. Directed evolution strategies enable the identification of residues affecting the enzyme properties, which might be difficult to detect using alternative methods. Thus, the directed evolution approach provides a good alternative for PKS and AT optimization.

In the future, those AT-based PKS engineering strategies and methodologies that rely on DNA manipulation steps will be supported by advanced cloning methods such as the Red/ET system [223–225], TAR cloning [226,227], USER cloning [228–230], CRISPR-Cas9 [231–235], and other synthetic biology tools [138,143,144,236–238].

Finally, the efficient production of natural and engineered polyketides often requires the tuning of the native producer strain or the development of suitable heterologous expression hosts [182,239,240]. In the past, diverse actinomycetes [241–245], *E. coli* [246–248], *Bacillus* [249], *Saccharomyces* [250,251] and/or *Aspergillus* strains [252,253] were preferably used for heterologous expression of polyketide pathways. It is very likely that the existing hosts will be further optimized and specifically adapted to the expression of engineered pathways, including AT-modified PKSs. Most probably, new potential hosts will be identified to meet the needs of expressing the respective assembly lines and producing the final engineered compound.

#### **5. Conclusions**

In the past, many attempts at combinatorial biosynthesis aiming at engineered PKSs resulted in inactive or inefficient enzymes and/or assembly lines. The newly obtained knowledge about the ATs and their PKS pathways, in combination with methodological advances, provides exciting new perspectives for AT-based PKS engineering. The design and generation of functional and efficient PKS pathways might enable the production of new bioactive compounds whose structures would be extremely difficult to access using traditional synthetic chemistry approaches.

**Author Contributions:** The review was written and edited by the authors E. M.-K. and W.W.

**Funding:** The authors and the work in their laboratory are supported by the Eberhard Karls Universität Tübingen, the DFG, the BMBF (FKZ 031L 0018A, ERASysApp), and Biovet (Sofia, Bulgaria).

**Conflicts of Interest:** The authors declare no conflict of interest.


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **Production of β-Lactamase Inhibitors by *Streptomyces* Species**

### **Daniela de Araújo Viana Marques 1,\*, Suellen Emilliany Feitosa Machado 2, Valéria Carvalho Santos Ebinuma 3, Carolina de Albuquerque Lima Duarte 4, Attilio Converti <sup>5</sup> and Ana Lúcia Figueiredo Porto <sup>6</sup>**


Received: 30 May 2018; Accepted: 12 July 2018; Published: 17 July 2018

**Abstract:** β-Lactamase inhibitors have emerged as an effective alternative to reduce the effects of resistance against β-lactam antibiotics. The *Streptomyces* genus is known for being an exceptional natural source of antimicrobials and β-lactamase inhibitors such as clavulanic acid, which is widely applied in clinical practice. To protect against the increasing prevalence of multidrug-resistant bacterial strains, new antibiotics and β-lactamase inhibitors need to be discovered and developed. This review provides an update on the main β-lactamase inhibitor producers belonging to the *Streptomyces* genus; on advanced methods, such as genetic and metabolic engineering, to enhance inhibitor production compared with wild-type strains; and on fermentation and purification processes. Moreover, clinical practice and commercial issues are discussed. The commitment of companies and governments to develop innovative strategies and methods to improve the access to new, efficient, and potentially cost-effective microbial products to combat antimicrobial resistance is also highlighted.

**Keywords:** actinobacteria; β-lactamase; resistance; antibiotic; β-lactamase inhibitor

#### **1. Introduction**

Infectious diseases, which are caused by bacteria, viruses, parasites, or fungi, remain the main cause of mortality worldwide. Despite the success of antibiotics and advances in their production and purification, bacterial infections, which affect children and elderly people in particular, cause approximately 17 million deaths per year. One of the main reasons for this is the increasing prevalence of antibiotic-resistant strains [1].

Penicillin, the first antibiotic to be discovered, was found by Alexander Fleming in 1928, and since then, antibiotics have been essential for healthcare [2]. Although the first antibiotic producer discovered was a *Penicillium* strain, many other organisms have this potential, and since 1942, the *Streptomyces* genus has been known for its extraordinary ability to produce secondary metabolites, mainly antibiotics, nearly two-thirds of which occur naturally. This genus is one of the largest genera as it contains over 500 species, described along with a variety of species recognized as Actinomycetes [3,4].

*Streptomyces* are among the most versatile soil microorganisms, given their high metabolite production rate and the large variety of biotransformations. Overall, intracellular mechanisms control the accumulation of metabolites, which depends on process variables, types of nutrients, their concentrations, and operating conditions in submerged culture [5,6]. Therefore, the study and selection of appropriate culture medium composition is essential to ensure high productivity and low costs of the production process [7]. Additionally, many natural antibiotics must be purified after the production process through cost-effective methods that enable one to recover the final product at the highest level of purity and yield [8].

The first *Streptomyces* species used in industrial antibiotic production were *S. griseus* and *S. venezuelae*, which yielded streptomycin and chloromycetin, respectively [8]. Subsequently, between 1945 and 1978, a large number of antibiotics produced by the genus *Streptomyces* (over 55% of available antibiotics) were detected [9].

The first important antibiotic used in clinical practice was penicillin G (benzylpenicillin), a β-lactam compound that prompted researchers to develop other derivatives. The β-lactam ring present in the structure of this drug, which is typical of this class of antibiotics, acts by binding tightly to the penicillin-binding proteins (transpeptidases), affecting cell wall biosynthesis in both Gram-negative and Gram-positive bacteria [10,11].

Afterwards, as a result of the rapid replication, recombination, and high mutation rate of bacteria, resistance to β-lactam antibiotics emerged among β-lactamase-producing organisms. These enzymes act directly on the β-lactam ring, inactivating the antibiotic. The first β-lactamase was a plasmid-mediated enzyme [12], named TEM after the patient (Temoniera) from whom the enzyme-producing *E. coli* strain was isolated. Subsequently, a plasmid-mediated enzyme with similar biochemical properties was detected and named TEM-2. In response, efforts to discover inhibitors able to inactivate β-lactamases began around 1970. Among them, clavulanic acid, which was obtained by screening natural products and also possesses a β-lactam ring, was found to be a potent inhibitor of staphylococcal penicillinases and other plasmid-encoded penicillinases present in enteric bacteria, including TEM [13].

Nevertheless, antibiotic resistance continued to spread, and other bacteria began to produce similar enzymes; thus, the main challenge is to discover novel inhibitors with activity against a broad spectrum of enzymes from multiple classes [10]. Genetic engineering has been an alternative tool to achieve this aim: a better knowledge of the expression of regulatory proteins in mutant organisms has suggested that antibiotic production might be influenced by these regulatory events [14].

In accordance with the points explained above, this review provides a summary of the β-lactamase inhibitors produced by *Streptomyces* species. Herein, we emphasize topics such as antibiotic resistance, β-lactamase inhibitor producers, production and purification processes, use of β-lactamase inhibitors in clinical practice, and commercial aspects.

#### **2. Antibiotic Resistance**

As previously mentioned, antibiotics produced by *Streptomyces* spp. were detected in 1942 with the discovery of streptomycin, and since then, researchers intensified the search for similar antibiotics within this genus [15].

However, as a mechanism of resistance, bacteria started to produce enzymes with the capacity to hydrolyze β-lactam antibiotics (β-lactamases), hence decreasing their efficiency. Meanwhile, penicillin-resistant strains of *Staphylococcus aureus* and *Streptococcus pneumoniae* emerged, which led to the proposition that the resistance mechanism could be intrinsically associated with the genomes of these bacteria [16–20]. The same phenomenon was observed with Gram-negative bacteria belonging to the *Neisseria* genus [21] and other streptomycin-resistant bacteria [22]. Biofilm formation by microorganisms is also associated with chronic, recurrent human infections and resistance [23]. The spread of resistant strains is also linked to the migration of people, who disseminate resistant strains among populations in remote communities where the use of antibiotics is very restricted [24].

In a bacterial infection, antibiotics act in two stages: in the former, a large part of the bacterial cells are killed, while in the latter, a few of them stay alive and are called persistent bacteria. If the antibiotic is withdrawn, these survivors begin a new multiplication, leading to a slow elimination of infection and recurrent bacterial infections. This phenomenon, which is called persistent sub-minimum inhibitory concentration (MIC), is observed in most bacteria. For instance, tuberculosis, which is caused by *Mycobacterium tuberculosis*, is treated for at least six months, and when the treatment is not continuous, recurrence of infection is commonly observed. Although different from resistant bacteria, the persistent ones are of great concern to the medical field, and the study and knowledge of the mechanisms of this phenomenon are of paramount importance [25].

These sub-MICs suggest a greater selection of resistance, increase bacterial mutation rates, cause phenotypic and genotypic variability, and affect the biofilm formation. In addition, they promote the maintenance of horizontal transmission of resistance genes [26] and can be especially important in multispecies communities, where even small changes in species interaction can have cascading effects [27].

Many Gram-negative bacteria, such as *Haemophilus influenzae*; *Klebsiella pneumoniae*; *Acinetobacter baumannii*; *Pseudomonas aeruginosa*; *Enterobacter* spp.; *Escherichia coli*; *Serratia* spp.; *Proteus* spp.; *Providencia* spp.; *Helicobacter pylori*; *Salmonella* spp.; *Neisseria gonorrhoeae*; *Shigella* spp.; and some Gram-positive ones such as *Enterococcus faecium*, *S. aureus*, and *S. pneumoniae*, were classified by the World Health Organization (WHO) as "ESKAPE" pathogens, with an extremely important role in the antimicrobial resistance era [28,29].

In addition, another interesting theory is that, unlike antibiotic-producing fungi, *Streptomyces* species defend themselves from antimicrobial attack. These microorganisms possess a self-resistance mechanism to avoid suicide, which suggests that antimicrobial resistance is a natural event preceding the current selective pressure of clinical antibiotic use [30]. The main mechanisms involved in β-lactam antibiotic resistance are inactivation of the antibiotic by β-lactamases and modification of penicillin-binding proteins (PBPs). Furthermore, about 90 years after the discovery of penicillin, β-lactam-based therapies are still widely used against some illnesses. The chemical structures of these drugs have been extensively modified in the recent past, and this approach underlies most currently available antibiotics, which are known as second-, third-, and fourth-generation antibiotics [31,32]. Despite this progress, the indiscriminate use of antibiotics may be responsible for a microbial war in the near future [33].

Beta-Lactam antibiotics were also the first natural antibacterial compounds to be successfully developed and modified by the pharmaceutical industry [34]. This class of compounds has in common a four-membered β-lactam ring associated with different structures (Figure 1). They act by inhibiting bacterial cell wall peptidoglycan biosynthesis, thereby leading to cell lysis and death [35,36]. Specifically, β-lactam antibiotics bind to and acylate the active site of PBPs, which are essential for bacterial cell wall biosynthesis [37].

Beta-Lactam antibiotics have been widely employed for several years to treat infectious diseases because of their high specificity and strong killing efficiency [37]; however, the bacterial capacity to develop mechanisms of resistance to β-lactam drugs is one of the major problems of health care [37,38]. Among the mechanisms of antibiotic resistance, we can cite the production of efflux pumps, modification or reduced production of outer membrane porins (in Gram-negative bacteria), alteration of PBPs (the molecular target of β-lactams), and production of β-lactamases [39]. In this review, we will focus on the last resistance mechanism and on ways to overcome this problem by means of biotechnology.

**Figure 1.** Backbone composition of β-lactam structures with the β-lactam ring highlighted (Structures drawn in ChemSpider).

#### **3. Producers of** β**-Lactamases**

Because β-lactamases act by breaking the amide bond present in the β-lactam ring of penicillins, they abolish the biological activity of the drug, leaving the antibiotic unable to kill bacterial cells [40]. The microbial production of these enzymes strongly depends on cell structure. Gram-negative bacteria, with their inner and outer cell membranes separated by the periplasmic space, might be considered structurally more complex microorganisms than the Gram-positive ones, which possess only a rigid cell wall composed of several peptidoglycan layers. As a result, Gram-negative bacteria produce high amounts of β-lactamases compared with the Gram-positive ones. Beta-Lactamase production by the former group can occur by induction (direct interaction of a β-lactam with the microorganism regulatory system) or constitutively, while that by Gram-positive bacteria is not yet well understood [37]. Among the Gram-negative bacteria, the β-lactamase production level can vary greatly; as an example, the active β-lactamase level in some enteric bacteria can exceed 4% of total soluble protein [41].

The enzyme characteristics are quite dependent on the original host. More than 1600 proteins with the capacity to hydrolyze β-lactam rings have already been described [42], which, on the one hand, demonstrates the diversity of β-lactamase structure [43], but on the other, hinders understanding of their mechanism of action [42]. Efforts have been made to classify these enzymes, which were mainly based on their molecular structure (full nucleotide, amino acid sequence, and conserved motifs), functional characteristics (including substrate and inhibition profiles), or variances in structural features [43]. According to the Ambler classification, β-lactamases are grouped in classes A, B, C, and D. Classes A, C, and D include enzymes that use serine in their active site to hydrolyze the substrate, forming an acyl-enzyme intermediate, while those belonging to class B are metallo-β-lactamases that require a bivalent ion, usually Zn2+, for the hydrolysis [10,44].
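The Ambler scheme summarized above can be written down as a small lookup table. The following Python sketch is only an illustration of the classification as described in this section; the names `AmblerClass`, `AMBLER_CLASSES`, `uses_active_site_serine`, and `serine_classes` are hypothetical and do not belong to any established library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AmblerClass:
    """One Ambler class as summarized in the text (illustrative only)."""
    name: str
    mechanism: str            # "serine" or "metallo"
    metal_ion: Optional[str]  # required bivalent ion, if any


# Classes A, C, and D use an active-site serine; class B enzymes are
# metallo-beta-lactamases that require a bivalent ion, usually Zn2+.
AMBLER_CLASSES = {
    "A": AmblerClass("A", "serine", None),
    "B": AmblerClass("B", "metallo", "Zn2+"),
    "C": AmblerClass("C", "serine", None),
    "D": AmblerClass("D", "serine", None),
}


def uses_active_site_serine(ambler_class: str) -> bool:
    """Return True if the given Ambler class hydrolyzes via a serine residue."""
    return AMBLER_CLASSES[ambler_class.upper()].mechanism == "serine"


def serine_classes() -> list:
    """Return the names of all serine-based classes in alphabetical order."""
    return sorted(k for k, v in AMBLER_CLASSES.items() if v.mechanism == "serine")


if __name__ == "__main__":
    for name in sorted(AMBLER_CLASSES):
        entry = AMBLER_CLASSES[name]
        print(name, entry.mechanism, entry.metal_ion or "-")
```

Keeping the mechanism as an explicit field makes it easy to separate serine enzymes from metallo-enzymes when reasoning about which inhibitors can apply.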

On the other hand, Bush, Jacoby, and Medeiros in 1995 [43] classified β-lactamases in four groups based on their functional features: (1) cephalosporinases not fully inhibited by clavulanic acid; (2) penicillinases, cephalosporinases, and broad-spectrum β-lactamases inhibited by active site-directed inhibitors of β-lactamases; (3) metallo-β-lactamases that are able to hydrolyze the main β-lactam antibiotics, but are scarcely inhibited by typical inhibitors of β-lactamases; and (4) penicillinases that are not well inhibited by clavulanic acid.

Later, Bush and Jacoby [44] expanded this classification to include subgroups with more recently discovered β-lactamases. As an example, group 2d comprises the OXA enzymes (oxacillinases) that can hydrolyze cloxacillin and oxacillin [44]. Table 1 summarizes the groups of β-lactamases, while a more detailed description can be found in the work of Bush and Jacoby [44].


**Table 1.** Classification of β-lactamases based on Bush and Jacoby [44].

According to the work of Laxminarayan et al. published in 2016, around 214,000 neonatal sepsis deaths worldwide are related to resistant pathogens each year [45]. To overcome the problem of antimicrobial resistance due to β-lactamase production, the pharmaceutical industry has employed two strategies: (i) development of β-lactam antibiotics resistant to these enzymes (as an example, the expanded-spectrum cephalosporins and carbapenems); and (ii) the application of β-lactamase inhibitors (BLIs) associated with a β-lactam antibiotic [34]. However, the first strategy has not proven very effective because bacteria have developed the capacity to produce enzymes with enhanced features such as extended-spectrum β-lactamases (ESBLs) [40] and carbapenemases [45]. Therefore, application of BLIs may be a valid approach to overcome the problem of antimicrobial resistance.

#### **4. Inhibitors of** β**-Lactamases**

Although they resemble β-lactam antibiotics in structure, BLIs are compounds with weak antibacterial activity. Thus, the therapeutic strategy is to co-administer BLIs with penicillins and cephalosporins [36] in two different ways: (i) employing substrates that bind the enzyme reversibly and/or irreversibly, forming unfavorable steric interactions; or (ii) developing mechanism-based or irreversible "suicide inhibitors" such as clavulanic acid (CA), sulbactam, and tazobactam (Figure 2) [10,46].

**Figure 2.** Chemical structure of β-lactamase inhibitors (Structures drawn in ChemSpider).

CA is a natural drug isolated in 1976 from *Streptomyces clavuligerus* fermentation broth [41], while sulbactam and tazobactam are BLIs developed by synthetic route in 1980 [40]. CA binds irreversibly with the serine hydroxyl group present in the active site of the enzyme, producing a stable acylated intermediate and inactivating the enzyme. As a result, the antibiotic co-administered with CA performs its main action [47]. Augmentin™ (amoxicillin and potassium clavulanate) [40] and Timentin™ (ticarcillin and potassium clavulanate) [47], both produced by GlaxoSmithKline [10], are examples of combinations available in the market. Drugs combined with CA have proven clinical efficacy against several bacteria (both Gram-negative and Gram-positive) as described by Drawz and Bonomo [10]. CA is combined with other β-lactam antibiotics in its salt form because of its instability under several conditions, such as acidic or alkaline conditions, and in the presence of salts [36].

Sulbactam and tazobactam have inhibitory spectra and mechanisms similar to those of CA [40]. Sulbactam is available on the medicine market in combination with ampicillin (Unasyn™ produced by Pfizer) and cefoperazone (Magnex™ produced by Pfizer) [10]. Sulbactam has advantages compared with other BLIs because it has its own activity against some *Acinetobacter baumannii* strains and does not induce class I (AmpC) chromosomal β-lactamases in Enterobacteriaceae [48]. Moreover, sulbactam combinations have not demonstrated strong selective pressures for ESBL-producing Enterobacteriaceae and vancomycin-resistant enterococci [48].

Tazobactam combined with piperacillin (Zosyn™ produced by Wyeth) has proven clinical efficacy against Gram-positive and Gram-negative pathogens [10]. Tazobactam has also been combined with ceftolozane, giving an antibacterial agent with potent activity against Gram-negative bacteria, including drug-resistant *P. aeruginosa* and many ESBL-producing Enterobacteriaceae [49]. Another combination used to broaden the spectrum of antibiotic action is that of piperacillin–tazobactam and vancomycin [50].

Although clavulanic acid, tazobactam, and sulbactam are commercially available BLIs, they exert an inhibitory effect only against serine-β-lactamases, mainly belonging to classes A, C, and D, which means that they are not inhibitors of metallo-β-lactamases [34]. Regarding metallo-β-lactamase inhibitors, the most important are the following: (i) thiol derivatives, such as thiomandelic acids, which bind at the hydrophobic pocket of the enzyme active site and bind to, or interfere with, the bonding network between the hydrolytic water and the Zn2+ ions [10]; (ii) dicarboxylates, such as succinic acid, which binds at the dinuclear metal center using one carboxylate to form a monodentate bridge between both zinc ions, and the second carboxylate to bridge between Zn2+ and a conserved Lys residue [51]; (iii) trifluoromethyl ketones and alcohols, the tetrazole portion of whose molecule interacts directly with the active-site Zn2+ ions; (iv) carbapenem analogs; (v) tricyclic natural products; and (vi) penicillin derivatives with a C-6-mercaptomethyl substituent [10].

Considering that bacterial resistance is a health care problem, the development of new β-lactamase inhibitors or the improvement of already-available BLIs is essential. Meropenem-RPX7009 and Biapenem-RPX7009 by Rempex Pharmaceuticals, both of which contain a boronate β-lactamase inhibitor, are still in the development phase [42]. Although boronic acids were first reported as serine-β-lactamase inhibitors many years ago, they have only recently been explored as the next generation of pan-β-lactamase inhibitors. Trigonal boron (III) compounds behave as Lewis acids and are prone to react with nucleophiles, resulting in tetrahedral covalent adducts able to resist enzymatic hydrolysis [34]. Another example is avibactam, a diazabicyclooctanone approved by the U.S. Food and Drug Administration (FDA) to be used in combination with ceftazidime (AstraZeneca Pharmaceuticals, Forest–Cerexa, Actavis–Allergan) [42] to treat complicated intra-abdominal infections (cIAI), complicated urinary tract infections (cUTI), and hospital-acquired pneumonia (HAP) [52].

Most of the β-lactamase inhibitors considered in the present review and available in the market are produced through synthetic routes. The exception is CA, which is produced by *Streptomyces* strains by fermentation. However, the development of biotechnology/bioprocesses in the last decades has opened a new perspective for developing new pharmaceutical products or improving already-established processes. Thus, we will present old and new approaches to overcome the problem of antibiotic resistance using biotechnology.

#### **5. Producers of** β**-Lactamases Inhibitors**

*Streptomyces* spp. are aerobic, filamentous, Gram-positive bacteria, which resemble fungi and occur naturally in different environments. These microorganisms are able to create a chain of spores developing a multicellular complex that, after the sporulation phase, forms hyphae with multinucleated mycelium [53]. These morphological characteristics make *Streptomyces* spp. the most adaptable microorganisms found in the soil, given the broad spectrum of metabolites they produce and their biotransformation processes.

The discovery of β-lactamase inhibitors introduced new ways to overcome the problem of antibiotic resistance, through drug research and development [54]. However, as they also contain the same β-lactam ring, they are as susceptible to time-limited application as the β-lactam antibiotics [55]. Resistance to the β-lactam/β-lactamase inhibitor combinations especially of Gram-negative microorganisms is an old and clinically difficult situation [11,56] that has stimulated the use of computation-based design methods to develop new β-lactamase inhibitors [54].

As previously mentioned, sulbactam and tazobactam are obtained by synthetic route [54], while clavulanic acid is a secondary metabolite naturally produced by the actinomycete *S. clavuligerus* [57] that belongs to the class of clavams to which it gives its name.

The discovery of clavams, in which an oxygen substitutes a sulfur of penicillins and cephalosporins, occurred during a screening of microorganisms looking for natural products able to inhibit β-lactamases [58,59]. They are structurally related to CA, which, however, is the only one exerting β-lactamase inhibitory activity. Such an activity may be related to the peculiar 3R, 5R stereochemistry of CA, while all the other clavams have a 3S, 5S stereochemistry, although some of them have antibacterial or antifungal characteristics [57,60].

According to Nobary and Jensen [61], *S. clavuligerus* produces the following 5S clavam compounds in addition to CA: 2-hydroxymethylclavam, 2-formyloxymethylclavam, clavam-2-carboxylic acid, and alanylclavam (Figure 3). Even though not clinically effective, these compounds can be used as precursors to produce CA through a different metabolic route.

**Figure 3.** Chemical structure of 5S clavam compounds obtained from *Streptomyces clavuligerus.* (Adapted from Nobary and Jensen [61]).

CA was identified later in other *Streptomyces* species, namely *Streptomyces jumonjinensis* and *Streptomyces katsurahamanus*, while a great variety of *Streptomyces* species showed the capacity to produce other clavam metabolites with structures similar to CA (clavaminic acid, valclavam, and clavamycins) [62]. More recently, genome-sequencing projects have resulted in deposits of DNA sequences that suggest that CA biosynthetic capability exists in a wider range of strains, including *Streptomyces flavogriseus* ATCC 33331 and *Saccharomonospora viridis* DSM 43017 [61]. The number of species that produce non-CA clavams exceeds that of CA producers, so the ability to biosynthesize CA is rather limited in its distribution [62].

Studies investigating the biosynthetic pathways and genes associated with the production of CA and other clavams have been performed [63–65]. CA biosynthesis involves early and late stages including oxidative reactions catalyzed by 2-oxoglutarate dependent dioxygenases. The "early" steps are largely accepted, but those of the "late" stage are poorly understood or even unknown. Gene clusters for the biosynthesis of CA, 5S clavams, and their paralogs have been identified in the *S. clavuligerus* genome sequence. According to Tahlan et al. [66], *ceaS2*, *bls2*, *pah2*, *cas2*, and *oat2* are genes involved in clavaminic acid biosynthesis, thus contributing to the production of both CA and 5S clavams, while other genes are only involved in that of CA.

CA obtained by fermentation is then recovered from the medium and purified several times. This production has been the subject of intensive research in recent years because of its clinical and commercial importance [61]. According to Costa and Badino [67] and Ünsaldı et al. [65], CA fermentation processes provide low CA concentrations, which hinders its large-scale chemical synthesis. Moreover, the commercial feasibility of CA is restricted because of its elaborate production process [68]. Thus, it is important to achieve higher production rates of this valuable compound by applying more efficient fermentation processes.

According to Viana-Marques et al. [69], certain nutrients present in the culture medium (such as C, N, P, S sources and salts) and compounds produced in the biosynthetic steps can enhance CA production. Additionally, there are many ways to increase CA production such as optimization of bioreactor operation, conditions of agitation and aeration, and medium composition [70].

After fermentation, the CA-containing fermented broth is clarified by centrifugation and/or filtration [71], and the clarified broth is subjected to primary extraction steps involving liquid–liquid extraction followed by adsorption procedures. We will discuss the production and purification processes used to obtain β-lactamase inhibitors later.

The production of new antibiotics, together with the investigation and manipulation of their biosynthetic routes, represents an important, efficient, and sustainable tool [60], with a chance to discover novel mechanisms of action [72]. To promote the production of new compounds, many strategies have been employed, such as cloning and heterologous expression of biosynthetic gene clusters, manipulating regulatory pathways, varying culture conditions, and co-culturing two or more organisms [73].

#### *5.1. Natural Microorganisms*

The search for bioactive metabolites like novel antibiotics produced by microorganisms for potential use in several industrial applications, mainly in the agricultural and pharmaceutical fields, has become more significant because of the spread of drug/multi-drug resistance in most pathogenic microbes [74]. Several antibiotics used in clinical practice and agriculture are produced by streptomycetes, and many other natural products of this genus have been developed into valuable therapeutic agents [62,75].

As mentioned above, *Streptomyces* members are Gram-positive bacteria that grow in different environments with a filamentous form, like fungi. Soil is the main niche from which they have been isolated, and they have been identified as prolific producers of effective bioactive compounds other than antibiotics, such as antivirals, anticancer agents, antihypertensives, and immunosuppressants [1,74].

*Streptomyces* species are considered the major producers of bioactive compounds for the biotechnology industry [76]. Genome sequence surveys on various actinomycetes indicate that each bacterium is able to produce about 10-fold more secondary metabolites than had been detected by screening before genome sequence data became available. This suggests that actinomycetes are still promising sources of novel bioactive compounds [77].

According to Chater [75], *Streptomyces* are being explored even more intensively in the hope that they will contribute substantially to the provision of new therapeutic products to face the global threat of antibiotic resistance among pathogenic bacteria, as well as to the supply of other bioactive agents for medical purposes.

More than 23,000 bioactive secondary metabolites obtained from microorganisms have been found, and over 10,000 of them are produced by actinomycetes [78]. In general, the genome sequencing of actinomycetes isolated from soil, or of marine and plant origin, has resulted in the identification of a range of clusters for secondary metabolites, in the quantity of about 20 to 30 clusters per genome. This report reveals unexpected truths in the discovery of new useful agents [79].

Of the 20 (or more) secondary metabolites produced by *S. clavuligerus,* many have clinical importance, like cephamycin C (CephC) and CA, which are synthesized simultaneously through different metabolic routes, rigorously controlling intra- and extracellular factors [80,81].

*S. clavuligerus* is a great model for analyzing the relations involved in the biosynthesis of secondary metabolites because of its productive diversity [82]; for this reason, it has been used in metabolic engineering to obtain new strains able to overproduce CA. In particular, amplification of biosynthetic genes encoding specific enzymes can lead to a faster and more rational method of obtaining strains with higher antibiotic production [62].

#### *5.2. Genetically-Modified Microorganisms*

Large-scale production of antibiotics by microbial fermentation has been the foundation of the industry since the development of penicillin in the 1940s. Titers of these products are nowadays very high after years of intensive improvement through strategies such as mutagenesis and selection. These strategies, which contribute to strain enhancement, were adopted early for the penicillin strain. For example, Olano et al. reported in 2008 a penicillin production by *Penicillium chrysogenum* higher than 70 g/L, while in 1949, the original strain produced only 60 mg/L, which represents more than a 1000-fold increase [83].
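The fold increase quoted above can be checked with a short calculation. The following Python sketch uses only the two titers given in the text; the function name `fold_increase` and the variable names are illustrative assumptions, and because the 2008 titer is reported as "higher than 70 g/L", the result is a lower bound.

```python
MG_PER_G = 1000.0  # unit conversion: milligrams per gram


def fold_increase(final_mg_per_l: float, initial_mg_per_l: float) -> float:
    """Ratio of two titers expressed in the same unit (mg/L)."""
    if initial_mg_per_l <= 0:
        raise ValueError("initial titer must be positive")
    return final_mg_per_l / initial_mg_per_l


# Titers quoted in the text for Penicillium chrysogenum.
original_titer_mg_per_l = 60.0             # original strain, 1949
improved_titer_mg_per_l = 70.0 * MG_PER_G  # >70 g/L reported in 2008

ratio = fold_increase(improved_titer_mg_per_l, original_titer_mg_per_l)
print(f"at least {ratio:.0f}-fold")  # prints "at least 1167-fold"
```

Converting both values to mg/L before dividing avoids the unit mismatch between g/L and mg/L, and the result is consistent with the "more than 1000-fold" statement.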

Random mutagenesis and selection techniques are frequently used to obtain the strain most suitable for industrial fermentations, aiming to get high amounts of secondary metabolites [82]. Such improvement can also be achieved through over-expression of global regulators, pathway-specific regulators, or biosynthetic genes. However, genetic engineering attempts to create high-yield strains of a specific product have rarely been successful; thus, enhancing production is still considered a challenge [64].

Furthermore, the deployment of new technologies such as DNA sequencing, transcription profiling, genomics, proteomics, metabolomics, transcriptomics, and metabolite profiling has offered new opportunities to engineer strains for obtaining high yields of natural products [83].

The improvement of microbial strains is also an important way to decrease the production costs of industrial fermentations. Currently, the mutation and selection technique is frequently used with success; however, it is very slow and labor-intensive [64].

According to Li and Townsend [82], the creation of a new generation of high-performing strains by this approach may take at least five years. Techniques of molecular genetics have been improved, and the ability to modify existing pathways or introduce non-native ones has progressed rapidly (Table 2). Advances in genetics, transcriptional analysis, proteomics, metabolic reconstructions, and metabolic flux analysis offer genetic engineering the chance to enhance the approaches for strain improvement in a targeted way.

**Table 2.** Genetic methods applied to improve the production of antibiotics and β-lactamase inhibitors (Adapted from Adrio and Demain [84]).


Most models of CA overproduction have come from manipulation of genes encoding biosynthetic enzymes or transcriptional regulators [62]. CA is obtained by submerged fermentation and then purified from the fermented broth in different ways; however, the most common protocol is centrifugation for cell separation, followed by liquid–liquid extraction with organic solvents and/or adsorption techniques and, finally, chromatography techniques [85].

#### **6. Production of** β**-Lactamase Inhibitors**

#### *6.1. Biosynthesis of Clavulanic Acid*

Studies carried out by Higgens and Kastner [86] and Brown et al. [87] were the first reports on the production of β-lactamase inhibitors by *S. clavuligerus* ATCC 27064 and their recovery from the fermented broth. Although more than 40 years have passed since then, their strategy remains one of the most frequently applied to produce important drugs for medicinal use [88].

The biosynthetic pathway to produce CA (Figure 4) has not been fully elucidated, although many intermediates and enzymes have already been isolated. Nonetheless, isotope studies, purification, and characterization of the enzymes involved in the process, together with genetic studies, have helped to clarify its biosynthetic pathway. Arginine (C5 precursor) and glyceraldehyde-3-phosphate (C3 precursor) were identified as two important precursors for clavulanic acid in *S. clavuligerus* [89,90].

**Figure 4.** Scheme of clavulanic acid biosynthesis. (Adapted from Oliveira et al. [89,90]).

#### *6.2. Production Process*

After the discovery of β-lactamase inhibitors with antibacterial activity reported by Brown et al. [87], many studies have been carried out in recent decades regarding the production processes of β-lactamase inhibitors [70]. The industrial production of CA is almost entirely based on *S. clavuligerus* cultivation in complex medium [91]. Because of the clinical importance of this compound, the increase in CA production has been the focus of several studies. Strategies to enhance CA production include optimization of batch or fed-batch operation [92,93], temperature [94–96], agitation and aeration [60,93], and medium composition [70,94,95], as well as the selection of new microbial strains [97].

The productivity of microbial metabolites is closely related to the submerged culture process. The selection of the most suitable medium composition is of primary importance to increase the productivity and decrease the cost of any bioprocess [7]. It is well known that extracellular microbial CA production is greatly influenced by medium components, especially carbon and nitrogen sources [70,94], salts composition [70,91,98], and pH [93,94,99]. However, no single medium has been established to optimize CA production by different strains, because each organism requires different conditions for maximum production, which is mainly controlled by intracellular effectors [100].

#### 6.2.1. Carbon Sources

Several carbon sources (glycerol, starch, sucrose, and lipid) have been used to produce CA [70,95,101–103]. Glycerol plays an important role in CA production as the C3 precursor of the molecule [104]. The formation of the C3 precursor did in fact appear to be the rate-limiting step of CA synthesis, while excess arginine, which is the C5 precursor, failed to increase CA production [105]. Bellão et al. [80] investigated the effect of carbon source and feeding conditions on the productions of CA and cephamycin C (CephC) by *S. clavuligerus*. In the experimental range studied, glycerol feeding conditions did not influence maximum CephC production (566.5 mg/L), whereas maximum CA concentration (1022 mg/L) was strongly dependent on culture conditions. These results are consistent with those reported by Saudagar and Singhal [93], who obtained higher amounts of clavulanic acid using glycerol in the production medium.

Wang et al. [106] achieved maximum CA production in shake flask batch cultivation at a glycerol concentration of 15.0 g/L, while Teodoro et al. did so in fed-batch culture at 120 g/L [107]. In a fed-batch study carried out in a 10 L bioreactor to determine the influence of glycerol feeding on CA production by *S. clavuligerus*, the highest CA production (1.6 g/L) was obtained using a glycerol concentration of 180 g/L, highlighting a positive effect of glycerol on CA biosynthesis [102]. Many authors have observed that a glycerol concentration above 15 g/L inhibited batch CA biosynthesis [94,100,108]. Viana et al. [94] observed a decrease in CA production by *Streptomyces* DAUFPE 3060 when glycerol concentration was raised from 5 to 10 g/L. The better performance of the fed-batch process compared with the batch one is likely due to its capacity to prevent substrate inhibition and metabolite repression, besides controlling the growth rate and prolonging the stationary phase.
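The contrast between batch and fed-batch operation described above can be made explicit by tabulating the glycerol concentrations quoted in this paragraph. The following Python sketch is only an illustration: the record layout and the function name `max_glycerol_by_mode` are assumptions, and the values are copied from the text as reported, without any normalization between studies.

```python
from collections import defaultdict

# (source as cited in the text, cultivation mode, glycerol concentration in g/L)
REPORTED_GLYCEROL = [
    ("Wang et al. [106]", "batch", 15.0),
    ("Teodoro et al. [107]", "fed-batch", 120.0),
    ("10 L bioreactor study [102]", "fed-batch", 180.0),
]


def max_glycerol_by_mode(records):
    """Return the highest quoted glycerol concentration for each cultivation mode."""
    best = defaultdict(float)
    for _source, mode, glycerol_g_per_l in records:
        best[mode] = max(best[mode], glycerol_g_per_l)
    return dict(best)


summary = max_glycerol_by_mode(REPORTED_GLYCEROL)
for mode in sorted(summary):
    print(f"{mode}: up to {summary[mode]:.0f} g/L glycerol")
```

The summary reflects the point made in the text: the fed-batch studies report much higher glycerol concentrations than the batch optimum of about 15 g/L, above which several authors observed inhibition of CA biosynthesis.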

Lipids are the preferred carbon source for CA production because *S. clavuligerus* is unable to utilize simple carbohydrates such as glucose. Oils are also preferred in terms of energy, as they contain approximately 2.4 times the energy of glucose [57]. Several authors have reported that oils may stimulate CA production [93,99,109–113]. Lee and Ho [109] identified palm and palm-kernel oils as the most suitable carbon sources for growth of *S. clavuligerus* and CA production. Large et al. [110] reported maximum CA production (80 mg/L) in a production medium containing unspecified lipids (C16 and C18 unsaturated and saturated fatty acids). Maranesi et al. [111], studying the use of vegetable oil in CA production by *S. clavuligerus* ATCC 27064, found the highest CA concentration (753 mg/L) in a medium containing 30 g/L soybean oil. Saudagar and Singhal [93] obtained similar CA concentrations in media containing palm oil and soybean oil.

Kim et al. [113] investigated the effect of oils on cell growth and CA production during *S. clavuligerus* NRRL 3585 fermentation. Triolein, whose only fatty acid is oleic acid, was the best oil source for CA production, but free fatty acids generated from oil hydrolysis affected both CA production and cell growth. In the same work, the authors screened for *S. clavuligerus* mutants resistant to high oleic acid concentrations and identified a mutant (*S. clavuligerus* OL13) whose oleic acid minimum inhibitory concentration (MIC = 2.1 g/L) was much higher than that of *S. clavuligerus* NRRL 3585 (0.4 g/L). Not only was cell growth improved, but the maximum CA concentration (1,950 mg/L) was also approximately twice as high as that of the parent strain.

Efthimiou et al. [112] described an increase in CA production when 47 mg/L olive oil was used instead of 25 mg/L glycerol as the sole carbon source. In a similar study on the effects of different vegetable oil-based media on cell growth and CA production during *S. clavuligerus* ATCC 27064 cultivation, Salem-Berkhit et al. observed that three out of eight tested oils supported CA production [99] and that the olive oil-containing medium ensured a CA concentration twice as high as glycerol-containing medium.

The use of vegetable oils as the sole carbon source can support bacterial growth and enhance CA production, but a careful choice of the oil is essential to avoid compromising the CA yield [112]. These findings can be explained by the residual oil levels in the culture medium [114] and the high oxygen requirement for oil metabolism. Residual oil may increase medium viscosity and necessitate additional downstream processing [93].

Comparing glycerol and sucrose as the sole carbon source, Lee and Ho [109] observed no production of CA in a glycerol-containing medium, but high production in sucrose-containing medium. Similar findings were reported by Ives and Bushell [105], who observed no CA production in glycerol-containing C-limited medium. Another study by Thakur et al. [101] demonstrated that the addition of dextrin or glycerol as the sole carbon source neither improved nor decreased CA production. However, two studies reported a totally different observation, that is, a glycerol-containing basal medium allowed for a maximum CA level (348.5 mg/L) about twice as high as a starch-based medium [80,93].

Additionally, two other studies by Saudagar and Singhal [93] and Chen et al. [100] revealed a biphasic dose response of glycerol, whereby CA production was inhibited at either too high or too low concentrations.
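A biphasic response of this kind is often described with a Haldane-type substrate inhibition model, in which the specific production rate rises at low substrate concentration and falls again at high concentration. The following minimal sketch illustrates the shape of such a curve; the model choice and all parameter values (`Q_MAX`, `K_S`, `K_I`) are assumptions made purely for illustration and are not fitted to any of the cited studies.

```python
import math


def haldane_rate(s, q_max, k_s, k_i):
    """Specific production rate under Haldane-type substrate inhibition.

    s     : substrate (e.g., glycerol) concentration, g/L
    q_max : rate constant (arbitrary units)
    k_s   : half-saturation constant, g/L
    k_i   : inhibition constant, g/L
    """
    return q_max * s / (k_s + s + s * s / k_i)


def optimal_substrate(k_s, k_i):
    """Concentration that maximizes the Haldane rate: sqrt(k_s * k_i)."""
    return math.sqrt(k_s * k_i)


# Hypothetical parameters, chosen only so that the curve has a clear maximum.
Q_MAX = 1.0   # arbitrary units
K_S = 5.0     # g/L (assumed)
K_I = 45.0    # g/L (assumed)

if __name__ == "__main__":
    s_opt = optimal_substrate(K_S, K_I)
    print(f"rate-maximizing concentration: {s_opt:.1f} g/L")
    for s in (1.0, 5.0, s_opt, 50.0, 150.0):
        rate = haldane_rate(s, Q_MAX, K_S, K_I)
        print(f"S = {s:6.1f} g/L -> relative rate = {rate:.3f}")
```

In such a model, concentrations on either side of the optimum give lower rates, which is qualitatively the behavior described for glycerol; whether CA production actually follows this functional form would have to be checked against experimental data.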

#### 6.2.2. Nitrogen Sources

Soybean derivatives (flour, protein isolate, and meal) have been used as nitrogen sources for CA production [7,67,70,94,95,97,106,107]. To screen medium constituents for fermentative CA production by *S. clavuligerus*, Rodrigues et al. [70] performed fermentations using soybean protein isolate (SPI) and soybean flour (SF) as the primary nitrogen source and obtained a higher CA concentration (437 mg/L) with the former. With SF, on the other hand, CA production remained steady for a long time, likely because this ingredient induced the release of extracellular proteases by *S. clavuligerus*, which hydrolyzed it during the growth phase and provided a steady supply of essential nutrients to the microorganism [115].

Being a by-product of oil extraction, SF has been recognized as a potentially useful and cost-effective ingredient. It consists of approximately 40% proteins and is rich in other organic and inorganic compounds, thus being a good candidate for a culture medium [116]. According to Chen et al. [102], soybean derivatives, such as soy meal flour and soybean protein hydrolyzates, are excellent components of media for CA production because they contain arginine, the precursor of CA.

Viana et al. [94], in their attempt to investigate the effect of SF concentration on CA production by the new isolate *Streptomyces* DAUFPE 3060, observed maximum CA production (494 mg/L) at the highest level (20 g/L). After optimization by response surface methodology, Marques et al. [95] achieved, with the same strain, a maximum CA concentration as high as 629 mg/L using 40 g/L SF.

Ortiz et al. [7] carried out a study on the influence of the type of soybean derivatives as nitrogen sources on CA production by *S. clavuligerus*. Using two different media, one containing 20 g/L SF and the other 20 g/L SPI, they obtained the highest CA production (698 mg/L) with the former ingredient. On the other hand, Teodoro et al. [107], who investigated the effect of SPI level on CA production, achieved the highest CA yield (380 mg/L) at intermediate SPI concentration (20 g/L corresponding to 2.95 g/L total N).

#### 6.2.3. Amino Acids as Supplements in Basal Medium

Arginine and ornithine exert a concentration-dependent stimulation of CA production, and both amino acids are effectively incorporated into the CA molecule [117–119]. The investigation of the role of amino acids as nitrogen sources in CA production began in 1986 [117]. Since then, several studies focusing on the effects of amino acids, mainly arginine and ornithine, have been performed.

The incorporation of arginine into the CA molecule does not establish ornithine as a direct precursor, because the enzyme ornithine carbamoyltransferase of *S. clavuligerus* exhibits arginase activity, which converts arginine to ornithine [120]. However, Valentine et al. [121], using mutants blocked in the *argF* and *argG* genes, which are unable to convert ornithine into arginine, found strong incorporation of arginine into the CA molecule and poor incorporation of ornithine. This demonstrates that arginine is the direct precursor of CA and indicates that arginase activity does not produce sufficient ornithine for incorporation into CA.

Townsend and Ho [119] suggested arginine and pyruvate as CA precursors. However, studies have demonstrated that exogenous ornithine, rather than arginine, effectively enhances CA production, provided that there is a sufficient amount of C3 precursor [120]. Intermittent feeding of glycerol and ornithine increased CA production by 270% compared with batch cultivation in shake flasks, and by 150% compared with cultures fed glycerol and arginine or glycerol alone [122]. Teodoro et al. [102], who investigated the influence of glycerol and ornithine feeding on CA production by *S. clavuligerus* in a batch bioreactor, observed an increase in CA productivity, but a small decrease in CA concentration, in the presence of ornithine. Rodrigues et al. [70] observed that glutamate and ornithine negatively affected CA production, while arginine and threonine had no influence.

#### 6.2.4. Salts in Basal Medium

Compounds containing phosphorus, magnesium, and iron are also included in culture media used for CA production [70,106]. Rodrigues et al. [70] published a study on the nutritional requirements of *S. clavuligerus* for CA production where they demonstrated that ferrous sulfate is an essential ingredient of the fermentation medium because the enzymes involved in CA biosynthesis are Fe<sup>2+</sup>-dependent. A medium containing ferrous sulfate allowed for a CA concentration of 437 mg/L, while a formulation without this salt yielded only 41 mg/L.

Phosphate is a crucial growth-limiting nutrient that regulates the synthesis of antibiotics belonging to different groups; therefore, industrial production of antibiotics is carried out at growth-limiting concentrations of inorganic phosphate [57]. Saudagar and Singhal [98] observed that the optimum concentration of KH<sub>2</sub>PO<sub>4</sub> for CA production (878 mg/L) was 10 mM. According to their results, higher CA values are expected at lower temperatures and KH<sub>2</sub>PO<sub>4</sub> concentrations.

#### 6.2.5. Effect of pH

It has been reported that one of the important characteristics of *S. clavuligerus* is its strong dependence on the extracellular pH for cell growth and CA production [93,94,99]. Viana et al. [94], using a fractional factorial design to investigate the influence of the initial medium pH on CA production by *Streptomyces* DAUFPE 3060, observed the highest CA concentration (494 mg/L) at pH 6.0, while Saudagar and Singhal [93], using an L25 orthogonal array, identified pH 7.0 and 7.5 as the optimum pH values for CA production (500 mg/L) and cell growth (140 mg/L nucleic acid), respectively. The marked decrease in the CA yield outside this pH range suggested the occurrence of CA degradation under either acidic or alkaline conditions.

#### 6.2.6. Extractive Fermentation of Clavulanic Acid

CA fermentation processes still present problems such as low CA concentrations [7,92,93,106]; thus, it is essential to search for more effective fermentation methods able to increase the production yield of this valuable compound. Moreover, CA recovery is based on a relatively complex downstream protocol, including successive liquid–liquid extraction steps with organic solvents and a final chromatographic step, which results in low purification yields [68,123]. Therefore, the search for new environmentally friendly (lower amounts of organic solvents) purification strategies is fundamental to achieve higher yields and lower costs. Liquid–liquid extraction in aqueous two-phase systems (ATPS) is an interesting alternative for this purpose [47].

Typical ATPS are the aqueous two-phase polymer systems (ATPPS), formed by two polymers or one polymer and a salt, which separate into two immiscible phases above their critical solubility. Such systems consist of a light phase (top phase), rich in one polymer, and a heavy phase (bottom phase), rich in the second polymer or the salt [124]. Because of the low cost and large difference in hydrophobicity of the two phases, ATPPS, mainly poly(ethylene glycol) (PEG)/salt systems, have been widely used to separate enzymes [125], other proteins [126], and antibiotics [47,68,127].

Some advantages of applying ATPS to whole culture media are the increase in recovery yields, feasibility for continuous operation, reduction of the number of steps, and decrease in the process costs because of joining clarification and partial purification [127]. For most of the ATPS constituted by PEG and salt, the antibiotic is commonly separated into the PEG-rich top phase, whereas the by-products such as amino acids and peptides are discarded into the salt-rich bottom phase [128].

An alternative process has been proposed that integrates fermentation and extraction with the aim of improving the rate of product formation and reducing production costs. Extractive fermentation, or in situ product recovery, provides a technological solution to overcome the limitations of product inhibition and low product titer, which are typical drawbacks of biotechnological processes. The concept, as the name suggests, involves integration of an extractive step as the first stage of downstream processing, so that the product is removed as it is synthesized. In extractive fermentation using ATPS, cells are considered to be immobilized in one of the phases or at the interface, while the required product partitions into the other phase when the system is handled properly [128].

The advantages of extractive fermentation also include reduction of the toxic effect of product on microbial growth [129] and extended fermentation time [130]. Moreover, continuous product removal during the entire fermentation minimizes the temperature- and pH-dependent degradation of product [131], by reducing its exposure to such damaging conditions. This is particularly beneficial for labile products like CA.

Viana Marques et al. [127] carried out an optimization study according to a 22-central composite design to investigate the influence of four variables, specifically, PEG molar mass, PEG and phosphate concentrations, and agitation intensity, on CA extractive fermentation with a PEG/phosphate ATPS. CA partitioned towards the PEG-rich top phase, whereas cells accumulated at the interface. Moreover, it was found that 25% PEG with a molecular weight of 8000 g/mol and phosphate salts, with agitation at 240 rpm, were the best conditions for the extractive fermentation, giving the best results in terms of CA partition coefficient (*K* = 8.2), yield in the PEG-rich phase (*Y* = 93%), and productivity (*P* = 5.3 mg/L·h).
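The three response variables reported for ATPS experiments (partition coefficient, top-phase yield, and volumetric productivity) are commonly computed from phase concentrations and volumes. The sketch below uses the usual textbook definitions; the function names, the phase volume ratio, and the example numbers other than *K* = 8.2 are assumptions for illustration, and the cited study may define its yield differently (for example, relative to the amount of CA initially present in the broth).

```python
def partition_coefficient(c_top, c_bottom):
    """K: ratio of product concentration in the top and bottom phases."""
    return c_top / c_bottom


def top_phase_yield(k, volume_ratio):
    """Percentage of the product recovered in the top phase.

    k            : partition coefficient (C_top / C_bottom)
    volume_ratio : V_top / V_bottom
    Follows from a mass balance over both phases:
        Y = 100 / (1 + 1 / (volume_ratio * k))
    """
    return 100.0 / (1.0 + 1.0 / (volume_ratio * k))


def volumetric_productivity(concentration_mg_per_l, time_h):
    """P: product concentration divided by process time, in mg/(L*h)."""
    return concentration_mg_per_l / time_h


if __name__ == "__main__":
    k = 8.2          # partition coefficient quoted in the text
    ratio = 1.0      # hypothetical phase volume ratio (not given in the text)
    print(f"Top-phase yield at V_top/V_bottom = {ratio}: "
          f"{top_phase_yield(k, ratio):.1f}%")
```

Because the yield depends on the phase volume ratio as well as on *K*, the same partition coefficient can correspond to different yields in different systems.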

Panas et al. [132], who attempted the purification of CA produced by *S. clavuligerus* via submerged fermentation using different ATPS, obtained the highest CA recovery yield (64.91%) and purification factor (22.70) with PEG-600/sodium polyacrylate-8000 and PEG-600/cholinium chloride, respectively. These results support the use of these systems as effective techniques to purify CA from fermented broth in a single partitioning step.

#### **7.** β**-Lactamase Inhibitors in Clinical Practice**

Immediately after the discovery of β-lactam antibiotics as natural antibacterial compounds, they were effectively modified by the pharmaceutical industry for clinical purposes, and they still constitute an effective group of antibiotics. With the advance of technologies such as rapid bacterial genome sequencing, improved protein structure determination, and advanced genetic engineering, many companies started to modify the antibiotic structures, leading to the second, third, and fourth generations of these drugs. Nevertheless, the problem of antimicrobial resistance persisted, and combined drug therapies arose [31,32].

Antibiotic resistance in mycobacteria is multifactorial, including enzymatic inactivation and cell wall impermeability [133]. Additionally, because this group of bacteria developed genetic and epigenetic resistance through selection [132], treatment options for mycobacteria have remained limited and unchanged for over thirty years [134]. An example is the rapid global spread of genes from multidrug-resistant strains such as NDM-1-producing *K. pneumoniae* [135].

Co-therapy with β-lactam antibiotics and β-lactamase inhibitors, with or without a β-lactam ring, has shown some success (Table 3) [136–138]. Oxyimino-cephalosporins or carbapenems are still potentially very reliable and represent a validated strategy to overcome antimicrobial resistance in Gram-negative and Gram-positive bacteria.

Meanwhile, given that avibactam and its derivatives inhibit only class A and class C (and some class D) serine enzymes, new approaches and new targets are essential to diversify treatment options. The approval of Vabomere (meropenem/vaborbactam) and Avycaz (ceftazidime-avibactam) demonstrates that novel combinations can lead to successful clinical development.

**Table 3.** Combinations of β-lactamase inhibitors and β-lactam antibiotics of clinical use (Adapted from Docquier and Mangani et al. [34] and Bush [41]).


FDA, U.S. Food and Drug Administration; EMA, European Medicines Agency; cUTI, complicated urinary tract infection; cIAI, complicated intra-abdominal infection.

Boronic acids, which have recently been discovered as a new class of pan-β-lactamase inhibitors, are selective, devoid of toxicity, and able to inhibit both serine- and metallo-β-lactamases. These inhibitors are therefore considered promising candidates for clinical use [34].

Phosphonic and phosphinic acids, which contain an inert C–P bond, constitute a group of bioactive small molecules with great pharmaceutical potential. Among the most known examples are fosfomycin (the only FDA-approved antibiotic to treat acute cystitis during pregnancy) and fosmidomycin (a potent antimalarial agent), as well as glyphosate and phosphinothricin (widely used herbicides). However, the use of phosphonic acids as antibiotics/herbicides led to the occurrence of multiple mechanisms of resistance to them [169].

Possible uses of kinase inhibitors in other fields, such as the immune response to bacterial or viral infections, are being investigated in preclinical studies. An FDA-approved receptor tyrosine kinase inhibitor affects *M. tuberculosis* growth through increased lysosomal targeting and suppression of signal transducer and activator of transcription activation [170]. These drugs could also inhibit the survival of other microbes and the replication of viruses and, consequently, decrease drug resistance in patients with infections [171].

#### **8. Commercial Use**

Despite the emerging need for new antibiotics, research into new antimicrobial drugs is not well advanced. In 2017, only 39 antibiotics were in stages I and III of development, a number insufficient to meet the current and planned clinical demand. Further, of these 39 drugs, only 13 (33%) are expected to become marketable drugs [172]. Because of growing antibiotic resistance and multiresistance, annual deaths are expected to reach 10 million by 2050, with an estimated economic cost of US \$100 trillion [173]. As an example, 250,000 people die every year from drug-resistant tuberculosis, and only 52% of patients worldwide are treated successfully. And yet, just two novel antibiotics to treat this disease have reached the market in 70 years.

Large companies have some advantages in producing new antibiotics, such as established research methods, sophisticated tools for dosage studies, and fast approval by regulatory agencies; nevertheless, they give priority to developing drugs for other diseases. This occurs because the return on antibiotic investment is low, even though antibiotics represent a market of US \$45 billion, second only to drugs used in cardiovascular and central nervous system diseases [174].

A further issue is the price of an antibiotic when it reaches the market, where it faces strong competition from generic drugs sold at far lower prices. Therefore, in some instances, large companies transfer the responsibility for developing these drugs to small businesses; this happened, for example, with daptomycin, produced by Cubist and licensed by Lilly [175,176].

The United States and Europe are joining forces with the goal of producing 10 to 15 new antibiotics every decade. This initiative is part of a government program that aims to make US \$10–\$30 billion available in the market over the next 10 years [177], which would amount to a real war against drug resistance. Nevertheless, the main obstacle to these strategies is their implementation, as it often implies public acquisition of the rights to distribute the antibiotic, which poses a significant risk to companies and entails major upfront public costs.

#### **9. Future Perspectives**

Nowadays, multidrug resistance is the biggest concern of government agencies and companies that develop antibiotics. Multidrug-resistant tuberculosis affects half a million people every year, requires two years of treatment with success in only a few cases, and is frequently observed in regions with a low human development index. However, many of the novel drugs that are arriving in the market are not directed at infections caused by antibiotic-resistant pathogens. Additionally, according to Pew Charitable Trusts studies, only 31% of drugs in advanced stages of clinical trials are effective against an ESKAPE pathogen, and only 33% against the multidrug-resistant ones [178].

Companies and other entities around the world are committed to solving the problem of antibiotic resistance. The Global Antibiotic Research and Development Partnership (GARDP), a non-profit organization, is designed to fund research and commercialization of antibiotics. Between 2016 and 2018, it received funding of more than €5 million for research on new antimicrobial drugs [179].

The isolation of new microorganisms able to produce new substances, such as actinomycetes and more specifically streptomycetes, from different environments may considerably accelerate the development of new treatments. Additionally, genetic engineering and proteomic techniques could generate high-performing strains able to produce new bioactive compounds and, consequently, new powerful antibiotics [65].

Besides that, most research has not taken into account the interactions among bacterial species, which usually live in communities in the natural environment. In addition, the leakage of antibiotics into natural environments has the potential to radically alter the evolution of resistance along with the microbial dynamics and structure of the communities. In this context, the collaboration of the Amazon Biotechnology Center (CBA) and the TB Alliance, respectively in the search for new antibiotic producers and in funding the final phase of clinical trials and the regulation of antibiotic use, offers hope [1].

The risk of "super bugs" resistant to virtually all licensed antibiotics may rise in the future; therefore, constant worldwide surveillance for multidrug-resistant bacteria is urgently required. For this reason, a strong emphasis on collaboration between companies and governments is needed to encourage synergy between the search for new antibiotics and the antibiotic market [1].

#### **10. Conclusions**

This review presented a general discussion of the production of β-lactamase inhibitors by members of the genus *Streptomyces*. Antibiotic resistance has existed for some time and has grown considerably as a result of increased mutations in genes encoding enzymes such as β-lactamases, which are responsible for the inactivation of β-lactam antibiotics. The classification of these enzymes was reorganized because of the appearance of new genes and, consequently, new enzymes. Because the antibiotics used in clinical practice at the time were no longer very effective, β-lactamase inhibitors were introduced to circumvent the serious problem posed by the spread of many extended-spectrum β-lactamases.

Clavulanic acid (CA), the first and most important β-lactamase inhibitor used in clinical practice to date, is produced naturally by *Streptomyces clavuligerus*, while the other two best-known inhibitors, tazobactam and sulbactam, are of synthetic origin. The industrial production of CA has been extensively studied. Several sources of carbon, nitrogen, vitamins, amino acids, and salts were tested in batch or fed-batch and small or large-scale cultivations. The influence of physicochemical parameters such as pH, temperature, agitation, and aeration on fermentation was also investigated for the purpose of improving CA production and reducing time and costs. The most used purification processes are chromatographic methods, aqueous two-phase extraction using polymers and salts, and extraction systems using solvents. Extractive fermentation is a promising emerging technology that integrates the production and extraction stages, thus increasing productivity and reducing time.

Although *Streptomyces* spp. are known to be excellent producers of antibiotics and β-lactamase inhibitors, the rapid global spread of genes from multidrug-resistant strains is a concern, and improving production and reducing costs remain considerable challenges. In this respect, genetic engineering may make it possible to construct combinations of genes capable of coding for new, hitherto unknown antibiotics, which might be called hybrid antibiotics.

Recently, bacteria named "ESKAPE" by WHO have emerged as extremely important pathogens that spread high levels of resistance across the world. In view of this, the search for new producers of antibiotics and β-lactamase inhibitors, together with combined drug therapies, is an urgent priority. For this to occur consistently, companies and governments must join forces to develop new low-priced antibiotics to solve the problem of antibiotic resistance.

**Author Contributions:** All authors conceived, drafted and revised the manuscript together.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **Actinomycetes, an Inexhaustible Source of Naturally Occurring Antibiotics**

#### **Yoko Takahashi \* and Takuji Nakashima**

Kitasato Institute for Life Sciences, Kitasato University, 5-9-1 Shirokane, Minato-ku, Tokyo 108-8641, Japan; takuji@lisci.kitasato-u.ac.jp

**\*** Correspondence: ytakaha@lisci.kitasato-u.ac.jp

Received: 17 April 2018; Accepted: 23 May 2018; Published: 24 May 2018

**Abstract:** Global public health faces a desperate situation, due to the lack of effective antibiotics. Coordinated steps need to be taken, worldwide, to rectify this situation and protect the advances in modern medicine made over the last 100 years. Work at Japan's Kitasato Institute has been in the vanguard of many such advances, and work is being proactively tailored to promote the discovery of urgently needed antimicrobials. Efforts are being concentrated on actinomycetes, the proven source of most modern antibiotics. We devised a novel physicochemical screening mechanism, whereby simple physico-chemical properties, in conjunction with related detection methods, such as LC/MS, LC/UV, and polarity, could be used to identify or predict new compounds in a culture broth, simply by comparing results with existing databases. New compounds are isolated, purified, and their structure determined before being tested for any bioactivity. We used lyophilized actinomycete strains from the Kitasato Microbial Library, most more than 35 years old, and found 330 strains were producers of useful bioactive substances. We also tested organisms found in fresh samples collected in the complex environments from around plant roots, as well as from sediments of mangrove forests and oceans, resulting in the discovery of 36 novel compounds from 11 actinomycete strains. A compound, designated iminimycin, containing an iminium ion in the structure was discovered from the culture broth of *Streptomyces griseus* OS-3601, which had been stored for a long time as a streptomycin-producing strain. This represented the first iminium ion discovery in actinomycetes. Compounds with a cyclopentadecane skeleton containing 5,6-dihydro-4-hydroxyl-2-pyrone ring and tetrahydrofuran ring, designated mangromicins, were isolated from the culture broth of *Lechevalieria aerocolonigenes* K10-0216 obtained from sediment in a mangrove forest. These structures are extremely unique among natural compounds. 
From the same culture broth, new steroid compounds, named K10-0216 KA and KB, and other new compounds having a thiazole and a pyridine ring, named pyrizomicin A and B, were discovered. New substances can be found from actinomycetes that have been exhaustively studied. Novel compounds with different skeletons can be found from a single broth of one strain. The sought after new antibiotics will arise from continued exploitation of the actinomycetes, especially rare actinomycetes. Work on new organisms and samples should be augmented by re-examination of known actinomycetes already in storage. New research should also be carried out on the manipulation of culture media, thereby stimulating actinomycete strains to produce novel chemicals. The establishment of wide-ranging international research collaborations will facilitate and expedite the efficient and timely discovery and provision of bioactive compounds to help maintain and promote advances in global public health.

**Keywords:** actinomycetes; secondary metabolites; novel compounds; physicochemical screening; physical and chemical properties; structural diversity; biological activity

#### **1. Introduction**

The use of chemicals to maintain or improve human health is as old as recorded history. In recent times, the discovery and development of chemicals to kill or overcome bacteria or pathogens is regarded as one of the most significant medical achievements of the 20th century, and countless millions of human lives have been saved as a result. Natural products have been and remain the mainstay of medical treatments. Chemicals produced in nature, or compounds based on them, accounted for 65% of the 1211 small molecule drugs approved by the United States Food and Drug Administration (FDA) in the 34 years from 1981 to 2014 [1]. The wide and diverse range of microbial primary or secondary metabolites that possess potent and sometimes unique bioactivity, coupled with the enormous and as yet relatively untapped potential and promise they offer, will heavily influence and drive forward future antibiotic research, while simultaneously emphasizing the importance of prioritizing natural products discovery over the manufacturing of synthetic compounds [2].

Penicillin was first used for human treatment in 1942, and it revolutionized the treatment of bacterial infections. It has since saved hundreds of millions of lives, as well as galvanizing the search for similar antibacterial or antimicrobial chemicals. As a result of the worldwide research effort, there was a flood of new antibiotics identified throughout the 1950s and 1960s, with the approval of several distinct novel classes of efficacious antibiotics for human use. However, since that "Golden Age", the number of new antibiotics registered has steadily declined, and very few new classes of antibiotics have reached the marketplace and clinical use. In reality, scientific and economic factors will likely delay the appearance of any new antibiotics. From preclinical testing to approval for human use takes 10–15 years, and the costs involved are prohibitive [3,4]. A recent analysis suggests that, in 2014, the actual cost of driving a new compound from concept to the marketplace was in excess of \$1.3 billion [5]. Furthermore, approximately 1 in 1000 potential drugs proceed to clinical trials, and then almost 90% fail in the human testing phase. For example, in the antibiotic field, the prevailing dangerous lack of new antibiotics, coupled with the loss of effectiveness of many already being widely used, threatens a return to the pre-antibiotic era, and the reversal of the gains made in global public health during the 20th century, accompanied by the potential loss of millions of lives.

The world is fast waking up to this dreadful scenario, and in 2015, the World Health Assembly endorsed the Global Action Plan on Antimicrobial Resistance [6]. This committed all member states to prepare national action plans and take proactive steps to promote the discovery, development, and sustainable exploitation of new antibiotics, especially those with novel modes of action. These commitments were reiterated by the United Nations General Assembly in 2016 [7].

So how can we find the urgently needed drugs? We firmly believe that actinomycetes will prove to be the primary source of the desperately needed biological substances over the next 2–3 decades, and this publication contains substantial evidence to support that point of view.

The actinomycetes are a heterogeneous group of Gram-positive bacteria with high guanine (G) and cytosine (C) content in their DNA. They are extremely diverse, with at least 350 genera known to date. They constitute one of the largest bacterial phyla, and are ubiquitous in aquatic and terrestrial ecosystems. Most (especially the streptomycetes) are saprophytic, soil-dwelling organisms, but they are also found in fresh and salt water, and in the air. They are typically present in soil at densities of 10<sup>6</sup> to 10<sup>9</sup> cells per gram of soil, with streptomycetes accounting for over 95% of all actinomycete strains isolated from soil [8]. Many species are harmless to animals and higher plants, while some are important pathogens.

The actinomycetes, particularly species from the genus *Streptomyces*, have proved to be a tremendous high-impact source of valuable chemicals. They have yielded many clinically essential antimicrobial compounds, including streptomycin, actinomycin, and streptothricin [9]. Besides streptomycin (discovered in 1944 from *Streptomyces griseus*), other examples of the success of this traditional discovery research approach are chloramphenicol (1947, *S. venezuelae*), tetracycline (1948, *S. rimosus*), erythromycin (1952, *Saccharopolyspora erythraea*), leucomycin (1952, *S. kitasatoensis*), and vancomycin (1956, *S. orientalis*). Additionally, in 1963, gentamicin was discovered, isolated from *M. purpurea*, a member of the genus *Micromonospora*. This triggered the search for new compounds from the so-called "rare actinomycetes", which are isolated less frequently than members of the genus *Streptomyces*, which are readily isolated from soil. Compounds from the rare organisms include teicoplanin (1978, *Actinoplanes teichomyceticus*), fortimicin (1977, *M. olivoasterospora*), rosamicin (1972, *M. rosaria*), and nocardicin (1976, *Nocardia uniformis*). Incidentally, salinosporamide A, which holds promise for development as an anticancer drug, is produced by a strain of the genus *Salinispora*, a rare actinomycete isolated from a heat-treated marine sediment sample [10].

Approximately two-thirds of all known antibiotics are produced by actinomycetes, predominantly by *Streptomyces* [8]. It is believed that the actinomycetes are the source of some 61% of all microorganism-derived bioactive substances so far discovered [11], with 16% of the total originating from the "rare actinomycetes", mostly from the *Micromonosporaceae*, with additional smaller contributions from the *Pseudonocardiaceae* and *Thermomonosporaceae*. This suggests that rare actinomycetes are a valuable source of novel compounds, and that improved isolation strategies are required to increase the frequency with which they are isolated.

#### **2. Historical Discovery of Novel Compounds from Actinomycetes by the Kitasato Ōmura-Drug Discovery Group**

The Kitasato Institute has, since its inception, concentrated its investigations on soil dwelling microbes, particularly the actinomycetes, as a potential source of bioactive compounds. Up until the mid-1970s, the singular universally employed discovery process involved identifying microorganisms in soil (or other) samples, culturing them and then testing any primary or secondary metabolites or other chemicals they produced to identify predetermined bioactivity that would meet a human need.

Decades of success in our exploration of the actinomycetes are exemplified by the discovery at the Kitasato Institute by Satoshi Ōmura in the early-1970s of *Streptomyces avermectinius* (synonym *S. avermitilis*) MA-4680<sup>T</sup>, the microbe which produces the avermectins [12,13]. The avermectin derivative, ivermectin, is perhaps the world's greatest, most effective, and safest drug for the treatment and prevention of a diverse range of human diseases and conditions [14]. The importance and significance of the discovery and development of these compounds was recognized by the 2015 Nobel Prize in Physiology or Medicine being awarded to Prof. Ōmura and Prof. William C. Campbell of Merck & Company, Inc., Kenilworth, NJ, USA, representing the industrial partner which has become essential for the discovery, development, production, marketing, and distribution process of all modern-day antibiotics. The award citation stated "William C. Campbell and Satoshi Ōmura discovered a new drug, avermectin, the derivatives of which have radically lowered the incidence of River Blindness and Lymphatic Filariasis, as well as showing efficacy against an expanding number of other parasitic diseases" [15,16]. The 2015 award was the third Nobel Prize given for discovery of an antibiotic, following those for penicillin (for Fleming, Florey, and Chain in 1945) and for streptomycin (for Waksman in 1952), the man who first coined the term "antibiotic".

The discovery of ivermectin arose because of Ōmura's unwavering belief that microorganisms are a limitless source of useful chemical compounds ("Microbes do not produce useless metabolites: we just have little knowledge of their usefulness for mankind" [17,18]), and because the partnership he set up between his group and Merck scientists was looking for specific anthelmintic compounds. Although ivermectin has proved to be a multifaceted, extremely effective chemical with a wide range of impacts, the original bioactivity screening focused almost exclusively on looking for an anthelmintic. Hence, that was what was found.

In the early-1970s, Ōmura decided to introduce an innovative new approach to drug discovery, namely to simply identify novel chemicals with no fixed goal in mind, carry out preliminary assays and evaluations, catalog and store both the chemicals and the producing microorganisms, and make the chemicals available for others to assay for all variety of bioactivity, or for use as biological or chemical reagents. This novel process was referred to as physicochemical (PC) screening.

As members of the Kitasato Institute for Life Sciences' Drug Discovery Group, with Satoshi Ōmura as team leader, we have long and extensive experience in the search for novel compounds derived from microorganisms. Our cohesive integrated research program now encompasses the following three foci:


Latterly, our isolation work has been significantly refocused. We are now investigating existing but hitherto underutilized actinomycetes which exist in storage. We have also switched our attention from soil dwelling microbes to the exploration of microorganisms living in the complex environments found in the immediate vicinity of plant roots. We quickly discovered that whereas more than 90% of actinomycetes isolated from soil are *Streptomyces* strains, the rare actinomycetes dominate in strains isolated from plant roots. Currently, some 642 strains of actinomycetes have been isolated from 16 plant root locations, about 80% of which are rare. Two new genera (*Phytohabitans suffuscus* and *Rhizocola hellebori*) plus seven new species have, so far, been proposed through taxonomic study of these strains [19–21].

In our case, the discovery of useful microbial chemicals has been facilitated and accelerated by employing a two-pronged approach: initially, the traditional method of attempting to acquire a new compound with a preconceived specific biological activity and, more recently, the identification of any and all novel substances by detecting and exploiting the basic physical and chemical properties and structural features of compounds. This bifurcated approach, both arms of which are ongoing at Kitasato University, has led to the discovery of over 500 compounds, most of which were found using the original method [22].

In the mid-1970s, PC screening was introduced, initially using Dragendorff's reagent to identify nitrogen-containing compounds (alkaloids), which cause a simple, visible color change. Staurosporine [23] was discovered as the first indolocarbazole compound from the culture broth of *Saccharothrix aerocolonigenes* subsp. *staurosporeus* AM 2282<sup>T</sup> [24] (renamed *Lentzea albida* in 2002 [25]) in 1977, using this method. We initially determined that the compound possessed antifungal properties, and demonstrated a hypotensive effect. Nine years after discovery, in 1986, another research group discovered that staurosporine was a nanomolar inhibitor of protein kinases, as assessed by the prevention of ATP binding to the kinase [26,27]. This interesting biological activity stimulated an explosion in exploratory research for selective protein kinase inhibitors by numerous laboratories and pharmaceutical companies worldwide, staurosporine becoming the parent compound for many of today's highly-successful anticancer agents. This example helps to illustrate that all substances produced by microbes may be of great benefit, and that they should be examined for potential use in all forms of human endeavor, especially for use in modern medicine, and that they should be made available for exhaustive testing and use by all, wherever practical and possible.

We now routinely search for novel chemicals from actinomycetes by analyzing a range of physicochemical data, such as LC/MS and LC/UV profiles and polarity. It is now possible to predict whether a new substance is present by comparing these results with existing databases. This approach has so far identified some 36 novel compounds (including analogs) [28]. In this report, we describe these results, and discuss the ability of actinomycetes to produce a wide spectrum of novel chemicals, as well as draw attention to the diversity of metabolites that a single microbial strain can provide for us.

#### **3. Novel Compounds Discovered by Physicochemical (PC) Screening of Cultured Broths of Actinomycetes**

The novel compounds derived from actinomycetes discovered through our PC screening during the past eight years are displayed in Table 1. The compounds are accompanied by the name of the producing microorganism, their original source, the primary biological activity of the compound and relevant publications. The PC screening procedure was carried out as follows.

After cultivation in 10 mL of several kinds of preset media, an equivalent amount of ethanol was added, the ingredients were thoroughly mixed, the cells were then disrupted, and the ethanol extract was subjected to PC screening.

After LC/MS and LC/UV analysis, each peak recorded was compared with known data from the Dictionary of Natural Products and our own database. A peak that matched no known entry was predicted to be a novel substance. When this was the case, we scaled up the culture and isolated and purified the target compound using column chromatography and preparative HPLC. After obtaining the unique compound, its structure was determined by high-resolution mass spectrometry, NMR, etc. The new compounds underwent preliminary bioassays, either in-house or in established collaborative research projects with other groups.
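The peak-comparison step described above can be illustrated with a minimal sketch. It assumes that matching is done on the accurate mass of the [M + H]<sup>+</sup> ion within a parts-per-million tolerance; the 5 ppm tolerance, the reference values, and the second peak are hypothetical and are not taken from the authors' database or workflow. Only the first peak (*m/z* 242.1910, UV maxima at 269 and 282 nm) uses values reported in this article, for iminimycin A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float          # observed m/z of the [M + H]+ ion
    uv_maxima: tuple   # UV absorption maxima in nm (carried along for reference)


def ppm_error(observed: float, reference: float) -> float:
    """Signed mass error of `observed` relative to `reference`, in ppm."""
    return (observed - reference) / reference * 1e6


def is_candidate_novel(peak: Peak, known_mz: list, tol_ppm: float = 5.0) -> bool:
    """True if no reference m/z lies within tol_ppm of the peak (assumed criterion)."""
    return all(abs(ppm_error(peak.mz, ref)) > tol_ppm for ref in known_mz)


# Hypothetical reference masses; a real screen would query a curated database.
known = [300.1234, 415.2115]

peaks = [
    Peak(242.1910, (269, 282)),  # values reported in the text for iminimycin A
    Peak(300.1236, (254,)),      # invented peak near a hypothetical known entry
]

flags = [is_candidate_novel(p, known) for p in peaks]
for peak, flag in zip(peaks, flags):
    print(f"m/z {peak.mz:.4f}: {'candidate novel' if flag else 'matches a known entry'}")
```

In practice a mass match alone would not settle novelty; the text indicates that UV maxima and polarity are considered as well, so a fuller version of this sketch would compare those fields too.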

Identification of the strains being cultured was carried out using morphological characteristics, chemical composition in cells, and phylogenetic analysis based on 16S rRNA gene sequences.

The Kitasato Ōmura-Drug Discovery Group has already discovered avermectin [14], staurosporine [23], herbimycin [29], setamycin [30], and lactacystin [31] from secondary metabolites of actinomycetes [22]. The actinomycete strains producing these compounds (as well as strains producing a variety of other chemicals) have all been catalogued, freeze-dried, and stored in the Kitasato Microbial Library (KML). In an effort to respond to the urgent global demand for new antibiotics, we have recently revived the KML strains to confirm their viability, the continuance of compound production, and the reliability of the preservation process. Survival rates and compound productivity maintenance rates have been good, but specific data in this respect will be reported elsewhere. During this work, PC screening was carried out on culture broths of 330 strains, resulting in the discovery of several new compounds (No. 1 to No. 3 in Table 1).

With respect to the three entries in question, the name of the original compound and the length of preservation by lyophilization are stated, all three having been stored for 35 years or more. The compounds recently discovered, namely bisoxazolomycin [32], the iminimycins [33,34], and the nanaomycins [35,36], would probably not have been detected by an assay system seeking a specific bioactive property, the compounds being found as a direct result of PC screening. Discovery of the iminimycins and nanaomycins is described in detail below in Sections 4.1 and 4.2.

With respect to new isolates (No. 4 to No. 13 in Table 1), actinomycete strains isolated from around the roots of plants (No. 4 to No. 7), sediment from mangrove forests (No. 8 to No. 10), sea sediment (No. 11), and soil samples (No. 12 & No. 13) are listed. Actinoallolides [37], hamuramicins [38], spoxazomicins [39,40], and trehangelins [41,42] were discovered from endophytic actinomycete strains, and these are classified as rare actinomycetes. The mangromicins [43–45], K10-0216 KA and KB [46], and pyrizomicins [47], which have differing core structures, were found in a culture broth of *Lechevalieria aerocolonigenes* K10-0216 isolated from sediment from mangroves. Mumiamicin [48] was found in an actinomycete strain isolated from sea sediment, while sagamilactam [49] and the dipyrimicins [50] originated in actinomycete strains from soil. In Sections 5.1 and 5.2, we describe, in detail, the discovery of the trehangelins from *Polymorphospora rubra* K07-0510 and of the compounds from *Lechevalieria aerocolonigenes* K10-0216.

Assays of the 36 compounds, involving collaboration with other research groups, led to the discovery of varying bioactivity, as shown in Table 1. These results help demonstrate the usefulness and cost/time effectiveness of PC screening, as well as the potential diversity of metabolites produced by a single microorganism. The outcome clearly demonstrates that, as Prof. Ōmura rightly opines, "microorganisms are a treasure trove of new natural products".



KML: Kitasato Microbial Library, length of preservation by lyophilization; No. 1–3: Compounds from the KML; No. 4–13: Compounds from fresh isolates (No. 4–7: Roots of plants; No. 8–10: Sediment of mangrove forest; No. 11: Marine sediment; Nos. 12 & 13: Soil).

#### *Antibiotics* **2018** , *7*, 45

#### **4. Novel Compounds Discovered from the Kitasato Microbial Library (KML)**

#### *4.1. Iminimycin A by Streptomyces griseus OS-3601, a Streptomycin Producing Strain*

KML strain OS-3601 (Figure 1) was isolated from a soil sample collected at Aso, Kumamoto prefecture, Japan, in 1971, and identified as *Streptomyces griseus*, which produces streptomycin. As part of our screening of 330 preserved actinomycete strains, using four different liquid production media, a unique metabolite, predicted to be a new compound, was observed in an extract of a culture of this strain grown on defatted wheat germ medium. The metabolite was not observed following growth in the other three production media. The compound produced by strain OS-3601 showed an [M + H]<sup>+</sup> ion at *m/z* 242.1910 and maximal absorption at 269 and 282 nm. Purification from the culture broth of strain OS-3601 yielded a new iminium compound, designated iminimycin A (Figure 1) [33]; the structure was strongly supported by an IR absorption spectrum indicative of the presence of an iminium ion. Several plant-derived compounds containing an iminium ion are known [51] but, to our knowledge, this is the first such compound arising from an actinomycete source.

**Figure 1.** Scanning electron micrograph of the aerial spore chain of the streptomycin-producing strain *Streptomyces griseus* OS-3601 (**Left**) and structures of iminimycin A and B discovered from the culture broth (**Right**).

An unidentified compound with physical and chemical properties similar to iminimycin A was observed in the octadecyl silyl fraction lacking iminimycin A, the compound showing an [M + H]<sup>+</sup> ion at *m/z* 399.1736 and maximal absorption at 269 and 282 nm. Purification of this compound revealed a new indolizine alkaloid, designated iminimycin B (Figure 1) [34], that possessed *N*-acetylcysteine and pyridinium moieties, instead of the iminium moiety of iminimycin A. Iminimycin A and B both show antimicrobial activity against Gram-positive and Gram-negative bacteria. Since 1943, when Selman A. Waksman discovered streptomycin among the secondary metabolites of *Streptomyces griseus*, about 200 compounds have been reported from strains identified as *Streptomyces griseus* [9].

Among the actinomycetes, *Streptomyces griseus* is one of the organisms most frequently isolated from soil samples, and has been studied extensively. It was therefore surprising that our PC screening allowed us to find new substances from this strain, demonstrating, yet again, the unmatched ability of microorganisms, especially actinomycetes, to produce a plethora of chemicals.

One chemical component of the culture broth of strain OS-3601 was predicted to be novel from analysis of HPLC and LC/MS data gathered through PC screening. Our prediction proved true following isolation, purification, and structure determination.

Mapping of the genome of streptomycin-producing *Streptomyces griseus* has already been completed [52], and this information also indicates that production of the novel substance is predictable. So why has it not been discovered before? It is conceivable that productivity is low, or that it has been overlooked because it has been masked by the cycloheximide produced in large quantities at the same time in a lipid soluble fraction, or because the compound is inherently unstable and short-lived. We believe that prediction of the existence of new compounds can be envisaged through a combination of an enquiring mind, technological progress, and the creation and use of more sophisticated equipment. Indeed, it can be said that Prof. Ōmura's beliefs that "microorganisms are infinite resources" and that "microbial research has really just begun" are an accurate depiction of the current situation.

#### *4.2. Nanaomycins F, G and H Derived from Nanaomycins A–E Produced by "Streptomyces rosa subsp. notoensis" OS-3966*

Nanaomycins A, B, C, D, and E are produced by "*Streptomyces rosa* subsp. *notoensis*" OS-3966 (Figure 2a,b) [53–55], which was isolated from soil sampled in Nanao City, Japan. The nanaomycins contain a naphthoquinone skeleton and were found to have antimycoplasma properties. Nanaomycin A (Figure 2b) was developed as a therapeutic agent for cattle dermatophytosis in 1980 [56]. It generates semiquinone radicals by forming double bonds at positions 4a and 10a, damaging DNA in the process, consequently displaying antibacterial and antifungal activity.

**Figure 2.** Nanaomycin-producing strain "*Streptomyces rosa* subsp. *notoensis*" OS-3966 and new analogs discovered by PC screening from a culture broth. (**a**) Scanning electron micrograph of aerial spore chain of "*S. rosa* subsp. *notoensis*" OS-3966 grown on agar medium; (**b**) Nanaomycins A–E discovered as antibacterial and antifungal substances during 1974–1979; (**c**) New analogs (F–H) discovered from the culture broth via PC screening.

We undertook PC screening using the culture broths of the stored freeze-dried nanaomycin-producing strain OS-3966. Two new compounds appeared in the EtOH extract obtained from culturing in defatted wheat germ production medium. The first compound showed an [M + H]<sup>+</sup> ion at *m/z* 337.0917 and maximal absorption at 231, 248 (sh), 267 (sh), and 347 nm; the second showed an [M + H]<sup>+</sup> ion at *m/z* 351.1075 and maximal absorption at 225, 256 (sh), and 308 nm. Purification of these compounds revealed they were two new nanaomycin analogs. The first, which we named nanaomycin F (Figure 2c) [35], is a 4a-hydroxyl analog of nanaomycin B (Figure 2b). The second, which we named nanaomycin G (Figure 2c) [35], has a unique 1-indanone skeleton fused with a tetrahydropyran ring. During chromatographic purification of these compounds, another new compound was obtained, nanaomycin H (Figure 2c) [36]. Structure elucidation of nanaomycin H showed it to be a pyranonaphthoquinone with a mycothiol moiety. Our assays detected no antibacterial or antifungal activity in these three compounds. The reason for this seems to be that the radical-generating ability disappeared due to the reduction of the double bond of the quinone skeleton. It was found that the compounds inhibited epithelial-mesenchymal transition, inducing proliferation of mammalian cells [57]. These compounds would not have been found using an assay simply targeting bioactivity. Incidentally, the production medium of nanaomycins A to E is mainly composed of glycerol and soybean meal [53], whereas nanaomycins F, G, and H were produced on a medium containing mainly soluble starch and defatted wheat germ [35,36]. This raises the interesting possibility that simply changing the composition of culture media may allow the discovery of new compounds. These results support the OSMAC (one strain, many compounds) approach [58].

#### **5. Novel Compounds Found from Fresh Isolates**

#### *5.1. Novel Substances, Trehangelins Found from Metabolites of the Plant-Derived Rare Actinomycete Polymorphospora rubra K07-0510*

Microbes isolated from plant root environments were cultured in each of four production media, with the resulting broths being subjected to PC screening. Strain K07-0510, when grown in one of the production media, yielded a peak (indicating a new compound) that was predicted based on spectrometric data.

Strain K07-0510 was isolated from the roots of an orchid collected on Iriomote Island, Okinawa, Japan. Short spore chains were formed, and spores with a smooth surface were cylindrical in shape (Figure 3). Whole-cell hydrolysates contained *meso*-DAP (diaminopimelic acid). The 16S rRNA gene sequence was determined and analyzed using the EzTaxon-e database (present name EzBioCloud) [59] to reveal a 99.9% similarity with *Polymorphospora rubra* TT97-42<sup>T</sup>. On the basis of the morphological and cultural properties and 16S rRNA gene sequence analyses, strain K07-0510 was identified as *Polymorphospora rubra*.

A peak predicted to be new, with *m/z* 507.2087 and maximal absorption at 216 nm, was also found in the culture broth of strain K07-0510. Purification eventually identified three new compounds, which were named trehangelins A, B, and C (Figure 3) [41,42]. These compounds were separated from the culture broth of strain K07-0510 by ethyl acetate extraction, followed by silica gel and ODS column chromatography, with final purification by HPLC. Eighteen liters of culture broth yielded 59 mg of trehangelin A as the major component, along with 4.4 mg and 1.3 mg of the minor components trehangelin B and C, respectively.
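As a worked illustration of the isolation figures above, the following short sketch converts the reported masses into yields per liter of broth and into each compound's share of the total isolated trehangelins. The input numbers are those stated in the text; the variable names are ours, and the calculation covers isolated material only, so it says nothing about the actual titers in the broth.

```python
BROTH_LITERS = 18.0  # culture volume stated in the text

# Isolated masses in mg, as stated in the text
isolated_mg = {
    "trehangelin A": 59.0,
    "trehangelin B": 4.4,
    "trehangelin C": 1.3,
}

total_mg = sum(isolated_mg.values())
yield_mg_per_l = {name: mg / BROTH_LITERS for name, mg in isolated_mg.items()}
share_percent = {name: 100.0 * mg / total_mg for name, mg in isolated_mg.items()}

for name in isolated_mg:
    print(
        f"{name}: {yield_mg_per_l[name]:.2f} mg/L, "
        f"{share_percent[name]:.1f}% of isolated trehangelins"
    )
```

On these figures trehangelin A accounts for roughly nine-tenths of the isolated material, which is consistent with its description as the major component.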

Structural analysis revealed that two molecules of angelic acid were bound to one molecule of trehalose. Trehangelin A, the main compound, binds angelic acid to the 3,3′ positions of trehalose, while B and C do so to the 3,2′ and 4,4′ positions, respectively. After substance isolation, preliminary bioassays were carried out, and it was found that trehangelin A and C inhibited erythrocyte hemolysis by photooxidation. Trehangelin A and C, which have symmetric structures, showed more potent inhibition than ascorbic acid. In addition, it has been found that the compounds facilitate cytoprotective action and accumulation of procollagen type I C-peptide in a cell culture assay [60], and further research is currently in progress. Our group is now close to elucidating the genetic basis of the biosynthesis of the trehangelins [42].

**Figure 3.** Scanning electron micrograph of the aerial spore chain of the trehangelin-producing strain *Polymorphospora rubra* K07-0510 and structures of trehangelin A, B, and C.

Angelic acid is known to occur in many plants, particularly chamomile, and has been used as a tonic and sedative to treat a variety of minor complaints, but reports of it being produced by microorganisms are extremely rare. It is, thus, interesting to note that a microbe in a plant root environment also produces the compound.

#### *5.2. New Compounds Produced by Lechevalieria aerocolonigenes K10-0216, Mangromicins A–I, K10-0216 KA & KB, and Pyrizomicins A & B*

*Lechevalieria aerocolonigenes* K10-0216 (Figure 4a) is a rare actinomycete isolated from sediment collected in a mangrove forest in Iriomote Island, Okinawa Prefecture, Japan in 2011. In the culture broth of this strain, we found 13 new compounds.

Mangrove trees growing in brackish water are known to be a rich source of microorganisms, and many rare actinomycetes have been discovered in such environments [61]. We isolated 65 actinomycete strains from five samples collected from a mangrove forest. Analysis of 16S rRNA gene sequences alone identified 44 strains of the genus *Micromonospora*, as well as a few from the genera *Actinomadura* and *Verrucosispora*. The so-called rare actinomycetes accounted for 83% of organisms isolated.

Strain K10-0216 grown on an inorganic salt–starch agar medium produced sparse white aerial mycelia that formed characteristic clumps of interwoven hyphae, as shown in Figure 4a. The 16S rRNA gene sequence, which was determined and analyzed using the EzTaxon-e database [59], demonstrated a 99.8% similarity to *Lechevalieria aerocolonigenes* ISP 5034<sup>T</sup>. On the basis of the morphological and cultural properties and 16S rRNA gene sequence analyses, strain K10-0216 was identified as *Lechevalieria aerocolonigenes*.

**Figure 4.** *Lechevalieria aerocolonigenes* K10-0216 and the mangromicins discovered from the culture broth. (**a**) Scanning electron micrograph of a clump of interwoven hyphae of *L. aerocolonigenes* K10-0216 grown on inorganic salt–starch agar at 27 °C for four weeks; (**b**) Structure of mangromicins A–I; (**c**) Productivity of mangromicin A. Black circle: Basic medium (%): soluble starch 2.0, defatted wheat germ 1.0, glycerol 0.5, dry yeast 0.3, CaCO<sub>3</sub> 0.5, meat extract 0.5. Open circle: Improved medium (%): soluble starch 5.0, defatted wheat germ 1.0, glycerol 0.5, dry yeast 1.0, CaCO<sub>3</sub> 0.5, meat extract 0.0. (**d**) HPLC analysis of the mangromicins. Chromatographic separation was undertaken using a MonoBis column (3.2 × 150 mm, Kyoto Monotech Co., Ltd., Kyoto, Japan) at 40 °C. Solvent A was water with 0.1% formic acid, and solvent B was methanol with 0.1% formic acid. The gradient ran from 5% to 100% B over 0–10 min. The flow rate was 0.5 mL/min, the injection volume was 5 μL, and detection was at 254 nm using a photodiode array detector.
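The HPLC conditions given in the caption of Figure 4d can be written out as a small calculation. The sketch below assumes the gradient is linear between its stated end points (5% B at 0 min, 100% B at 10 min); the caption does not state the gradient shape, and the function name `percent_b` is ours. The flow rate is the 0.5 mL/min given in the caption.

```python
def percent_b(t_min: float, start: float = 5.0, end: float = 100.0,
              duration: float = 10.0) -> float:
    """Solvent B percentage at time t_min, assuming a linear gradient."""
    if not 0.0 <= t_min <= duration:
        raise ValueError("time outside the gradient window")
    return start + (end - start) * t_min / duration


FLOW_ML_PER_MIN = 0.5  # flow rate stated in the caption

for t in (0, 2.5, 5, 10):
    delivered_ml = FLOW_ML_PER_MIN * t
    print(f"{t:>4} min: {percent_b(t):5.1f}% B, {delivered_ml:.2f} mL delivered")
```

Under this assumption the mobile phase is just over half methanol at the midpoint of the run, and the full gradient consumes 5 mL of solvent per injection.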

We found that the culture broth contained predicted novel substances with molecular weights of 410 and 392, and maximal absorption at 251 nm and 236 nm, respectively. We named these compounds mangromicin A and B [43]. However, these peaks were not obtained in a jar fermenter culture. Instead, 500 mL Erlenmeyer flasks, each containing 100 mL of culture medium, were used to obtain the target substances, which were then isolated and purified. Structural analysis revealed a unique structure of a cyclopentadecane skeleton containing a 5,6-dihydro-4-hydroxyl-2-pyrone ring and a tetrahydrofuran ring (Figure 4b). Both substances displayed antitrypanosomal activity, and a patent application was filed. Subsequently, we experimented with different production media in order to obtain a large amount of the target compounds. As a result, by changing the concentration of soluble starch in the medium from 2.0% to 5.0%, and of dry yeast from 0.3% to 1.0%, the amount of mangromicin A increased dramatically, from 0.24 μg/mL to 88.6 μg/mL (Figure 4c). Eventually, nine analogs were obtained from 15 L of culture solution using this procedure (Figure 4b) [44,45]. The HPLC profile is shown in Figure 4d. After isolation and purification, preliminary bioassays identified DPPH radical scavenging activity and, in particular, potent NO scavenging activity [44,45].

Furthermore, from the same culture broth, we also discovered four novel compounds (Figure 5). Two contained a steroid skeleton (named K10-0216 KA and KB) that inhibited lipid accumulation in 3T3-L1 adipocytes [46], while the others displayed a thiazole and a pyridine ring (named pyrizomicin A and B) that showed antibacterial activity [47].

**Figure 5.** Two steroid compounds (K10-0216 KA and KB) and two compounds containing thiazole and pyridine (pyrizomicin A and B) from *L. aerocolonigenes* K10-0216 (Figure 4a).

As described above, 13 compounds, including three with different skeletons, were found from a single culture broth of one microorganism. This reinforces our belief that new compounds can very likely be identified and acquired simply by changing the culture method and production medium.

#### **6. Conclusions**

This report describes compounds derived from actinomycete strains which were found via PC screening between 2011 and 2017 by the Kitasato Ōmura-Drug Discovery Group. Iminimycins and nanaomycins were found from culture broths of conserved KML strains, while the trehangelins from *P. rubra* K07-0510 and the new compounds from *L. aerocolonigenes* K10-0216 (mangromicins, K10-0216 KA and KB, and pyrizomicin A and B) originated in fresh isolates. Our results illustrate that PC screening can unearth novel compounds that are not likely to be discovered through traditional targeted bioassay systems. Genomic analysis of actinomycetes has found that more than 30 secondary metabolite biosynthetic genes may be involved in chemical production, depending on the strain. However, knowing the genetic production mechanism may not necessarily make obtaining the compound easier.

This report also shows that it is possible to obtain a novel compound from actinomycete species, such as *Streptomyces griseus*, that has already undergone long and intensive study. In addition, as in the case of *L. aerocolonigenes* K10-0216, a variety of compounds with significantly different skeletons can be obtained from a single culture broth using only one microbe strain.

Our work also demonstrates that the aptly classified "rare actinomycetes" remain a unique and, so far, relatively unexploited source of potentially useful chemicals. This is supported by work that found that strains of the genus *Actinoallomurus* (a rare actinomycete) have high secondary metabolite production capacity [62]. We also found five actinoallolide analogs [37] (shown in Table 1) produced by *A. fulvus* MK10-036 and *A. fulvus* K09-0307, later discovering two more new compounds together with seven known compounds from an *A. fulvus* K09-0307 broth.

Microorganisms can present us with interesting compounds with unique structures which our current scientific knowledge and expertise cannot easily predict nor easily replicate. For example, the mangromicins have unique skeletons which subsequently attracted attention in the field of organic synthesis. Organic chemists tried to devise a total synthesis of mangromicin A, and finally managed to achieve it via a complicated and lengthy 30 step process [63], whereas *L. aerocolonigenes* K10-0216 produces the compound naturally during culturing.

To illustrate the immeasurable scope for success in this respect, it has been reported that 99% of microorganisms in nature have not yet been isolated [64]. Furthermore, our results suggest that interesting compounds can be found even from *Streptomyces* strains that are thought to have been exhausted, simply by devising new identification methods or culture conditions. It is therefore essential to revisit existing actinomycetes, common and rare, and comprehensively examine them for new compounds.

Traditionally, the search for useful natural products has been advanced using approaches that focus on specific biological activity and target molecules. In this approach, there is a clearly defined goal. However, with this method alone, it is likely that many of the microorganisms isolated have been exploited and then discarded without fully utilizing their abilities. It is clearly difficult to devise a wide variety of screening systems and, certainly, almost impossible for all these systems to be operational in a single institution. Consequently, extensive, multifaceted research collaborations will need to be established to work towards full and comprehensive testing of all chemicals, be they newly isolated from existing compound libraries or from new sources. Naturally, many obstacles will need to be overcome, including protection of Intellectual Property Rights, transfer of technology, and capacity building aspects. None of these should be insurmountable, especially so if the goal of getting as many new bioactive substances in the shortest time possible is to be achieved.

We remain firm in our commitment to discover as many new compounds as possible by exploiting the ability of microorganisms, especially the actinomycetes, to produce such attractive substances. Moreover, we will continue to follow a twin-pronged approach to this task, while striving to devise other alternative screening methods, and adopt any other measures that could help expedite the research and development process.

**Author Contributions:** Y.T. and T.N. designed and performed the experiments, analyzed the data, and wrote the paper.

**Funding:** This research was funded by the Institution for Fermentation, Osaka (IFO).

**Acknowledgments:** We deeply appreciate the guidance and support of Satoshi Ōmura (Distinguished Emeritus Kitasato Ōmura-Drug Discovery Group director, Kitasato University), and his constant emphasis on the importance of isolation, culture and classification of microbes: to "Learn from microorganisms," and "Be grateful for microorganisms." We are grateful to all members of the Kitasato Ōmura-Drug Discovery Group, especially to Atsuko Matsumoto for management of the actinomycete strains and to Kazuro Shiomi, Toshiaki Sunazuka, Masato Iwatsuki and Kazuhiko Otoguro for structure determinations and bioassay work, and to Yuki Inahashi, Hirotaka Matsuo and Suga Takuya for their efforts in seeking new compounds by PC screening and the subsequent isolation, purification and structure determination. We would also like to extend our thanks to Toru Kimura, Rei Miyano, Yoshiyuki Kamiya, Shoko Izuta and many other research staff and students of the Kitasato Institute for Life Sciences for their collegial support. We also thank Jun Nakanishi of the National Institute for Materials Science for assistance with biological activity evaluation. We would like to express our deepest gratitude to the IFO for their financial support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **Concepts and Methods to Access Novel Antibiotics from Actinomycetes**

**Joachim J. Hug 1,2, Chantal D. Bader 1,2, Maja Remškar 1,2, Katarina Cirnski 1,2 and Rolf Müller 1,2,\***


Received: 12 April 2018; Accepted: 17 May 2018; Published: 22 May 2018

**Abstract:** Actinomycetes have been proven to be an excellent source of secondary metabolites for more than half a century. Exhibiting various bioactivities, they provide valuable approved drugs in clinical use. Most microorganisms are still untapped in terms of their capacity to produce secondary metabolites, since only a small fraction can be cultured in the laboratory. Thus, improving cultivation techniques to extend the range of secondary metabolite producers accessible under laboratory conditions is an important first step in prospecting underexplored sources for the isolation of novel antibiotics. Currently uncultured actinobacteria can be made available by bioprospecting extreme habitats or simply habitats other than soil. Furthermore, bioinformatic analysis of genomes reveals that most producers harbour many more biosynthetic gene clusters than compounds identified from any single strain, which translates into a silent biosynthetic potential of the microbial world for the production of yet unknown natural products. This review covers discovery strategies and innovative methods recently employed to access the untapped reservoir of natural products. The focus is on the order of actinomycetes, although most approaches are similarly applicable to other microbes. Advanced cultivation methods, genomics- and metagenomics-based approaches, as well as modern metabolomics-inspired methods are highlighted to emphasise the interplay of different disciplines to improve access to novel natural products.

**Keywords:** metagenomics; rare actinomycetes; dereplication; metabolomics; genome mining; natural products

#### **1. Introduction**

Antimicrobial resistance (AMR) is gaining more and more public attention as one of the biggest threats to the prevention and treatment of an increasing number of infections. In January 2018, the WHO released a report on the global antibiotic resistance crisis. For this report, the new Global Antimicrobial Resistance Surveillance System (GLASS) implemented in May 2015 was used to support standardised AMR surveillance globally. The main objective of that study was to track resistance-related issues of medicines in use for the treatment of hospital- and community-acquired infections across the 52 participating countries. The survey focuses on *Escherichia coli*, *Klebsiella pneumoniae*, *Staphylococcus aureus*, *Streptococcus pneumoniae*, *Salmonella* spp., *Neisseria gonorrhoeae*, *Shigella* spp. and *Acinetobacter baumannii*. According to the report, *N. gonorrhoeae* is evolving into a superbug, as it is resistant to 3rd generation cephalosporins and fluoroquinolones, and it is included in the WHO Priority Pathogens List of antibiotic-resistant bacteria. *Salmonella* spp., *S. aureus* and *N. gonorrhoeae* were prioritised for research and development of new and effective antibiotic treatments, while *A. baumannii*, *E. coli* and *K. pneumoniae* are classified as critical priority and *Shigella* spp. along with *S. pneumoniae* are classified as medium priority [1]. Clearly, new lead compounds that lack cross-resistance with known antibiotics are urgently sought.

Between 1950 and 1960, the so-called "golden age of antibiotics" [2], microbiology and chemistry went hand in hand; bacteria were cultivated and their secondary metabolites extracted to yield a large number of compounds with antibacterial activity. Among these bacteria, the genus *Streptomyces* has provided numerous novel bioactive molecules, not just antibiotics but also antifungals, antiprotozoals and antivirals [3]. However, the discovery of novel drugs derived from streptomycete secondary metabolites has steadily decreased since then, reflected in a 30% drop of natural product-based drugs in clinical studies between 2001 and 2008 [4]. Natural product research was widely abandoned by the pharmaceutical industry due to severe rediscovery issues; in addition, the challenging isolation of new producers and low production concentrations make it more difficult to access novel scaffolds with promising activity using well-known natural product sources [4]. In response to the growing needs of medicine for new antimicrobial compounds, the focus in natural products discovery shifted towards underexplored habitats, especially marine and extreme environments. Encouragingly, rare genera of actinomycetes (in general considered as non-streptomycetes) and novel *Streptomyces* isolates found in these habitats have already afforded the discovery of novel antimicrobials with unique chemical moieties, confirming that microbial natural products are still a promising source for drug discovery [5–8]. In particular, non-streptomycetes belonging to genera such as *Micromonospora*, *Nocardia*, *Actinomadura*, *Actinoplanes*, *Streptoverticillium* and *Saccharopolyspora* have been found to produce chemically unique antibiotics featuring potent activities [8], such as abyssomicins [9] and proximicins [10] from *Verrucosispora* strains. 
Consequently, isolation of natural products from rare actinomycetes already led to the discovery of approximately 2250 new bioactive secondary metabolites until 2005 [11,12]. Bioinformatic analysis of genomes from long known actinomycetes—for instance, the first whole genome sequenced actinomycete *Streptomyces coelicolor* A3 (2) [13] known to produce only a few secondary metabolites until then—revealed that many secondary metabolite genes are most likely not expressed under conventional laboratory conditions or their corresponding products cannot be detected with widely used analytical methods. This discrepancy translates into a "silent" potential of the microbial world for the production of yet unknown natural products. However, despite these encouraging insights, the majority of nature's microbial biosynthetic potential will remain elusive, if natural product research continues to rely exclusively on a classical culture-dependent discovery platform. Efforts must be intensified to exploit the majority of bacteria not cultured under laboratory conditions to date, and some suitable concepts and methods for that purpose have emerged recently [14].

#### *Outline of this Review*

In light of the numerous available review articles covering microbial natural products research (including highly specialised articles diving into specific sub-topics in much detail), we here aim to present a summary of selected technical and conceptual advances in the field of natural product research during the last decade. We focus on examples from the world of the actinomycetes, although conceptual developments using alternative producers have significant impact on this field [15–19]. The idea of exploring new habitats harbouring rare actinomycetes and their subsequent isolation and cultivation utilising novel cultivation methods serves as a starting point in Section 2. If large-scale cultivation is difficult to achieve or even not possible, culture-independent approaches are required to extend the number of microorganisms accessible for the discovery of novel natural products. Section 3 therefore outlines examples of natural product discovery by metagenomics and heterologous expression of selected pathways as an alternative means to investigate the uncultured environmental diversity and access its hidden natural product potential. Metagenomics mainly relies on prior examination of biosynthetic gene clusters with *in silico* bioinformatics tools. Hence, Section 4 deals with genomics-based approaches; in particular, genome mining to access silent biosynthetic pathways in cultured actinobacteria will be highlighted. Section 5 emphasises developments in analytical chemistry and the increasing use of metabolomics-inspired methods by combining data-driven and experimental approaches. The latter approaches today allow for the prioritisation of minute amounts of compounds that escaped detection previously.

#### **2. Exploring New Habitats**

Because of over-exploitation of terrestrial streptomycetes in almost all pharmaceutical and agrochemical companies as well as numerous academic groups, in recent years, the search for new bioactive molecules has moved towards their marine relatives [20], taxa found in the extreme environments [21], endophytic species [22] and on non-*Streptomyces* actinomycetes. Especially marine ecosystems represent one of the richest and underexploited habitats with great microbial diversity showing potential for discovery of novel and chemically diverse antimicrobial compounds [23]. Progress in the beginning of this millennium in the field of selective isolation methods has been achieved, improving access to these diverse actinomycetes. Since *Streptomyces* are the most dominant genus within the actinomycete group, it is necessary to increase chances for the isolation of desirable new species (enrichment) by reducing undesirable background, i.e., previously isolated species (pretreatment) [24]. A combination of thermal pretreatment, beneficial to promote the spore germination and improve the selective isolation of rare actinomycetes and chemical techniques to eliminate bacterial contaminants has proven successful for isolation of some rare actinomycetes [24]. An advanced method of pretreatment applies polyvalent bacteriophages specifically targeting different unwanted background bacteria [25]. Traditional enrichment is based on simple trial and error methods using different chemical and physical treatments to unearth the uncultured bacterial majority. Chaxalactin A–C production—a rare class of 22-membered macrolactone polyketides produced by *Streptomyces* strain C34 which was isolated from the Chilean hyper-arid Atacama Desert soil [26,27]—was highly dependent on the culture medium used for its growth, highlighting the influence of the culture medium on growth and secondary metabolism of actinomycetes. 
Other recent examples of empirical progress in the isolation of actinomycetes are the halotolerant *Nocardiopsis* sp. HR-4 strain producing (-)-7-deoxy-8-*O*-methyltetrangomycin [28] and the non-extremophilic *Streptomyces* spp. ERI-26 producing 6,6′-bis(1,5,7-trihydroxy-3-hydroxylmethylanthraquinone) [29,30]. The reason bacteria remain "unculturable" is that growth under laboratory conditions fails to mimic essential aspects of the environment, e.g., the absence of competitors [31]. One common method for simulation of environmental conditions uses diffusion chambers to enable the exchange of nutrients and growth factors within the two chambers while separating the isolates of interest from their competitors. Recovery rates up to 40% have been achieved, comparing the number of growing microcolonies with microscopic counts of cells in the initial inoculum, whereas the same inoculum on standard petri dishes yielded only a very low recovery rate of 0.05% [32]. A further developed and miniaturised *in situ* cultivation platform based on diffusion chambers is the isolation chip (iChip). The iChip is composed of several hundred miniature diffusion chambers, each of which is inoculated with an individual environmental cell [33]. To capture on average a single cell, the through-holes of the iChip plate are immersed into a suspension of mixed environmental cells. Covered with an upper and lower plate, the assembled iChip provides a miniature diffusion chamber for each single cell (Figure 1). This technology platform has shown great potential for upscaling the throughput and improving the isolation of previously uncultured bacteria in general. The polycyclic xanthone antibiotics neocitreamicins from *Nocardia* strain G0655 [34], along with other examples described in the literature, show that simulating the original environment enabled the isolation of novel strains [35]. 
The glycosylated macrolactams NOVO3 and NOVO4 from the strains *Streptosporangium* P1532 and *Amycolatopsis* Z0363, respectively [36], the ribosomal 16-membered cyclic peptide lassomycin from *Lentzea kentuckyensis* [37], and the antibiotic teixobactin from the new β-proteobacterial species *Eleftheria terrae* [38] have been isolated by using the iChip technology.
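The iChip loading step and the recovery figures quoted above can be made concrete with a short calculation. The sketch below assumes that cells distribute over the through-holes according to a Poisson distribution, which is a common modelling assumption for dilution-based single-cell capture and is not stated in the cited work; the function names are illustrative only.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in one through-hole (Poisson assumption)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def loading_fractions(mean_cells_per_hole: float) -> dict[str, float]:
    """Expected fractions of empty, singly and multiply occupied through-holes."""
    empty = poisson_pmf(0, mean_cells_per_hole)
    single = poisson_pmf(1, mean_cells_per_hole)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


def fold_improvement(recovery_percent: float, reference_percent: float) -> float:
    """Ratio of two recovery rates, both given in percent."""
    return recovery_percent / reference_percent


if __name__ == "__main__":
    # Suspension diluted so that each through-hole holds one cell on average.
    for name, fraction in loading_fractions(1.0).items():
        print(f"{name}: {fraction:.1%}")
    # Recovery rates quoted in the text: up to 40% versus 0.05% [32].
    print(f"fold improvement: {fold_improvement(40.0, 0.05):.0f}")
```

Under this assumption, an average of one cell per through-hole still leaves a substantial share of chambers empty or multiply occupied, which is why the text speaks of capturing a single cell only "on average".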

**Figure 1.** Scheme of isolation strategies, partly adapted from Nichols et al. [33] and Goers et al. [39]. (**A**) Soil sample or marine sample undergoes enrichment and/or pretreatment to increase the chance to isolate new species and/or reduce undesirable background from previously isolated strains. In general, empirically established methods for fermentation vary in incubation time, media composition, additives, pH and temperature to enable growth of desirable strains. Chaxalactin A produced by *Streptomyces* sp. C34 [26,27], 6,6′-bis(1,5,7-trihydroxy-3-hydroxylmethylanthraquinone) produced by *Streptomyces* spp. ERI-26 [29,30] and (-)-7-deoxy-8-*O*-methyltetrangomycin from *Nocardiopsis* sp. HR-4 [28] are examples of novel metabolites found using this conventional method. (**B**) Soil sample or marine sample is co-cultivated with other microorganisms to promote culturable isolates or to stimulate the secondary metabolism. Co-cultivation is categorised into microfluidic systems, petri dish co-culture systems, co-cultures on solid supports, co-culture systems using bioreactors and transwell systems [39]. Co-cultivation of *Actinokineospora* sp. EG49 and *Nocardiopsis* sp. RV163 induces the biosynthesis of three natural products, namely *N*-(2-hydroxyphenyl)-acetamide, 1,6-dihydroxyphenazine and 5a,6,11a,12-tetrahydro-5a,11a-dimethyl[1,4]benzoxazino[3,2-b][1,4]benzoxazine [40]. (**C**) Soil sample or marine sample is used to create a suspension of mixed environmental cells. The isolation chip (iChip) plate is immersed into this suspension to capture (on average) a single cell. Covered with an upper and lower plate, the assembled iChip provides a miniature diffusion chamber for each single cell [33]. NOVO3 [36], the cyclic peptide lassomycin [37] and the antibiotic teixobactin [38] were isolated from strains obtained by the iChip technology.

These case studies demonstrated how environmental conditions facilitate access to the diversity of yet uncultured microorganisms [32]. An alternative method for separation of uncultured bacteria is encapsulation of single bacterial cells in microdroplets of solidified agarose [41–43]. To acquire insights into metabolic requirements, a metatranscriptomic approach can be utilised. High-throughput sequencing of RNA transcripts gives access to the expressed genes of growing bacteria that cannot be cultured at large scale. Following this metatranscriptomics-based method, an uncultured *Rikenella*-like bacterium in the leech gut was found to utilise mucin as a carbon and energy source [44].

A different approach to simulate authentic circumstances in the environment comprises co-cultivation of different species derived from the same environment, which has a demonstrated, significant influence on the production of secondary metabolites [45–47]. The main co-cultivation systems can be categorised by the type of technology used into microfluidic systems, petri dish co-culture systems, co-cultures on solid supports, co-culture systems using bioreactors and transwell systems for co-cultivation, allowing asymmetric levels of separation between various numbers of populations [39]. Co-cultivation of *Actinokineospora* sp. EG49 and *Nocardiopsis* sp. RV163 induces the biosynthesis of three natural products, namely *N*-(2-hydroxyphenyl)-acetamide, 1,6-dihydroxyphenazine and 5a,6,11a,12-tetrahydro-5a,11a-dimethyl[1,4]benzoxazino[3,2-b][1,4]benzoxazine [40]. A hypothesised model of mechanisms on the molecular level tries to explain the necessity of co-cultivation; i.e., neighbouring bacteria decrease oxidative, metabolic and environmental stress, provide common nutrients or other growth factors, and supply siderophores to bind and solubilise Fe<sup>3+</sup> [31]. The Black Queen Hypothesis, a theory of reductive evolution, tries to explain how selection leads to helper-dependent bacterial isolates, which have lost the ability to perform essential processes on their own. In conclusion, the theory states that the "loss of a costly, leaky function is selectively favoured at the individual level and will proceed until the production of public goods is just sufficient to support the equilibrium community" [48]. Similar observations hold true for endosymbiotic marine bacteria, responsible for the prolific production of secondary metabolites [49]. Among the underexplored habitats, analysis of marine ecosystems has flourished (Figure 2). New actinomycetes were isolated from marine sediments, water columns at various depths and even from marine fauna. 
Screening of 20 samples from marine environments in Egypt, employing different methods for selective isolation, has yielded 112 *Streptomyces* isolates. Up to 76% of the isolates displayed activities against tested Gram-positive pathogens, up to 41% against Gram-negative pathogens, 32% against yeasts and 28% against fungi. Although depending on the type of media used, 28 isolates showed activity against methicillin-resistant *S. aureus* (MRSA), i.e., 25% of all obtained isolates. In addition, all *Streptomyces* strains showed significant nematicidal activity against second stage (J2) juveniles of the root-knot nematode *Meloidogyne incognita*, making them potential candidates for use in agriculture [50]. However, the value of these observations has to be confirmed, as the molecular structures of the bioactive compounds have yet to be elucidated. The one strain–many compounds (OSMAC) approach is a strategy to overcome the limited expression of biosynthetic pathways in microbes. Prioritised strains are cultured in a variety of different media compositions and fermentation conditions to maximise the amount and diversity of produced compounds [51]. Use of the OSMAC approach has enabled the recovery of a new thiopeptide antibiotic, TP-1161, from a marine sediment-derived *Nocardiopsis* isolate. The antimicrobial profile revealed no anti-Gram-negative activity; however, the activity against Gram-positive clinical isolates was comparable to that of vancomycin [52]. Another example of the OSMAC approach is the isolation of the anti-MRSA and anti-vancomycin-resistant *Enterococcus* armeniaspirols produced by *Streptomyces armeniacus* DSM 19369. The armeniaspirols are only produced by the "well-characterised" *S. armeniacus* when cultivated on a malt-containing medium [53].
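The OSMAC idea of culturing one strain under many conditions amounts to enumerating a grid of cultivation parameters. The following minimal sketch illustrates how such a grid might be laid out; the strain label, media names, temperatures and aeration modes are hypothetical placeholders chosen for illustration and are not taken from the cited studies.

```python
from itertools import product


def osmac_conditions(strains, media, temperatures_c, aeration_modes):
    """Enumerate every strain x medium x temperature x aeration combination."""
    return [
        {"strain": s, "medium": m, "temperature_c": t, "aeration": a}
        for s, m, t, a in product(strains, media, temperatures_c, aeration_modes)
    ]


# Hypothetical example values; a real campaign would use its own panel.
plan = osmac_conditions(
    strains=["strain_1"],
    media=["malt-containing medium", "minimal medium", "complex medium"],
    temperatures_c=[24, 30],
    aeration_modes=["shaken", "static"],
)

print(len(plan))  # 1 * 3 * 2 * 2 = 12 cultivation conditions
```

Even this small panel yields twelve fermentations for a single strain, which illustrates why OSMAC is typically applied to prioritised strains rather than to whole collections.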

**Figure 2.** Underexploited habitats of actinobacteria have attracted more attention for microbial natural product discovery. Currently, oceans [54], deserts [55], mountain ranges [30] and Antarctica [56], together with hot springs [57], endophytes [58] and symbionts [59], are focuses of the search for new bioactive compounds.

One of the more recent examples is the isolation and cultivation of the abyssal actinobacterium *Pseudonocardia carboxydivorans* M-227, isolated at 3000 m depth in the water column of the Avilés submarine canyon. It produces the two new antibiotics branimycin B and C (Figure 3). Branimycins are structurally related to nargenicins and are mainly active against *S. aureus*. However, both compounds were found to be moderately active against not only *S. aureus* (MIC values 32–64 μg/mL) but also Gram-negative bacteria (MIC values 20–80 μg/mL). It is important to point out that they exhibited activity against MRSA, several clinical *S. aureus* isolates, *E. faecalis* and even *H. influenzae* ATCC 49247 [60]. In contrast, the novel γ-butyrolactones ghanamycin A and B, isolated from a marine-derived strain *Streptomyces ghanaensis* TXC6-16, only displayed weak anti-Gram-negative activity against *Pseudomonas syringae*, *Erwinia* sp. and selected plant pathogens such as *F. oxysporum*, *A. solani* and *P. oryzae* [61]. Lobophorin E and F, new spirotetronate antibiotics, exhibited moderate antimicrobial activity against *B. thuringiensis* SCSIO BT01, *S. aureus* ATCC 29213 and *E. faecalis* ATCC 29212. Interestingly, lobophorin F also exhibits moderate cytotoxic activities on the human tumour cell lines SF-268, MCF-7 and NCI-H460, emphasising the influence of chemical modifications on the observed bioactivity [62]. Spithioneine A and B are structurally unique bohemamine-type pyrrolizidine alkaloids from a marine isolate, *Streptomyces spinoverrucosus*. They are especially interesting as they incorporate ergothioneine into a polyketide. Ergothioneine exhibits a wide variety of activities, such as inhibition of oxidative stress, promotion of neuronal differentiation and metal ion chelation, so it will be interesting to see the antimicrobial profile of these novel compounds incorporating this chemical moiety [63].

The question of whether marine-derived small molecules differ chemically from those produced by terrestrial strains remains a matter of debate among natural product researchers. The famous abyssomicin was first thought to be exclusively produced by a marine streptomycete. However, it was later shown to be produced by a terrestrial streptomycete [9,64–66] as well. Probably the best known example of marine and terrestrial actinomycetes producing similar lead compounds is salinosporamide A from *Salinispora tropica* [67], which features high structural similarity to the cinnabaramides A–G, isolated from the terrestrial *Streptomyces* sp. DSM 15324 [68]. The recently isolated ilamycins, produced by *Streptomyces atratus* SCSIO ZH16, a strain isolated from South China deep-sea sediment (Figure 2) [54], carry two unusual building blocks, L-3-nitro-tyrosine and L-2-amino-4-hexenoic acid. New ilamycins produced by a genetically modified strain show very good anti-tuberculosis activity, with MIC values against *M. tuberculosis* H37Rv in the single-digit nanomolar range (9.8 nM) [54].
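MIC values in this section are reported partly as mass concentrations (μg/mL) and partly as molar concentrations (nM), which cannot be compared directly without the molar mass of each compound. The sketch below shows the unit conversion only; the molar mass used in the example is a hypothetical round number, since the text does not state the molar masses of the compounds discussed.

```python
def ug_per_ml_to_micromolar(mic_ug_per_ml: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration (ug/mL) into a molar concentration (uM)."""
    # 1 ug/mL = 1e-3 g/L; dividing by g/mol gives mol/L; times 1e6 gives uM.
    return mic_ug_per_ml * 1e-3 / molar_mass_g_per_mol * 1e6


def nanomolar_to_ug_per_ml(conc_nm: float, molar_mass_g_per_mol: float) -> float:
    """Convert a molar concentration (nM) into a mass concentration (ug/mL)."""
    # nM -> mol/L (1e-9); times g/mol gives g/L; times 1e3 gives mg/L = ug/mL.
    return conc_nm * 1e-9 * molar_mass_g_per_mol * 1e3


# Hypothetical molar mass, for illustration only (not taken from the text).
ASSUMED_MOLAR_MASS = 1000.0  # g/mol

# The 9.8 nM value quoted above, expressed as a mass concentration.
print(nanomolar_to_ug_per_ml(9.8, ASSUMED_MOLAR_MASS))
# A 32 ug/mL MIC, expressed in micromolar under the same assumption.
print(ug_per_ml_to_micromolar(32.0, ASSUMED_MOLAR_MASS))
```

Under this assumed molar mass, a nanomolar MIC corresponds to a mass concentration several orders of magnitude below the μg/mL range quoted for most crude extracts and compounds in this section.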

**Figure 3.** Novel natural products deriving from new habitats.

#### *2.1. Extreme Environments as a Rich Source for Novel Strains*

Soil samples from extreme environments such as deserts and samples collected from higher altitudes offer potential to isolate many previously uncultured bacteria. A soil sample from Indian mountains uncovered a new *Streptomyces* sp. isolate, ERI-26, which produces a novel anthraquinone (Figure 2) [30]. The compound exhibits weak antibacterial activity (anti-Gram-positive and anti-Gram-negative activity >62.5 μg/mL) and moderate to good antifungal activity, with MIC values as low as 3.9 μg/mL [29]. Screening of the Egyptian desert revealed that 32 out of 75 isolated strains were active against the tested pathogenic organisms, showing that these extremophiles are metabolically active. According to the antimicrobial activities, 12.5% of the isolates were active against Gram-positive as well as Gram-negative bacteria and yeast strains [7]. However, without correlation of these observed activities to any known or unknown compounds, the actual benefit of these findings is hard to evaluate. The non-*Streptomyces* actinomycete *Saccharothrix* SA198, isolated from Saharan soil, has also led to the discovery of two novel antibiotics, A4 and A5 (Figure 2) [55]. Both exhibited good anti-Gram-positive and anti-Gram-negative activity (MIC values between 10 and 50 μg/mL), as well as a very good antifungal profile (MIC values between 1 and 5 μg/mL), but no activity against yeast [55]. Furthermore, the non-*Streptomyces* actinobacterium isolate LAM143cG3 was obtained from an underexplored sebkha lake (dry lakebed consisting primarily of salt) of Kenadsa (Bechar, Southwestern Algeria). Based on its chemical and morphological characteristics, this isolate was identified as a member of the genus *Spirillospora*. Agar diffusion tests showed that crude extracts are primarily and only moderately active on Gram-positive bacteria and have only weak activity on Gram-negative bacteria [69]. A similar antibiotic bioactivity profile was exhibited by the halotolerant *Nocardiopsis* sp. HR-4, recently isolated from a salt lake soil sample in the Algerian Sahara [28]; however, the activities were not connected to known or novel natural products. One of the more promising compounds is actinomadurol, isolated from the rare actinomycete *Actinomadura* strain KC 191, as it exhibits potent activity against *S. aureus*, *K. rhizophila*, and *P. hauseri* (MIC = 0.39–0.78 μg/mL) [70]. Actinomadurol does not inhibit the growth of various tested cancer cell lines, even at high concentrations (IC<sub>50</sub> > 100 μg/mL). The genera *Brevibacterium*, *Gordonia*, *Micromonospora*, *Arthrobacter*, *Demetria*, *Rhodococcus*, *Janibacter*, *Leifsonia*, *Dermacoccus*, *Kocuria*, *Lapillicoccus*, *Microbacterium* and *Nocardioides* were present in soil samples from Antarctica (Figure 2) [56]. These isolates were tested against the following pathogens: *C. albicans* ATCC 10231T, *S. aureus* ATCC 51650T, MRSA ATCC BAA-44T and *P. aeruginosa* ATCC 10145T. Of all the genera present, *Brevibacterium* showed the best antimicrobial activity: three isolates were active against *C. albicans*, one against MRSA and one against *S. aureus* [56].
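Hit rates such as "32 out of 75 isolated strains" come from small screening panels, so the proportion carries noticeable sampling uncertainty. As a rough illustration, the sketch below computes the hit rate together with a Wilson score interval; the choice of this interval and of a 95% level (z = 1.96) is an assumption made here and is not part of the cited studies.

```python
import math


def hit_rate(active: int, screened: int) -> float:
    """Fraction of screened isolates that showed activity."""
    return active / screened


def wilson_interval(active: int, screened: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = active / screened
    denom = 1.0 + z**2 / screened
    centre = (p + z**2 / (2 * screened)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / screened + z**2 / (4 * screened**2)
    ) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Figures quoted in the text for the Egyptian desert screening [7].
    rate = hit_rate(32, 75)
    low, high = wilson_interval(32, 75)
    print(f"hit rate {rate:.1%}, interval {low:.1%} to {high:.1%}")
```

The width of the interval is a reminder that hit rates from panels of a few dozen isolates should be compared across habitats with caution.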

Recently, thermal water sources have been explored in search of new antimicrobial entities. Analysis of the Tengchong hot springs revealed predominantly the presence of the genus *Streptomyces* (Figure 2) [57]. However, with the presence of *Actinomadura*, *Microbispora*, *Micromonospora*, *Nocardiopsis*, *Nonomuraea*, *Promicromonospora*, *Pseudonocardia* and *Verrucosispora*, a high number of rare non-*Streptomyces* actinobacteria could be found. Most of the isolates had weak activity against *A. baumannii* and some against *M. luteus* [57]. A much stronger antimicrobial profile was exhibited by *Streptomyces* sp. Al-Dhabi-1, collected from the Tharban hot spring in Saudi Arabia. Its ethyl acetate extract showed moderate to weak activity against tested pathogens, both Gram-positive and Gram-negative as well as fungi, with MIC values <1 mg/mL [71]. Other recently explored habitats are cave systems populated with actinobacteria, as their highly diverse microbial communities may exhibit antagonistic properties against pathogens. The search for antifungal agents against *P. destructans*, which causes white-nose syndrome in bats, led to the identification of 15 novel *Streptomyces* species, encouraging the further investigation of caves (Figure 2) as a valuable source of novel actinomycetes producing potent antimicrobials [59]. Unfortunately, most of the studies mentioned do not connect crude extract activity to known or novel natural products and are thus difficult to interpret regarding the potential for new antibiotics. Extreme environments may harbour many antibiotic natural products produced by a plethora of microorganisms, but solely investigating bioactivities without structural correlation to secondary metabolites is not beneficial for the discovery of potential new antibiotics.

#### *2.2. Endophytic Actinomycetes*

Endophytic actinomycetes represent a potential repository of novel bioactive compounds, as there are almost 300,000 plant species, each hosting one or more types of endophyte [72]. Discovery of new compounds is, however, hindered by their slow growth and their level of production, which is usually significantly lower than that of *Streptomyces* strains. The genus *Streptomyces* represents only 26% of all actinomycetes found in various plant tissues or in the rhizosphere. Culture broths of 333 strains from plant roots and 137 strains from rhizospheric soil showed that the isolated actinomycetes do have potential as new sources of antimicrobials. Among the isolates from plant roots, 8.5% were active against *B. subtilis* KB-211, 9.1% against *K. rhizophila* KB-212, 5.7% against *X. campestris* pv. *oryzae* KB-88, 3.4% against *M. racemosus* KF-223 and 1.7% against *C. albicans*, while none were active against *E. coli* [73]. In general, a larger share of the rhizosphere isolates showed antimicrobial activity. Overall, 20.3% of the rhizosphere isolates were active against *B. subtilis* KB-211, 17.1% were active against *K. rhizophila* KB-212, 8.9% were active against *X. campestris* pv. *oryzae* KB-88, 21.1% were active against *M. racemosus* KF-223, 9.8% were active against *C. albicans* and 8.1% were active against *E. coli* [73]. Bioprospecting rhizosphere isolates from the Caatinga biome (northeast Brazil) (Figure 2) [58] gave results similar to the above-mentioned example, where only 16% of isolates showed some activity against tested microorganisms. The most promising isolate was further investigated, and the crude ethanolic extract of *Streptomyces parvulus* D1.129 inhibited the growth of *S. aureus* and *B. subtilis* at 0.97 μg/mL. Despite the low percentage of isolates with antimicrobial properties, taxonomic difference and genome sequencing data strongly indicate the potential of these strains to produce novel bioactive compounds [58]. *Streptomyces* sp. strain RTd22, isolated from the Mexican sunflower *Tithonia diversifolia*, shows potential as a producer of novel natural products. Although analysis of the genome revealed an abundance of biosynthetic gene clusters (BGCs) not connected to known natural products, until now no novel bioactive compound has been isolated from this strain [74].
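The comparison between root and rhizosphere isolates reported in [73] is easier to follow when the percentages are placed side by side. The short sketch below tabulates the values quoted in the text and computes the difference in percentage points per test organism; the statement that no root isolate was active against *E. coli* is encoded as 0.0%, and the variable and function names are illustrative.

```python
# Percentages of active isolates as quoted in the text [73].
root_percent = {
    "B. subtilis KB-211": 8.5,
    "K. rhizophila KB-212": 9.1,
    "X. campestris pv. oryzae KB-88": 5.7,
    "M. racemosus KF-223": 3.4,
    "C. albicans": 1.7,
    "E. coli": 0.0,  # text: no root isolate was active against E. coli
}
rhizosphere_percent = {
    "B. subtilis KB-211": 20.3,
    "K. rhizophila KB-212": 17.1,
    "X. campestris pv. oryzae KB-88": 8.9,
    "M. racemosus KF-223": 21.1,
    "C. albicans": 9.8,
    "E. coli": 8.1,
}


def percentage_point_differences(reference, other):
    """Difference (other minus reference) in percentage points per organism."""
    return {name: round(other[name] - reference[name], 1) for name in reference}


differences = percentage_point_differences(root_percent, rhizosphere_percent)
for organism, delta in differences.items():
    print(f"{organism}: +{delta} percentage points in rhizosphere isolates")
```

For every test organism listed, the share of active isolates is higher among the rhizosphere isolates, which is the pattern the paragraph describes.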

#### *2.3. Symbiotic Actinomycetes*

Bacteria and insects are ubiquitously distributed and some species have naturally developed a symbiotic relationship guided through a co-evolutionary adaptation, where the bacteria are located in specialised anatomical compartments. It is known that attine ants live in a tripartite mutualism with the fungi *Leucoagaricus gongylophorus*, which provides food to the ants, and with antibiotic-producing actinomycetes. Attine ants are hypothesised to acquire actinobacteria from the soil, selecting and maintaining those species that produce useful antibiotics as a way of biocontrolling the fungal counterparts [75]. Similar findings were discovered for wasp—*Streptomyces* and termite—*Streptomyces* symbiosis. Sceliphrolactam was found to be produced by a *Streptomyces* strain isolated from a wasp exoskeleton and inhibits the growth of amphotericin B resistant *C. albicans* [76] (Figure 4). A termite-associated actinomycete, *Amycolatopsis* sp. M39, produces glycosylated polyketide macrolactams, namely the macrotermycins. They exhibit very strong activity against *S. aureus* and *B. subtilis* and a moderate activity against *C. albicans* and *S. cerevisiae* [77]. Another termite-associated natural product, natalamycin A, exhibits a very high and broad-range antifungal activity [78]. *Pseudonocardia* spp. isolated from exoskeletons from several genera of ants were found to produce gerumycins, piperazic acid-containing cyclic depsipeptides that selectively inhibit a fungal pathogen *Escovopsis* [79]. A strain from *Pseudonocardia* spp. isolated from ant colonies harboured a plasmid encoding the biosynthetic gene cluster of a rebeccamycin analogue. Rebeccamycin was originally isolated from *Lechevalieria aerocolonigenes* (*Nocardia aerocolonigenes*) [80]. Rebeccamycin and several analogues have a very pronounced antitumour activity as they inhibit the mammalian topoisomerase I [81]. *Streptomyces* sp. 
CLI250 isolated from a fungus-growing ant yielded an unusual peptide, tryptorubin A, that harbours a few unique moieties, such as the linkage between two tryptophans in *para* position to the indole nitrogen. Another unusual structural feature is the linkage of tyrosine aromatic rings to the indole nitrogen of tryptophan, a feature that has not been reported in any other natural product [82].

**Figure 4.** Novel natural products from symbiotic actinobacteria.

#### *2.4. Conclusions*

Underexplored habitats represent a largely untapped repository of compounds with unique chemical structures. Overall, new actinomycete isolates frequently exhibit moderate to strong activity against Gram-positive bacteria and fungi. Active compounds against Gram-negative bacteria remain a scarce finding in most habitats whereas the likelihood of finding anti-Gram-negative active natural products appears higher in rich, strongly diverse terrestrial communities. This is probably due to increased selective pressure as a consequence of the struggle of microbes to prevail in their fight for nutrients and territory in densely populated niches. In extreme environments, Nature itself already stamped out a large portion of the competition by making the habitat difficult to live in; therefore, it was previously assumed that the need of remaining microbes to utilise secondary metabolites is greatly decreased [83]. However, recent investigations propose that extremophile microorganisms are in fact able to produce diverse secondary metabolites [84]. The picture is admittedly incomplete since only few extremophiles have been systematically screened for the production of secondary metabolites; nevertheless findings from those that have been analysed are encouraging [84]. Access to products from extremophiles is also limited due to the lack of knowledge on how to isolate and cultivate them under laboratory conditions. As part of increased efforts to overcome this bottleneck, technologies such as the iChip platform can complement empirical improvements in isolation and fermentation of previously uncultured bacteria. Such novel microbiology-based methods are still underrepresented in most natural product discovery pipelines. It is worth taking into consideration that current discovery pipelines are more focused on exploiting new non-extreme habitats to obtain rare actinomycetes, which can be seen as a strategy to circumvent the need for extraordinary cultivation conditions. 
Although this approach cannot unlock access to extremophiles and uncultured microorganisms it is understandably the preferred approach implemented in many laboratories: as soon as microorganisms can be maintained under laboratory conditions, it is possible to make use of genome mining, apply genetic tools for the activation of silent biosynthetic gene clusters and scrutinise secondary metabolomes using a panel of sensitive analytical methods.

Taken together, access to the full complement of novel natural products within a single microorganism is only achievable through a combination of available methods from multiple disciplines. Even without maintaining a constant supply of novel actinomycete isolates, these combined approaches can increase the availability of novel natural products. Considering that the majority of microorganisms cannot be cultured under laboratory conditions, the development of culture-independent approaches is indispensable before their metabolites can be investigated. Thus, eliminating the inherent bottleneck of cultivation is the distinguishing idea of metagenomic approaches, which are based on prior examination of biosynthetic gene clusters with *in silico* bioinformatics tools to enable heterologous expression of selected pathways originating from uncultured microorganisms.

#### **3. Metagenomic Approach to Exploit the Uncultured Bacterial Majority**

Since, according to 16S ribosomal RNA (rRNA) analyses, only an estimated 1% of bacteria have been cultured under laboratory conditions to date [14], novel methods are required to access the greater diversity of natural products, circumventing the limitations of traditional culture-based approaches. Metagenomics as a culture-independent approach is based on DNA recovery directly from the environment (eDNA) to gain access to the hidden reservoir of secondary metabolite encoding sequences [85]. Metagenomic approaches in combination with classical culture-based approaches have already become an important means to obtain deep insights into the microbial response to contamination or bioremediation techniques [86]. For example, microbial communities were used as environmental biosensors for nitrate [87], uranium [87,88] or oil [89] contamination by using high-throughput 16S rRNA sequencing to determine the taxonomic composition of the microbial community. Since several key taxa are indicative of a particular contaminant, this metagenomic screening enables monitoring of the presence and extent of contamination in the environment [86]. This part of the review therefore focusses on the recent developments in applied metagenomics employed in natural product research and aims to give an account of such culture-independent methods and the technologies involved in accessing secondary metabolites from actinobacteria.

#### *3.1. The Metagenomic Screening Workflow*

In general, there are two main strategies for applying metagenomic methods. The older strategy relies on functional screening of individual eDNA clones and is carried out with the help of huge cosmid libraries. Clones are typically selected by phenotypic readout such as visually detectable signals, bioactivity or more advanced screening procedures such as reporters/biosensors or specific catalytic assays. The more recent strategy is based on biosynthetic considerations, relying on DNA sequence similarity of conserved biosynthetic genes to selectively access potential BGCs within the environmental sample. *In silico* selection of potentially interesting BGCs and subsequent heterologous expression can be used to finally access novel compounds [90]. Figure 5 summarises the main steps in the metagenomics workflow for both methods. Environmental samples are collected from ecologically and geographically diverse environments. In the classical functional metagenomic screening approach, metagenomic libraries are created by extracting environmental DNA (eDNA), cloning and ligating it into a shuttle vector and subsequently transforming it into appropriate heterologous hosts. The metagenomic library is screened either for observable phenotypes or for the presence of a target DNA sequence. Positive clones are recovered from the metagenomic library and the eDNA insert is sequenced. The antibacterial pigment violacein was isolated from soil metagenomic libraries by applying the classic functional metagenomic screening pipeline [91]. In the targeted sequence-based metagenomic screening approach, crude eDNA is obtained from an environmental sample and screened by polymerase chain reaction (PCR) amplicons specific for sequences within BGCs. DNA sequence tags are phylogenetically organised, evaluated for biosynthetic origin and compared to reference databases, and biosynthetic gene clusters are reassembled *in silico*. 
A metagenomic library is generated from environmental samples harbouring the *in silico* reassembled BGC of interest. Environmental DNA is extracted and screened for the specific sequence of interest. For both approaches, the novel BGCs are finally assembled and modified for heterologous expression in an appropriate host, and the produced natural product is isolated and structurally elucidated. An example of targeted sequence-based metagenomic screening is the proteasome inhibitor landepoxcin A, which was discovered by comparing ketosynthase (KS) domain sequence tags to the KS domains from known epoxyketone biosynthetic gene clusters in order to identify epoxyketone proteasome inhibitor derivatives in metagenomic samples [92].

**Figure 5.** Scheme of metagenomic discovery pipelines adapted from Katz et al. (2016) [93]. Environmental samples are collected from ecologically and geographically diverse environments. (**A**) The classical functional metagenomic screening is based on the generation of metagenomic libraries by using appropriate heterologous hosts. Afterwards the metagenomic library is screened either for observable phenotypes or for the presence of a target DNA sequence. (**B**) Crude eDNA is obtained from an environmental sample and screened by PCR amplicons specific for sequences within the biosynthetic machinery. DNA sequence tags are used to obtain *in silico* reassembled biosynthetic gene clusters (BGCs). A metagenomic library is generated from environmental samples harbouring the *in silico* reassembled BGC of interest. Environmental DNA (eDNA) is extracted and screened for the specific sequence of interest. (**C**) Novel BGCs are finally assembled and modified for heterologous expression in an appropriate host and the produced natural product is isolated and structurally elucidated. Violacein is a natural product obtained by approach (**A**) [91], and landepoxcin is a natural product obtained by approach (**B**) [92].

One of the major obstacles in the field of metagenomics is the use of the model host organism *E. coli*, with its limited capability for heterologous expression of complex natural products, together with the limited insert size of cosmid-based libraries. These limited heterologous expression capabilities, in particular for secondary metabolites with correspondingly huge modular biosynthetic architecture such as polyketides, nonribosomal peptides and hybrids of these, are not surprising, since *E. coli*, in contrast to actinobacteria and myxobacteria, is not a prolific producer of secondary metabolites [94]. Circumventing these major obstacles will be the key to improving the output of novel natural products from the metagenomics approach in the future. The use of broad-host-range shuttle vectors allows effective eDNA clone shuttling to test heterologous expression in taxonomically diverse organisms [95]. Another approach to overcome the limited heterologous expression capabilities of *E. coli* is to increase the transcription and translation of exogenous DNA by changing the host organism. Compared to *Streptomyces*, *E. coli* contains only half the number of different RNA polymerase sigma factor species. Furthermore, *Streptomyces* has been shown to express more genes deriving from metagenomic libraries than *E. coli* [96], revealing it as a potent host organism for heterologous expression. Because *E. coli* remains one of the few organisms that combine easy genetic manipulation with fast growth, measures other than changing the host organism were taken into consideration to overcome its limitations. Overexpression of the missing sigma factor σ<sup>54</sup> led to the production of oxytetracycline during heterologous expression in *E. coli* of the previously silent oxytetracycline BGC originating from *Streptomyces rimosus* [97]. This is just one powerful example demonstrating the utility of modified *E. coli* strains as heterologous platforms for metagenome-derived BGCs, indicating that even codon usage bias and the availability of starter and extender units are surmountable obstacles [98–100]. In particular, *E. coli* has proven to be a useful host for the heterologous expression of short biosynthetic pathways such as ribosomally synthesised and post-translationally modified peptides (RiPPs) [101], since the underlying biosynthetic machinery, the ribosome, is provided by the primary metabolism.

Even with a suitable broad-host expression platform at hand, heterologous expression remains the critical part in the metagenomic workflow [102,103]. It is a common finding after reassembly of the whole BGC that most of the unmodified BGCs remain silent. To overcome this issue, several different non-targeted approaches are currently utilised, such as simulation of environmental conditions through microcolony cultivation [104], co-cultivation [105] and the use of histone deacetylase (HDAC) inhibitors [106]. Targeted approaches to promote heterologous expression comprise the replacement of inactive native promoters by strong artificial or active native promoters that are functional in the heterologous host, placed upstream of either positive regulatory elements or individual biosynthetic genes. The operon structure of BGCs often requires the replacement of several promoters to enable heterologous expression; for example, the anti-proliferative compounds lazarimide A and B, deriving from a silent eDNA BGC, could only be expressed after yeast homologous recombination was employed to perform multiplex promoter exchange [107]. Another case study demonstrates that every gene from the eDNA-derived BGC had to be controlled by a separate inducible promoter to finally produce erdasporine, a novel carboxy-indolocarbazole containing tryptophan dimer [108].

#### *3.2. Direct Functional Metagenomic Screening*

Functional metagenomic screening is mainly but not exclusively based on phenotypic readouts such as visual detection and chromatographic separation, which are attributed to the production of new small molecules. Another simple strategy is testing for growth inhibition against pathogenic microorganisms in a top agar overlay assay [109]. More advanced approaches use reporter/biosensor-based screens such as metabolite-regulated expression (METREX) and substrate-induced gene expression screening (SIGEX). METREX screening can be used to screen for production of quorum-sensing inhibitors able to bind to the LuxR transcriptional activator, which induces expression of a target gene. This leads to accumulation of a green fluorescent protein (GFP) used as the reporting system for positive cosmid clones inducing bacterial quorum sensing [110]. SIGEX is based on the substrate-induced expression of a catabolic gene, with a reporting system for positive cosmid clones similar to that of METREX [111]. Both methods can be used for discovering new genes and gene products that would not have been detected with a sequence-based approach, since no prior knowledge of the gene and gene product is necessary. However, the number of natural products discovered by functional metagenomics is to date rather disappointing, with only a few successful examples such as the antibacterial pigment violacein [91], indirubin with its isomer indigo [112], *N*-acyltyrosines [113] and the turbomycins [114], as shown in Figure 6. All of these examples are simple non-actinomycete-derived secondary metabolites, synthesised by very few biosynthetic enzymes. Hence, functional metagenomic screening has so far failed to access the complexity of actinomycete secondary metabolism.

Enzymatic activity unique to or predominantly present in secondary metabolism can be used for advanced functional phenotypic screening to improve the identification of functional BGCs [115,116]. Owen et al. [117] described, for example, a screening approach based on the single-module nonribosomal peptide synthetase (NRPS) *bpsA* as a reporter gene, which synthesises the coloured indigoidine from L-glutamine upon activation by 4-phosphopantetheinyl transferase (PPTase)-containing clones. *PPTase* genes are often located in close proximity to a BGC; however, most of the BGCs are remote from their required *PPTase* gene, since most microorganisms possess only one *PPTase* gene in their genome that is involved in secondary metabolism [118]. Therefore, indigoidine production is in general a powerful indicator to identify positive eDNA clones containing BGCs deriving from NRPS or polyketide synthase (PKS) pathway machinery. Five years after this functional *bpsA* gene expression-type BGC screening method was invented, Brady and co-workers screened a soil eDNA library hosted in *S. albus::bpsA* Δ*PPTase* and identified clones containing NRPS, PKS and NRPS-PKS biosynthetic gene clusters, resulting in the rediscovery of myxochelin A, the biosynthesis of which is dependent on a very simple NRPS [119]. This study is a proof of concept for an innovative functional screening method, although it suffers from a rediscovery issue at the DNA sequence level that parallels the high rediscovery rate of traditional culture-based approaches at the metabolite level. Furthermore, the relatively simple structure with its underlying plausible biosynthetic pathway, consisting of four biosynthetic proteins (with only one NRPS module) [120], calls into question the general applicability of this method for more complex biosynthetic pathways, since many BGCs exceed the typical insert size of a cosmid, which prevents functional detection.

**Figure 6.** Example of natural products obtained by direct functional metagenomic screening.

#### *3.3. Sequence-Based Metagenomic Discovery Efforts*

In contrast to functional metagenomic screening, the identification of potential BGCs through the sequence-based metagenomics approach relies on *in silico* analysis of sequence tags representing microbial genomes for the initial screening process instead of attempting heterologous expression. The complex mixtures of PCR amplicons, consisting of numerous domain fragments from potential biosynthetic gene clusters in environmental samples, are referred to as Natural Product Sequence Tags (NPSTs). Unlike whole genome sequences, those datasets consist of much shorter DNA stretches, which hampers the identification of biosynthetic gene clusters. Therefore, conventional tools in the field of natural product research such as antiSMASH [121,122] are less suitable for analysing those datasets. Predictive tools such as the environmental Surveyor of Natural Product Diversity (eSNaPD) [123] and the Natural Product Domain Seeker (NaPDoS) [124] are based on the phylogenetic relationships of sequence tags, which enables searching for both close and distant relatives in the large datasets [125]. The advantages of using eSNaPD to organise NPSTs are the low computational power required and the comparably low cost of sequencing small PCR amplicons in contrast to whole genomes. One difficulty with sequence-based metagenomic discovery is the enormous size and density of soil metagenomes, where a single sample can contain up to 10<sup>5</sup> unique genomes [93]. Furthermore, metagenomic DNA libraries are sparsely populated with biosynthetic genes of interest, since only a small fraction of each genome is devoted to secondary metabolism [126]. Therefore, gene cluster enrichment strategies can be used to simultaneously reduce the size and increase the density of biosynthetic sequence tags within eDNA libraries. Employing PCR with degenerate primers to amplify biosynthetic genes of interest was shown to be more sensitive than shotgun sequencing in identifying variants of biosynthetic domain sequences with known architecture [127]. However, these amplicon sequencing-based methods are inherently biased with regard to the detection of standard gene organisation such as RiPPs, hybrid NRPS-PKS and nucleoside antibiotic sequences [101,128]. In general, modern sequence-based methods are centred on amplifying conserved biosynthetic enzymes via degenerate primers to construct metagenomic libraries rather than analysing crude eDNA at random, since the majority of DNA sequences are not associated with biosynthetic pathways. A promising conserved biosynthetic gene utilised to find novel metagenomic-based natural products is, for example, the KS-β [129]. 
Identification of tailoring and other enzymes occasionally involved in biosynthesis of natural products such as sulfotransferases [130], esterases [131], lipases [132], β-galactosidases [133] and isonitrile synthases [134] could also be utilised to obtain novel natural products such as new sulfated glycopeptide derivatives [130] or (*E*)-3-(2-isocyanovinyl)-1*H*-indole [134] from metagenomic samples. However, those specialised screening approaches of individual biosynthetic sequences are not offering a fully systematic method for accessing the complex entirety of available metagenomes, since these enzymes are also distributed in primary metabolism, and are not common for secondary metabolism, in particular esterases and lipases.
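
The role of degenerate primers in these sequence-based screens can be illustrated with a short sketch. The following Python example expands a degenerate oligonucleotide, written with standard IUPAC nucleotide codes, into the set of concrete primer sequences it represents and reports its degeneracy. The primer `GCNTAY` is a hypothetical placeholder and is not taken from any of the cited studies; the function names are likewise illustrative assumptions.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    primer = "GCNTAY"  # hypothetical example, not a published primer
    print(degeneracy(primer), expand(primer))
```

A higher degeneracy widens the range of domain variants that can be amplified, at the cost of a lower effective concentration of each individual primer sequence.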

Using the sequence-based metagenomic approach, Brady and co-workers obtained the rare tryptophan dimers hydroxysporine and reductasporine [135], the cytotoxic anthracycline arimetamycin A [136], the tetracyclic MRSA-active antibiotic tetarimycin A [137], the antibiotic cyclic lipopeptides malacidins A and B [138] and the epoxyketone proteasome inhibitors clarepoxcins A–E and landepoxcins A and B [92] (Figure 7). These natural products were obtained by using *Streptomyces albus* as heterologous host, underlining the value of this strain as a gifted heterologous host, in particular for actinomycete-derived BGCs [139]. Recent developments in utilising different so-called broad-host heterologous expression platforms for microbial natural product biosynthetic pathways, in particular various hosts from different phylogenetic backgrounds such as myxobacteria, were described by Müller et al. [102]. The calcium-dependent antibiotics malacidins A and B were discovered by using degenerate primers targeting NRPS adenylation domains to generate amplicons from an arrayed collection of eDNA isolated from 2000 unique soil samples. The malacidins are structurally distinct from known calcium-dependent antibiotics such as daptomycin and friulimicin, which emphasises the value of the sequence-guided metagenomic pipeline in probing for new congeners of a natural product family. In contrast, the tryptophan dimers were discovered by screening diverse soil samples with degenerate primers for the presence of unique chromopyrrolic acid synthase (CPAS) sequences. This approach demonstrates the large-scale sequence tag-based pre-filtering of diverse environmental samples as input for the metagenomic pipeline to push the discovery of novel BGCs [135]. It is also possible to use NPST libraries to recover natural product genes from metagenomic libraries, since the unique sequence label (barcode) identifies the BGC of interest and the specific subpool of the library that contains it [125].

However, many BGCs exceed the typical insert size of a cosmid, and thus several overlapping cosmid clones spanning the complete biosynthetic gene cluster have to be reassembled into a bacterial artificial chromosome (BAC) by the frequently used method of transformation-associated recombination (TAR). Afterwards, the reassembled BGC can be transferred from yeast into different bacterial strains for heterologous expression. Consequently, this approach is a heterologous expression based on a metagenomic library rather than an expression of metagenomic samples [140].

**Figure 7.** Example of natural products obtained by sequence-based metagenomic approaches.

#### *3.4. Metagenomics for the Assessment of Marine Endophytes*

In contrast to the extensively investigated soil-dwelling bacteria, the marine environment is still less investigated in terms of natural product research (see Section 2.1) [12]. The major drawback of marine natural product discovery is that cultivation of bacteria is frequently challenging and production of compounds of interest is often not possible in sufficient amounts for structure elucidation [141]. Occasionally, serendipitous discovery of alternative producers of marine natural products might enhance the production and subsequently help to characterise the underlying biosynthetic pathway. One example is the bengamides, a class of natural products that have been characterised as inhibitors of methionine aminopeptidases, emphasising their potential as anticancer compounds [142]. Previously, they were assumed to derive exclusively from the sponge *Jaspis* cf. *coraciae*, until they were found in the terrestrial myxobacterium *Myxococcus virescens* ST200611 (DSM 15898), which allowed studies to be conducted on the biosynthesis of bengamides, their heterologous expression and the self-resistance mechanism of their producer [143,144]. However, when no alternative producer can be found, metagenomic approaches play a crucial role in accessing the diversity of marine natural products by identifying biosynthetic gene clusters for the heterologous production of secondary metabolites [109]. This approach can provide information about the origin of biosynthetic gene clusters, as compounds thought to be produced by sponges were later shown to originate from their bacterial symbionts [145]. One example is the unusual antifungal peptides, the microsclerodermins, which had been reported to derive exclusively from the marine genera *Microscleroderma* and *Theonella* [146,147]. Recent findings, however, revealed the terrestrial alternative producers *Jahnella* sp. MSr9139 and *Sorangium cellulosum* So ce 38, which produce microsclerodermins and pedeins along with additional derivatives [148]. The genomic sequences of these two myxobacterial producers made it possible to propose a conclusive biosynthetic model for the underlying pathway. Combined with recent metagenomic studies emphasising myxobacteria as putative sponge symbionts, these outcomes provide evidence that a terrestrial bacterial symbiont might be the real biosynthetic source of this "marine" natural product [149]. Considering that seawater bacteria are presumably 10-fold less represented in cultured isolates than soil-dwelling bacteria [150], culture-independent methods in the marine environment could be expected to access a larger untapped reservoir of chemical and biological diversity of natural products than in soil. The following section briefly highlights recently conducted metagenomic case studies, which were guided by a previously characterised structure or associated biological activity. The first and second examples describe the biosynthetic characterisation through metagenomics of the endosymbiont-derived polytheonamides and calyculin A; the last case study highlights the recent discovery of the anti-HIV lanthipeptides, the divamides, which exemplifies the convergence of traditional functional screening and sequence-based metagenomics (Figure 8).

**Figure 8.** Example of marine natural products with elucidated biosynthesis achieved through metagenomic approach.

The marine sponge *T. swinhoei* contains numerous uncultivated bacterial symbionts and is a prolific source of bioactive secondary metabolites [151]. One of those marine natural products with a remarkable structure is the 48-residue peptide polytheonamide, which incorporates 13 nonproteinogenic amino acids. Therefore, it was assumed for years that the marine cytotoxic polytheonamide originates from a nonribosomal peptide biosynthesis, despite the gigantic size of a 48-residue NRPS machinery [152]. To clarify the biosynthetic origin of the polytheonamides, a semi-nested PCR approach with primers designed to be specific for a precursor peptide consisting of proteinogenic L-configured amino acids was conducted on a *T. swinhoei* metagenomic sample [153]. The assembled sequenced amplicons revealed 11 clustered genes, with seven open reading frames forming an operon. Furthermore, several bacterial transposition elements and Shine–Dalgarno sequences were found; these findings not only confirm the ribosomal origin but also suggest that the polytheonamides are produced by bacterial endosymbionts [154]. Further studies of the sponge *T. swinhoei*, involving single-cell genomics assisted by enriched bacterial fractionation, fluorescence-activated cell sorting and whole genome amplification combined with pathway-specific PCR, revealed the *Entotheonella* spp. bacterial symbiont as the native producer. Similarly, *Candidatus Entotheonella* spp. was assigned as the non-cultured bacterial symbiont of *T. swinhoei* responsible for the production of onnamides [155] (Figure 8).

Calyculin A, first isolated from the marine sponge *Discodermia calyx* in 1986, is a cytotoxic PKS-NRPS hybrid natural product [156]. Since further calyculin-related natural products have been isolated from different sponges, it was assumed that associated bacterial symbionts produce those compounds [157]. Recently, the biosynthetic gene cluster was identified by a metagenomic approach comparable to the methods described for the identification of the BGCs of onnamides and polytheonamides. The PKS-NRPS hybrid origin of calyculin led to the design of primers targeting trans-acyl transferase (AT)-type KS domains, adenylation domains and 3-hydroxy-3-methylglutaryl-coenzyme A synthase (HMGS)-like motifs, using *D. calyx* metagenomic DNA as a template. In total, 250,000 clones of a metagenomic library of *D. calyx* total DNA had to be screened by applying a sophisticated pooling strategy [158] before the calyculin BGC could be assigned [159]. Furthermore, using the biosynthetic gene cluster as a probe, catalysed reporter deposition-fluorescence *in situ* hybridisation (CARD-FISH) combined with laser microdissection revealed a filamentous bacterium with 97% identity in the 16S rRNA sequence to the *Candidatus Entotheonella* symbiont from *T. swinhoei* sponges.
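
To illustrate why pooling reduces the screening effort for large clone libraries, the following Python sketch simulates a generic row-and-column pooling scheme on a single arrayed plate. This is an assumption made for illustration only and is not the specific pooling strategy of reference [158]; the function `pool_screen` and the callback `is_positive`, which stands in for a PCR result on an individual clone, are hypothetical names.

```python
from itertools import product
from typing import Callable


def pool_screen(
    n_rows: int,
    n_cols: int,
    is_positive: Callable[[int, int], bool],
) -> tuple[list[tuple[int, int]], int]:
    """Simulate row/column pooling on an n_rows x n_cols clone array.

    A pool is scored positive if at least one clone in it is positive.
    Returns the candidate wells (intersections of positive row and
    column pools) and the number of pooled reactions that were needed.
    """
    row_hits = [
        r for r in range(n_rows)
        if any(is_positive(r, c) for c in range(n_cols))
    ]
    col_hits = [
        c for c in range(n_cols)
        if any(is_positive(r, c) for r in range(n_rows))
    ]
    candidates = list(product(row_hits, col_hits))
    reactions = n_rows + n_cols  # one reaction per row pool and column pool
    return candidates, reactions


if __name__ == "__main__":
    positives = {(3, 7)}  # hypothetical position of the clone of interest
    wells, reactions = pool_screen(8, 12, lambda r, c: (r, c) in positives)
    print(wells, reactions)
```

For a single positive clone, the intersection of the positive row and column pools identifies the well directly, with 20 pooled reactions instead of 96 individual ones on an 8 × 12 plate. With several positives on one plate, the intersections also contain false candidates, which have to be confirmed individually.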

An organic extract of the *Prochloron*-harbouring tunicate *Didemnum molle* (E11-036) from the Eastern Field of Papua New Guinea represented a promising hit in an anti-HIV assay. However, further assay-guided fractionation and spectroscopic analysis could not reveal the complete chemical structure of the natural product responsible for the observed antiviral activity, due to the very limited sample size. The natural product could at least preliminarily be determined as a novel peptide containing the modified amino acids lanthionine and *N*-trimethylglutamate, as well as the partial peptide sequence "GTTR". The acquired assembled metagenome was searched for the amino acid motif GTTR and for overrepresentation of thioether cross-linked amino acids (cysteine, threonine and serine at the C-terminal end), from which lanthipeptides derive their name [160], yielding one putative divamide BGC. Combined *in silico* analysis of the biosynthetic pathway and re-examination of the acquired NMR (nuclear magnetic resonance spectroscopy) data culminated in the structure elucidation of the divamides. Moreover, heterologous expression in *E. coli* of different divamide BGCs originating from the symbiotic cyanobacterium *Prochloron didemni* confirmed the predicted structure and the biosynthetic pathway [161]. The discovery of the divamides highlights the synergy of different strategies in the search for novel bioactive natural products, whereby biological activity within an ecologically relevant system is first detected by functional screening, and chemical and metagenomic approaches then follow up on the promising biology. The scarcity of isolated material is circumvented by applying NMR structure elucidation combined with metagenomics and synthetic biology to fully characterise newly discovered natural products and their underlying biosynthesis.
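
The motif-guided search described above can be sketched in a few lines. The following Python example translates a DNA contig in all six reading frames using the standard genetic code and reports where a short peptide motif such as GTTR occurs. It is a minimal illustration under stated assumptions, not the procedure used in reference [161]; the function names and the test sequence are hypothetical, and the additional filter for cysteine-, threonine- and serine-rich C-terminal regions is not implemented here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; codons with ambiguous bases give 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def find_motif(dna: str, motif: str = "GTTR") -> list[tuple[str, int, int]]:
    """Return (strand, frame, residue index) for every motif occurrence."""
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(motif)
            while pos != -1:
                hits.append((strand, frame, pos))
                pos = protein.find(motif, pos + 1)
    return hits


if __name__ == "__main__":
    contig = "AGGTACTACCCGT"  # hypothetical contig encoding G-T-T-R in frame 1
    print(find_motif(contig))
```

Because a four-residue motif is expected to occur by chance in a large metagenome, such hits only become meaningful in combination with further criteria, as was done for the divamides.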

The last case study demonstrates the use and value of metagenomic approaches in terms of biosynthetic gene cluster identification for the discovery of new natural products from marine endophytes. The existing examples to date have afforded mainly PKS, NRPS or PKS-NRPS hybrids, assembled by gigantic modular biosynthetic pathway machineries. The high mutual similarity of corresponding domain types, in particular KS and AT domains in PKS and condensation (C) and adenylation (A) domains in NRPS, greatly facilitates the *in silico* identification of genes encoding these gigantic microbial biosynthetic pathway machineries, as long as they follow the so-called collinearity rule [162]. In contrast, other types of biosynthetic pathways such as those for RiPPs and nucleoside antibiotics may be much smaller and less obviously clustered than multifunctional NRPS and PKS gene clusters and are thus more difficult to identify. However, in terms of heterologous expression, the smaller RiPP BGCs are much easier to express than PKS, NRPS or PKS-NRPS hybrids with their underlying advanced biosynthetic machinery. In terms of structural features, they may nevertheless be at least equally interesting. Additionally, the biosynthetic characterisation of the polytheonamides demonstrates the use of modern single-cell separation and sequencing technologies, which might accelerate the future development of metagenomic approaches.

#### *3.5. Sequence Boom: Potential of Next Generation Sequencing and Single-Cell Genomics*

With whole genome sequencing becoming affordable, it became obvious that genome sequencing of single cells would open new perspectives for genetic analysis. In the field of metagenomics, single-cell sequencing of microorganisms enables the genome assembly of new phyla and grants new insights into the microbial dark matter, as it can be used for "non-culturable" bacteria [163], such as those described for *T. swinhoei* above. Furthermore, single-cell genomics provides a valuable advantage by reducing the complexity of the genomic signal through the physical separation of cells or chromosomes. However, obtaining the whole genome from the single DNA molecule harboured by an individually isolated cell is a technically challenging procedure. In particular, the guanine-cytosine (GC) rich genomes of actinobacteria and myxobacteria, both prolific sources of natural products [164,165], highlight the problems of whole genome sequencing techniques relying on short reads (typically in the range of several hundred base pairs) [166]. The four major technical challenges in the field of single-cell genomics are the efficient isolation of individual cells, the amplification of the genome of the specific cell to circumvent the scarcity of genomic material, the sequencing of the amplified DNA and the evaluation of the acquired data [163]. Three types of whole genome amplification (WGA) methods are now commonly applied: pure PCR-based methods such as degenerate oligonucleotide primed PCR (DOP-PCR), isothermal methods such as multiple displacement amplification (MDA) and hybrid methods such as multiple annealing and looping based amplification cycles (MALBAC) and PicoPLEX. PicoPLEX is based on an initial isothermal preamplification followed by PCR amplification of the amplicons generated during the first step. Among the WGA methods, DOP-PCR achieves higher coverage uniformity but a low physical coverage of the genome. MDA, in contrast, exhibits greater genome coverage but lower coverage uniformity, and the hybrid methods MALBAC and PicoPLEX are characterised by intermediate coverage and uniformity [163]. A convergent approach of single-cell genomic sequencing based on MDA and metagenomic library screening accelerated the identification of the biosynthetic gene cluster of apratoxin A, a potent cytotoxic compound [167]. Furthermore, the genomes of two chemically distinct *Entotheonella* symbionts, the producers of onnamides and polytheonamides, could be reassembled by combining single-cell isolation with MDA-based whole genome amplification [155].
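
The distinction between genome coverage and coverage uniformity drawn above can be made concrete with a small Python sketch. It assumes a per-base read depth profile is already available (for example from a read alignment). The two metrics shown, breadth of coverage and the coefficient of variation of depth, are common illustrative choices and not necessarily those used in [163].

```python
from statistics import mean, pstdev
from typing import Sequence


def coverage_breadth(depth: Sequence[int], min_depth: int = 1) -> float:
    """Fraction of genome positions covered by at least `min_depth` reads."""
    return sum(d >= min_depth for d in depth) / len(depth)


def coverage_cv(depth: Sequence[int]) -> float:
    """Coefficient of variation of read depth over the covered positions.

    Lower values indicate a more uniform amplification.
    """
    covered = [d for d in depth if d > 0]
    if not covered:
        return float("nan")
    return pstdev(covered) / mean(covered)


if __name__ == "__main__":
    # Hypothetical toy profiles, not measured data.
    uniform_but_narrow = [5, 5, 5, 5, 0, 0, 0, 0, 0, 0]
    broad_but_uneven = [1, 1, 1, 1, 1, 1, 1, 1, 92, 0]
    for label, profile in [("uniform/narrow", uniform_but_narrow),
                           ("broad/uneven", broad_but_uneven)]:
        print(label, coverage_breadth(profile), round(coverage_cv(profile), 2))
```

The first toy profile mimics the behaviour attributed to DOP-PCR (even depth, small covered fraction), the second the behaviour attributed to MDA (large covered fraction, strongly varying depth).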

Nevertheless, no truly novel natural product has been discovered to date using single-cell isolation combined with MDA-based whole genome amplification. Single-cell genomics has the potential to assemble the genomes of species that are present at low frequencies in metagenomic samples [167], as well as to produce genome assemblies of completely uncharacterised microorganisms. In conclusion, this technology holds great promise, and single-cell genomic sequencing might advance the field of natural product research; however, the downstream limitations of the metagenomic workflow remain challenging for the discovery of novel compounds.

#### *3.6. Conclusions and Future Considerations*

Since metagenomics is a relatively new technology, further advances are necessary to overcome major inherent bottlenecks not only for functional but also for sequence-guided approaches. Brady and co-workers uncovered novel natural products such as the calcium-dependent antibiotics malacidins A and B or the epoxyketone proteasome inhibitors clarepoxcins and landepoxcins via metagenomic approaches. In the field of marine natural product research, metagenomic approaches have been used to identify the biosynthetic pathways of pharmaceutically relevant compounds. Nevertheless, neither approach could have been conducted successfully without fundamental biosynthetic knowledge. Despite the successful discovery of novel natural products, some key limitations remain to be overcome in the future [168]. One of the main bottlenecks of metagenomics is the isolation of DNA from soil samples for the generation of metagenomic libraries, since the samples are inhabited by a great variety of microorganisms, and collecting samples containing eDNA that harbours novel biosynthetic pathways is time consuming and requires fundamental ecological background knowledge. Substantial improvement has already been achieved in the extraction of high molecular weight DNA and the elimination of environmental inhibitors by synchronous coefficient of drag alteration (SCODA) [169,170] and by indirect DNA extraction through microbial cell separation and formamide treatment [171]. Another limitation arises from the heterogeneous origin of metagenomic libraries, which prevents the taxonomic correlation of a biosynthetic gene cluster or bioactivity with the corresponding microorganism. In the long term, managing the size and complexity of metagenomic datasets will require automated genome mining tools and algorithms based on pattern recognition [172]. These data processing obstacles might nevertheless be overcome in the near future, since the field of bioinformatics is evolving rapidly; the final heterologous expression of (reassembled) BGCs therefore remains the major limitation. Very often, promising biosynthetic gene clusters that are functional in the natural producers remain silent in the heterologous host. Different codon usage, rare tRNAs, tightly regulated promoters, toxicity and stability are only a few of the diverse obstacles. Expressing BGCs from underrepresented bacterial taxa is still challenging, since even so-called broad-host heterologous expression platforms are not always capable of producing the corresponding natural products. Therefore, further progress in understanding the biosynthetic logic of assembly lines found by metagenome mining, including ways to achieve heterologous expression, will help to overcome this inherent bottleneck.

#### **4. Genome Mining: Current Reality and Future Promise of the Post-Genomic Era**

With the first whole genome sequence of the model microorganism *Streptomyces coelicolor* A3(2) [13] published in 2002, it became evident that little was actually known about the secondary metabolism of this supposedly well-studied strain. At that time, *S. coelicolor* was known to produce actinorhodin, methylenomycin, calcium-dependent antibiotic and undecylprodigiosin [173–176]. The last of these compounds had already been discovered in 1985. The development of antiSMASH [177] in 2011, by now the most commonly used tool for the automatic genomic identification and analysis of biosynthetic gene clusters [121,178], allowed the prediction of additional BGCs in the genome of *S. coelicolor*, indicating a potential for the production of hitherto unseen metabolites. This section therefore gives an overview of how biosynthetic gene clusters can be accessed for the discovery of novel natural products. Besides the prediction of biosynthetic gene clusters and possible read-outs of *in silico* analysis, the impact of genetically minimised strains for heterologous expression and different approaches to target silent gene clusters will be discussed in detail.

#### *4.1. Biosynthetic Gene Cluster Prediction and Targeted Activation of BGCs*

Every experimental method for the assessment of a biosynthetic gene cluster relies on *in silico* analysis of the bacterial genome. NRPS precursors can be predicted by NRPSpredictor2 [179] or the Stachelhaus code [180], and modular polyketide synthase assembly lines by their domain organisation and predicted substrate specificity [181]; both algorithms are integrated into antiSMASH. Bioinformatic prediction of not yet seen natural product structures based on biosynthetic genes is a powerful tool for follow-up analysis, such as genome–metabolome correlation. However, structural prediction of natural products based solely on *in silico* data is still not reliable. Targeted activation or knockout of BGCs requires bioinformatic analysis to prioritise the selection of the BGC. Natural products with structures from new compound families are favoured; therefore, online tools that correlate biosynthetic gene clusters of NRPS and RiPPs with analytically acquired data are being developed [182,183], as described in more detail in Section 5.2. BGCs are also being analysed for resistance genes located close to the biosynthetic genes that could provide information about the mode of action of the associated natural product. Such predictions are available through the online tool PRISM3 [184,185] and the recently developed Antibiotic Resistant Target Seeker (ARTS) [186]. Cases of resistance genes located within the biosynthetic gene cluster have been reported, for example for oxytetracycline, with the *otrA* gene encoding a protein resembling the elongation factors EF-Tu and EF-G that protects the ribosomes of the producer strain [187], and for griselimycin, with the *griR* resistance gene encoding an additional copy of the DNA polymerase III beta subunit [188]. Therefore, the prioritisation of BGCs according to the resistance genes found in the gene cluster shows great promise, specifically for identifying novel antibiotics. Besides these specialised self-resistance conferring genes, there are more general resistance genes located within BGCs; during the heterologous expression of bottromycins in *S. coelicolor*, the replacement of the native promoter in front of *botT*, the gene encoding the respective efflux pump, by the strong *ermE*\* promoter resulted in a 20-fold increase in production compared to the natively expressed resistance gene [189].

#### *4.2. Utilising the Complexity of the Biosynthetic Machinery for the Discovery of Novel Natural Products*

Genetic modification of strains that produce interesting natural products is often challenging, as they harbour several BGCs and manipulation often affects more than one metabolic pathway. In some cases, knocking out one biosynthetic gene of a known compound in a producer strain can therefore enhance the biosynthesis of other natural products encoded in the genome. Ten newly discovered pentangular polyphenols, namely amexanthomycins A–J [190], were only found after knocking out the *rifA* PKS gene responsible for rifamycin biosynthesis in *Amycolatopsis mediterranei* S699 (Figure 9).

In addition, transcription regulators encoded near the biosynthetic gene clusters of known natural products have a great influence on secondary metabolite expression [191–193]. Some act as activators and some as repressors, and exchange of their natural promoters with constitutive or inducible promoters can lead to overexpression or inhibition of the production of the targeted BGC. For example, the null mutation of the *tetR* repressor within two silent BGCs in *Streptomyces* sp. PGA64 and *S. ambofaciens* culminated in the isolation of the novel angucyclinone metabolite UVM6 [194] and the previously described kinamycins [195], whereas the production of gaburedin A was induced through the inactivation of the repressor *gbnR* in *S. venezuelae* [196]. In contrast, the discovery of stambomycins from *S. ambofaciens* was achieved by constitutive expression of a LAL family regulator acting as an activator of stambomycin biosynthesis [197]. Exchange of the native promoter upstream of the first biosynthetic gene in the operon with a tightly regulated inducible promoter provides control over the biosynthetic genes: without induction no metabolite can be detected, comparable to a knockout mutant, but addition of inducer promotes metabolite overexpression [199]. Strong constitutive promoters can be used in the same position; for example, the polycyclic tetramate macrolactam 6-*epi*-alteramide A was obtained by introducing the strong *ermE*\* promoter in front of the hybrid type I PKS-NRPS operon [198] (Figure 9). For the above-mentioned promoter exchange strategies, a decent understanding of operon structures is required, in combination with comprehensive metabolic profiling using accurate detection systems such as high-resolution mass spectrometry to detect low concentrations and the corresponding target masses. Currently used native, modified and synthetic promoters were summarised by Rebets et al. [191]. As promoters might be necessary for yield improvement, research groups are working on expanding promoter libraries and have already reported the successful generation of 56 synthetic promoters. Three of these promoters of different strengths are active in multiple actinobacterial strains [200], and 38 synthetic promoters ranging from weak to strong are active in several *Streptomyces* species [201]. These findings open the potential for much better control of genetic regulation in the future. This is important, as the highest expression often does not go hand in hand with an increase in product yield, owing to limitations in the translational machinery or self-resistance, to name just a few.

Considering this observation, the generation of strains with minimised genomes is also crucial to obtain a heterologous expression platform in which gene functions can be studied in detail and metabolic changes can easily be correlated with specific manipulations of the strain. For actinomycetes, several strains with minimised genomes are available, such as *S. albus* Del1, *S. albus* J1074 and *S. avermitilis* SUKA17 [202,203]. Unnecessary genomic parts, such as insertion sequence (IS) elements and secondary metabolite gene clusters, are removed to increase genetic stability and reduce the metabolic burden. In *S. coelicolor* M1152, four native biosynthetic gene clusters were deleted and additional point mutations were introduced into the *rpoB* and *rpsL* genes to enhance transcription and translation, respectively; depending on the heterologous cluster type, the resulting strain produced 20–40 times more of the compounds under study than the parental *S. coelicolor* M145 [204]. Once such platforms with minimised genomes have been implemented successfully, they can be used for the expression of silent BGCs from various strains and for the heterologous expression of BGCs found with metagenomic platforms.

#### *4.3. Silent BGC Activation by Chemical Elicitors, Ribosome Engineering and Chromatin Remodelling*

BGCs that are found in a genome sequence but have no detectable natural product, because only small concentrations are produced under certain growth conditions, probably owing to negative regulation, are termed silent or cryptic biosynthetic gene clusters [205,206]. Two strategies are currently applied to access "silent" BGCs. They are targeted either by empirical approaches, namely optimisation of growth conditions [206], addition of chemical elicitors [106,207] or trace metal ions [208], provision of exogenous small molecules to the producer strain [205] and ribosome engineering [209–211], or by targeted approaches based on the genome sequence [191], some of which mainly focus on regulatory genes [193]. Empirical variation of growth conditions such as temperature, pH, co-cultivation or addition of chemical elicitors can induce BGC expression. For example, the natural product goadsporin promotes secondary metabolism and morphogenesis in *Streptomyces* [212]; subinhibitory concentrations of trimethoprim activated the expression of the malleilactone BGC from *Burkholderia thailandensis*, which is silent under standard laboratory conditions [213]; and the siderophore desferrioxamine E produced by *S. griseus* stimulates growth and development of *S. tanashiensis* in co-cultivation experiments [214]. In 2016, a Canadian compound collection of 30,569 small molecules was screened for the ability to alter the pigmentation of *S. coelicolor* colonies, which is usually linked with the production of different secondary metabolites, during growth on solid medium. A series of compounds referred to as ARC2 served as potent general elicitors inducing the production of cryptic metabolites [215]. The derivative Cl-ARC was used on 50 different actinomycete strains; 10 μM Cl-ARC was added to five solid growth media per strain and the bacteria were grown for seven days at 30 °C. *n*-Butanol extracts of controls and Cl-ARC-treated bacteria were subjected to comparative LC-MS analysis. At least 23% of BGCs were activated, and three rare secondary metabolites that showed activity against bacteria and/or eukaryotes were identified [216]. However, follow-up studies showing that novel metabolites are indeed produced at larger scale have yet to be reported.
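
The comparative LC-MS analysis of treated and control extracts mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure of [216]: it assumes that features have already been picked and are given as (*m*/*z*, retention time, intensity) triples, and the tolerances and the fold-change threshold are hypothetical defaults.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    mz: float         # mass-to-charge ratio
    rt: float         # retention time in minutes
    intensity: float  # peak area or height


def induced_features(
    treated: List[Feature],
    control: List[Feature],
    ppm: float = 10.0,
    rt_tol: float = 0.2,
    min_fold: float = 5.0,
) -> List[Feature]:
    """Return features of the treated extract that are absent from the control
    or at least `min_fold` times more intense than the matching control feature."""
    induced = []
    for feature in treated:
        matches = [
            c for c in control
            if abs(c.mz - feature.mz) / feature.mz * 1e6 <= ppm
            and abs(c.rt - feature.rt) <= rt_tol
        ]
        baseline = max((c.intensity for c in matches), default=0.0)
        if baseline == 0.0 or feature.intensity / baseline >= min_fold:
            induced.append(feature)
    return induced
```

Features flagged in this way are only candidates for elicitor-induced metabolites; they still need to be dereplicated and, ideally, linked to a BGC.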

**Figure 9.** Generic representation of a BGC consisting of core biosynthetic genes, genes for tailoring enzymes (dashed line), positive and negative regulators and self-resistance conferring genes. The strategies can be employed in the native or a heterologous host. (**A**) A promoter is either inserted in front of a positive regulator, or an additional copy of the positive regulator with a promoter is inserted into the genome; consequently, a higher concentration of natural product is expected. The macrolide stambomycin A was obtained through constitutive expression of an LAL family regulator [197]. (**B**) A promoter is inserted in front of the operon of biosynthetic genes, enhancing transcription and production of the natural product. The polycyclic tetramate macrolactam 6-*epi*-alteramide A was obtained by introducing the *ermE*\* promoter in front of the hybrid type I PKS-NRPS operon [198]. (**C**) The repressor gene is disrupted and production of the natural product is enhanced. Inactivation of the repressor *gbnR* in *S. venezuelae* induced the production of gaburedin A [196]. (**D**) A resistance gene protects the producer strain against its own natural product. During the heterologous expression of bottromycins in *S. coelicolor*, the replacement of the native promoter in front of the efflux pump gene *botT* by the strong *ermE*\* promoter resulted in a 20-fold increase in production compared to the natively expressed resistance gene [189]. (**E**) In some cases, knocking out the biosynthetic genes of known compounds in producer strains enhances the biosynthesis of other natural products encoded in the genome. Amexanthomycin A was found after knocking out the PKS gene *rifA* responsible for rifamycin biosynthesis in *Amycolatopsis mediterranei* S699 [190]. (**F**) Targeted induction of mutations in RNA polymerase and ribosomal proteins by antibiotics can cause upregulation of BGC expression. Mutations in the *rpsL* and *rpoB* genes activated the silent BGC of piperidamycins from *S. mauvecolor*, culminating in the isolation of the antibacterial piperidamycin A [209].

Targeted induction of mutations in RNA polymerase and ribosomal proteins by the antibiotics rifamycin, streptomycin and gentamicin can cause upregulation of BGC expression. Mutations in the *rpsL* and *rpoB* genes activated the silent BGC of piperidamycins from *S. mauvecolor* [209] and have been shown to awaken cryptic BGCs [210] (Figure 9). Chromatin remodelling approaches were first used on fungi, but interestingly they are also applicable to *Streptomyces* species [106,205], as bacterial genomes are presumably compacted by nucleoid-associated proteins, RNAs and differential supercoiling, leading to a compaction of certain genes comparable to that in eukaryotic organisms [217]. Cryptic BGCs could be located in tightly packed, heterochromatin-like regions; chromatin remodelling or epigenetic modification through the addition of DNA methyltransferase (DNMT) and histone deacetylase (HDAC) inhibitors can therefore influence their expression [218]. To date, no new natural products have been found by applying these approaches, but differences in biosynthetic gene expression were observed, even though in the majority of cases expression was found to be reduced.

#### *4.4. Conclusions*

The majority of the above-mentioned empirical methods and targeted approaches have yielded examples of the identification of new natural products. The development of bioinformatic prediction tools has improved genome mining and the prioritisation of BGCs; however, there is a clear lack of standardisation. New genomes are being sequenced and prediction tools are being developed that provide a useful platform, although they lack the ability to perfectly predict compound structures. Even though biosynthetic gene cluster prediction algorithms are available, *in vivo* or *in vitro* confirmation remains the only way forward. Deciphering biosynthetic genes is still not routine, so much remains to be learned about biosynthetic pathways as well as about BGC regulation in actinomycetes. However, the field now has a well-established set of genetic tools that will greatly facilitate future work. Accurate analytical methods are of great importance, as comparative metabolic profiling is necessary to detect differences between wild-type strains and their mutants.

#### **5. Metabolomics for the Discovery of New Antibiotics Produced by Actinomycetes**

In the pre-genomic era of antibiotic discovery, most of the identified molecules were isolated by a classical "top-down" approach, which implies that they were produced in sufficient amounts for their subsequent detection [219]. Since this conventional approach increasingly failed because of the frequent rediscovery of known metabolites, new approaches were required to systematically access novel antibiotics [220]. The increasing availability of genomic data for producers of natural products led to the recognition of the hidden potential of microbes to produce an even greater variety of secondary metabolites than originally expected, and this notion changed the overall strategy for antibiotic discovery [211,221–224]. Genome mining and metagenomic analysis were added to the portfolio of methods for the prioritisation of newly isolated microbial strains, as well as for realising the previously unseen biosynthetic potential of already analysed strains [85,225]. As powerful as these *in silico* strategies are, they can only succeed when combined with analytical chemistry techniques, particularly high-resolution mass spectrometry and NMR (nuclear magnetic resonance spectroscopy) for structure elucidation of new molecules [226]. To access the full biosynthetic potential of microbes, in particular actinomycetes, as antibiotic producers, metabolomics is one of the most important tools owing to its capability to systematically assess all primary and secondary metabolites in a biological sample [227,228]. The use of hyphenated techniques and the extension of the scope of already existing technologies, in combination with advanced data evaluation, set the stage to enlarge the window of observable metabolites. In this part of the review, we therefore highlight advances in the field of analytical chemistry including imaging mass spectrometry (IMS), liquid chromatography coupled to nuclear magnetic resonance spectroscopy (LC-NMR) and supercritical fluid chromatography (SFC). Molecules detected with IMS in the past 10 years are summarised and differences between ionisation methods are compared. LC-NMR and SFC are briefly outlined, as they are to date underexploited technologies in the field of actinomycete compound discovery that have already shown promising results in other natural product applications. Moreover, an overview is given of conceptual improvements in natural product dereplication using metabolomic data during the last 10 years.

#### *5.1. Innovations in Analytical Instrumentation for Natural Product Discovery*

#### 5.1.1. Imaging Mass Spectrometry (IMS)

The introduction of ultra-high performance liquid chromatography (UHPLC) in combination with high-resolution MS detectors such as time-of-flight mass spectrometry (TOF-MS) remains one of the breakthroughs for natural product discovery [229]. This combination has since provided rapid and robust analysis of the chemical composition of microbial extracts [16,230]. One of the main limitations of this approach is that extracts for LC-MS measurement are prepared from a mixture of cells in different growth states, so that chemotypes cannot be connected to phenotypes [231]. Among the Gram-positive bacteria, actinobacteria show the greatest morphological differentiation. They are able to form complex structures such as spore chains, sporangia or substrate mycelia with long, branching hyphae [232]. The impact of specific phenotypes on antibiotic production was strikingly shown by Onaka et al. [233] for *Streptomyces endus* S-522: mycolic acid-containing bacteria such as *Tsukamurella pulmonis* stimulate the antibiotic production of other actinomycetes when grown together on agar plates. Following this co-cultivation approach, Onaka et al. [233] were able to isolate alchivemycin A, a previously unknown antibiotic. Modern mass spectrometry ionisation techniques such as nanospray desorption electrospray ionisation (NanoDESI), matrix-assisted laser desorption/ionisation time-of-flight (MALDI-TOF) and secondary ion mass spectrometry (SIMS) imaging opened a way to further investigate the precise mechanism of these cell-to-cell contact interactions [234,235]. Even though IMS is nowadays mostly used for the discovery of disease-related biomarkers, it has also proven useful for the detection of natural products. Figure 10 gives an overview of anti-infectives produced by actinomycetes that were detected by imaging mass spectrometry in the past 10 years.

Interestingly, MALDI was the ionisation method of choice in most of the experiments, even though the use of a matrix to cover the sample and absorb the laser energy entails more time-consuming sample preparation [235]. A reason for this might be that MALDI is currently the most widely accessible IMS technique and provides soft ionisation. This, in turn, enables high-resolution mapping of ions within a sample without destroying sensitive molecules [235,236]. NanoDESI is also one of the so-called soft ionisation methods, but it is a rather new technique compared to MALDI. Desorption electrospray ionisation (DESI) is based on a two-capillary system that builds a small bridge of charged solvent on the sample surface, which takes up the analytes and directs them into the atmospheric inlet of a mass spectrometer [236]. Compared to the quite robust MALDI, DESI is accompanied by great technical challenges, since obtaining an ideal analyte signal relies on a large variety of geometrical and instrumental parameters [237]. Even though SIMS is the oldest of the three IMS techniques compared here, it plays only a subordinate role in natural product discovery. This is very likely because a high degree of optimisation is needed to record useful data. Furthermore, measurement of intact natural products is often not possible because of the requirement for a direct ion beam that is focused on the sample surface [235].

How IMS can be successfully implemented for natural product discovery was shown by Kersten et al. [238] through their application of a "peptidogenomic" workflow. MALDI-TOF MS and subsequent MS<sup>n</sup> sequence tagging were used to match the mass spectrometry data in an iterative approach to the genome-derived peptide structures generated with NRP prediction tools. Limiting the *m*/*z* range to 1500–5000, they were able to detect nine novel lasso peptides from seven genome-sequenced *Streptomyces* strains, as well as the already known stendomycin produced by *S. hygroscopicus* ATCC 53653 [238].
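
The core idea of matching observed masses to genome-derived peptide predictions can be illustrated with a minimal Python sketch. It is not the workflow of Kersten et al.: it only compares a measured *m*/*z* value with the theoretical monoisotopic mass of candidate core peptides, without MS<sup>n</sup> sequence tagging. The function names, the 10 ppm tolerance and the `water_losses` parameter (one water per dehydration or macrolactam ring, as an assumption about the expected modifications) are illustrative.

```python
from typing import Dict, List, Tuple

# Monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mono_mass(sequence: str, water_losses: int = 0) -> float:
    """Monoisotopic mass of a linear peptide minus `water_losses` waters."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER - water_losses * WATER


def mz(mass: float, charge: int = 1) -> float:
    """m/z of the protonated species [M + zH]z+."""
    return (mass + charge * PROTON) / charge


def match_candidates(
    observed_mz: float,
    candidates: Dict[str, str],
    tol_ppm: float = 10.0,
    charge: int = 1,
    water_losses: int = 0,
) -> List[Tuple[str, float]]:
    """Return (name, ppm error) for candidate peptides within the tolerance."""
    hits = []
    for name, sequence in candidates.items():
        theoretical = mz(peptide_mono_mass(sequence, water_losses), charge)
        error_ppm = (observed_mz - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= tol_ppm:
            hits.append((name, error_ppm))
    return hits
```

A mass match alone is weak evidence, which is why the published workflow combines it with fragment-derived sequence tags before a peptide is assigned to a gene cluster.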

IMS has proven its capability to detect various classes of known natural products produced by actinomycetes. It is an important tool for the direct observation of interspecies interactions and can be used for the detection of novel antibiotics. Nevertheless, most publications reporting the use of IMS focused on known metabolites and even though unassigned masses have been detected, the corresponding compounds still await their isolation and structure elucidation.

**Figure 10.** Natural products produced by actinomycetes that were detected and annotated by IMS. Fifteen known bioactive compounds were identified in different co-cultivation experiments. Only methylenomycin [239] had been detected exclusively by SIMS, whereas all molecules identified with NanoDESI were also detectable by MALDI-TOF. Undecylprodigiosin [239,240] and actinorhodin [239,240] could be identified by all three methods. Prodiginine [240], coelichelin [240], different desferrioxamine derivatives [240], surfactin and plipastatin [241,242] could be assigned by NanoDESI and MALDI-TOF. Chalcomycin [243], daptomycin [243], actinomycin D [244], efomycin G [244], elaiochelin [244], linearmycin [243,245] and arylomycin [231] were only detected by MALDI-TOF.

#### 5.1.2. Liquid Chromatography Coupled to Nuclear Magnetic Resonance Spectroscopy (LC-NMR)

MS and NMR share first place as the most important technologies for the identification and structure elucidation of new natural products [246]. Whilst NMR is commonly used for pre-purified samples, its hyphenation with liquid chromatography expands its application towards the screening stage in natural product discovery. Compared to the combination of liquid chromatography with mass spectrometry (LC-MS), LC-NMR is limited by lower sensitivity but can provide important structural information on-line that is not accessible by MS. It has been proposed for many years that a combination of NMR and LC (hence LC-NMR) could make structural information available earlier in the discovery workflow than classical procedures, which use various steps of fractionation, purification and subsequent structure elucidation by different spectroscopic methods (with NMR usually being the final step) [247]. Nevertheless, only the implementation of flow-through probes such as SPE-NMR or CapNMR and technical advances in NMR spectrometer field strength in the last decades finally paved the way for LC-NMR technology into natural product laboratories [248]. LC-NMR has already proven its value for the dereplication of plant extracts for various compound classes such as alkaloids [249,250], phenolic compounds [251,252], isoflavones [253], flavonoids [254] and lignans [255]. In 2011, Johansen et al. [256] described the characterisation of 15 compounds, four of which had not been identified previously, in the extract of the safflower *Carthamus oxyacantha*. They used a combined HPLC-PDA-HRMS-SPE-NMR approach that allows peaks observed by PDA-HRMS to be preselected and trapped on an SPE cartridge for subsequent NMR analysis. Subsequent microfractionation in 96-well plates for bioactivity assays, as described by Lang et al. [257], would complete a fully automated screening process with maximum information about the composition of extracts. Consequently, Lang et al. classified LC-NMR as an evolving trend in the dereplication of fungal extracts.

For bacterial extracts, the identification and structure elucidation of six linear peptides (macyranones A–F) from a myxobacterial extract was performed by Keller et al. [258] using LC-SPE-NMR-MS techniques in 2015. Furthermore, Lin et al. [259] successfully used an LC-MS-NMR platform to identify four known compounds from a single LC injection of a cyanobacterial extract and additionally identified one new bioactive compound in the extract. With regard to actinobacteria, LC-NMR has apparently been used only for the structure elucidation of four new polyketides (granaticin C and metenaticin A–C) from a *Streptomyces violaceoruber* extract by Pham et al. [260] back in 2005. The observation that LC-NMR does not seem to have been used frequently for the isolation and direct structure elucidation of natural products from microbial extracts in the past 10 years may be explained, at least in part, by the fact that most compounds produced in sufficient amounts for identification by NMR, which is much less sensitive than MS, are already known and are therefore not a target for structure elucidation. Nevertheless, for the investigation of newly isolated strains from new genera, species and families that are very likely to produce chemically diverse unknown metabolites, LC-NMR should be considered a feasible screening strategy [15].

#### 5.1.3. Supercritical Fluid Chromatography (SFC)

Liquid and gas chromatography are irreplaceable tools for the separation of highly complex bacterial crude extracts [261–263]. Another evolving analytical separation technique that has already proven its impact in phytochemistry is supercritical fluid chromatography (SFC) [264]. SFC uses supercritical fluids, most commonly CO2, as the mobile phase. In this state, the fluids uniquely combine two of the most desirable features of a mobile phase: high dissolving capabilities together with densities comparable to those of a liquid. As an orthogonal method to liquid chromatography, it offers the possibility to separate metabolites that are not separable by normal-phase or reversed-phase liquid chromatography [264–267]. The use of co-solvents such as alcohols extends the spectrum of compound mixtures accessible by SFC. Almost all stationary phases used in HPLC are also suitable for SFC, and diverse detectors such as diode array detector (DAD), evaporative light scattering detector (ELSD), MS, DC or charged aerosol detector (CAD) can be fitted to the instrument [268,269]. For plant extracts, SFC has already been used for the isolation of nonpolar compounds such as terpenes [270] or flavonoids [271], but it has also proven to be a useful tool for the isolation of more polar natural products such as carnosic acid from rosemary extracts [272] and valerenic acid from *Valeriana officinalis* [273]. However, to our knowledge, SFC has never been used for the discovery of new antibiotics from actinomycetes. Considering the wide range of natural products for which it has already been used, it will probably not take long until the first research groups successfully employ SFC for the separation of microbial natural products.

#### *5.2. Dereplication Using Metabolomic Data*

Metabolomics is the systematic acquisition of small molecules produced by a biological system at a certain point in time. Increasing resolution of the data acquired by state-of-the-art technologies produces an expanding volume of data hereby [274]. To evaluate these data in the best way, dereplication using bioinformatics tools is indispensable [275]. First introduced by Beutler et al. [276] back in 1990 as "a process of quickly identifying known chemotypes" used for the identification of compounds from bacterial extracts responsible for a biological activity in a simple phorboldibutyrate (PDBu) receptor binding assay, the term "dereplication" today covers various strategies and different fields of research from pharmacology, chemistry, plant sciences to biotechnology and food science [226]. One of the main issues in the field of natural product antibiotic discovery is the high rediscovery rate, which can only be limited by careful dereplication. Chanana et al. [277] made the statement, that it is more likely to identify a known metabolite from actinobacteria than finding a new one which is certainly true as many academic and industrial groups have worked on actinomycetes for decades. This group used comprehensive principal component analysis (PCA) for the prioritisation of most interesting bacterial strains and molecules and were therefore able to isolate two novel natural products from an *Actinomadura* sp. (strain WMMB-449) and patented another one [277]. For the prioritisation of bacterial extracts showing promising antibacterial activities it is crucial to sort out extracts containing known metabolites possibly responsible for the observed activity [278]. In-house databases—containing as many chromatographic data from known analytes as possible—provide the advantage of being perfectly matched to the conditions used for the sample of interest, but are difficult to keep up-to-date due to the requirement of continuous manual curation. 
Commercially available libraries such as Dictionary of Natural Products [279], AntiBase [280], MarinLit [281] and MS databases such as MassBank [282], Metlin [283], mzCloud [284] and ReSpect [285] certainly are useful sources, but are either not freely available, do not contain MS data or do not allow customised use of its reference library. In addition, connectivity between these databases is rather poor, leading to an unpredictable degree of overlap. Dereplication by database search is therefore a cumbersome and time-consuming procedure and last but not least an expensive one [286]. A better connection between already existing libraries, lowering the access barrier and a more intense data exchange between groups working in the field of natural products is strongly required. An important first step in the direction of better data exchange is Global Natural Products Social molecular networking (GNPS), an open-access knowledge platform for sharing tandem mass spectrometry data [287]. In the course of only a few years, GNPS has already proven its high impact for the natural product community. Crüsemann et al. [288] for example used GNPS for the analysis of 146 marine *Salinispora* and *Streptomyces* strains. They were able to identify 15 molecular families of diverse natural products and showed that GNPS is in principle ready for increased throughput screening thereby [288]. However, the simple comparison of MS/MS spectra with spectra deposited in the GNPS is not yet sufficient for a comprehensive analysis of secondary metabolome data. Algorithms such as DEREPLICATOR [289] or the just recently published VarQuest [290] were designed to expand the library search to identify variants of known peptidic natural products. The complete workflow from genome sequence to the putative biosynthetic gene cluster and its product with the associated spectra would ideally be done in a one-step procedure for high-throughput data analysis. 
Several tools have taken on this challenge and specifically address the connection of genes with secondary metabolites. Table 1 is a comparison of six different bioinformatic software packages comprising a structure prediction tool for chemical products derived from biosynthetic gene clusters: PRISM 3, SeMPI, antiSMASH 4.0, Pep2Path, RiPPquest and NRPquest. As some of them are designed for much broader analysis of genomic data, we would like to point out that they are henceforth just compared concerning structure prediction. Tools with more than one software release were summarised with functions available in the newest version to date.
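The spectral comparison underlying library matching and molecular networking, as discussed above, can be illustrated with a plain cosine score between two MS/MS spectra. The following is a minimal sketch only: the function names, the fixed-width binning and the `bin_width` default are assumptions made for illustration, and it is not the scoring implemented by GNPS, DEREPLICATOR or VarQuest.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins.

    bin_width is a hypothetical tolerance chosen for illustration.
    """
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(peaks_a, peaks_b, bin_width=0.01):
    """Plain cosine similarity between two binned spectra (0 = no shared bins)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A score close to 1 suggests that two spectra share most of their fragment ions. Fixed-width binning can split peaks that lie near a bin edge, so a tolerance-based peak matching would be preferable for real data.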

**Table 1.** Comparison of six different bioinformatic tools for structure prediction of chemical products derived from biosynthetic gene clusters: PRISM 3, SeMPI, antiSMASH 4.0, Pep2Path, RiPPquest and NRPquest. Tools with more than one software release were summarised with functions available in the newest version to date.


<sup>1</sup> No matching with experimental MS/MS data; <sup>2</sup> Limited to peptide natural products; <sup>3</sup> Limited to RiPPs; <sup>4</sup> Limited to NRPs.

antiSMASH 4.0 is presumably one of the prediction tools offering the broadest range of *in silico* analyses. Biosynthetic gene clusters are predicted using hidden Markov models and manually curated BLAST databases for the identification of biosynthetic domains involved in secondary metabolite production. Subsequently, the domains are arranged into clusters and the chemical structures of the products are predicted. The hits are finally compared with a library of known natural products [122,295]. PRISM 3 follows a similar concept. However, recent software releases of antiSMASH focused more on improving the evaluation of genomic data and on making prediction accessible for additional cluster types such as terpenes [121,122,292]. Therefore, PRISM 3 remains the most promising bioinformatics tool for the prediction of tailoring reactions on NRPS and PKS products [184,185]. Other bioinformatic tools such as Pep2Path, RiPPquest and NRPquest are limited to peptidic natural products [183,293,294]. Peptidic natural products (PNPs), however, feature characteristic MS/MS fragmentation patterns and usually reliable ionisation behaviour [293].

MS/MS metabolomic data can be used to complement the theoretical *in silico* prediction of chemical products produced by biosynthetic gene clusters (BGCs). In this way, Liu et al. [296] were able to identify the stenothricin biosynthetic gene cluster in *Streptomyces roseosporus* in 2014. Peptidogenomics used in this manner is a powerful tool for fully automated high-throughput screening of bacterial extracts using both experimental and *in silico* data. Another elegant way to correlate natural products with their corresponding BGCs is to use the tools mentioned above to predict specific physicochemical properties. The macrolactam salinilactam isolated from *Salinispora tropica* could be identified in a crude extract by searching for the characteristic UV absorption of polyenes [297]. Since detection by UV absorption is not very specific when applied alone, the rediscovery rate with such an approach is high. Therefore, MS-based methods that identify chlorine, bromine, fluorine and sulphur in a molecule by their characteristic isotopic pattern in combination with the exact molecular mass are more specific for the identification of secondary metabolites. In particular, this method has proven useful for the identification of related compounds, as exemplified by the isolation of the thioholgamides from a crude extract of *Streptomyces malaysiense* MUSC 136 57. A combination of searching for BGCs related to that of thioviridamide and prioritising potential products by their characteristic isotopic pattern revealed these natural products with promising cytotoxic activity [298]. Additionally, feeding experiments with isotopically labelled precursors are useful to narrow down potential products of the BGC present in the crude extract and provide further information about the biosynthesis of the secondary metabolite.
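The isotopic-pattern argument made above can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions: it considers only the chlorine and bromine isotopes, uses approximate natural abundances, and ignores contributions from 13C, 34S and other elements. The function name and parameters are hypothetical.

```python
from math import comb

# Approximate natural abundances of the heavier isotopes (assumed values).
CL37_ABUNDANCE = 0.2424  # 37Cl; the remainder is 35Cl
BR81_ABUNDANCE = 0.4931  # 81Br; the remainder is 79Br


def halogen_pattern(n_cl=0, n_br=0):
    """Relative intensities of the M, M+2, M+4, ... peaks for n_cl Cl and n_br Br atoms.

    Only the halogen isotopes are considered; intensities are scaled to the base peak.
    """
    def binomial(n, p):
        # Probability of finding k heavy isotopes among n atoms.
        return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

    cl = binomial(n_cl, CL37_ABUNDANCE)
    br = binomial(n_br, BR81_ABUNDANCE)
    pattern = [0.0] * (n_cl + n_br + 1)
    for i, p_cl in enumerate(cl):
        for j, p_br in enumerate(br):
            pattern[i + j] += p_cl * p_br  # each heavy atom shifts the peak by +2
    base = max(pattern)
    return [value / base for value in pattern]
```

For a dichlorinated metabolite, `halogen_pattern(n_cl=2)` gives roughly 100 : 64 : 10 for M : M+2 : M+4, the familiar signature that allows such compounds to be flagged in a crude extract before isolation.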

#### *5.3. Conclusions*

The widespread availability of high-resolution mass spectrometry in the past two decades has been a milestone for analytical chemistry not only for actinobacterial extracts. Especially in combination with HPLC this technology provides the opportunity for rapid and robust analysis of the bacterial metabolome. This advance increased the quality as well as the quantity of data acquired on a daily basis. The focus of metabolomics-driven microbial natural product research therefore was on extending the scope of molecules accessible to high resolution MS and subsequent data analysis including dereplication. IMS, LC-NMR and SFC are recent technical improvements useful for the characterisation of bacterial extracts. Although most of the modern analytical systems are commercially available, the application of most of the presented techniques was focused on one specific research field such as phytochemistry for SFC. Mass spectrometry and NMR still form the stable foundation for metabolomics of microorganisms. Their sphere of influence has already been extended by hyphenated techniques such as LC-NMR and will be certainly further increased by the combination with underrepresented techniques such as SFC in the coming years. The rising quality and quantity of metabolomic data obtained with hyphenated techniques in the past decades increased the importance of bioinformatics tools to organise and prioritise the data. There are various tools published, that either can be used for library matching or even for structure formula prediction. One big issue in natural product research is data exchange within the different research groups to hinder rediscovery of already known compounds. The establishment of the GNPS that allows sharing MS2-data of natural products and other raw data is a good start towards a better exchange within the community. 
Nevertheless, as much as technical improvements can only unfold their full potential when combined with bioinformatic tools, they are more powerful, when genetic information is included in the search for new antibiotics from actinobacteria.

#### **6. Summary and Conclusions**

Actinomycetes are still a prolific source for natural products with intriguing bioactivities and inspiring chemical characteristics as exemplified by the new antimicrobial agents discovered in the past 10 years [164]. Considering that the majority of microbial secondary metabolite producers are not culturable under laboratory conditions yet, the potential for discovery of novel actinomycete metabolites is promising and far from exhausted [14]. However, as actinomycete research has been extensively performed in the past by academic and industrial groups, chances for the identification of novel chemistry may be higher in currently less studied microbial resources such as cyanobacteria or myxobacteria [299,300]. Natural product research is fundamentally dependent on the interplay of different disciplines of natural sciences. Particularly, microbiology for the isolation of new secondary metabolite producers, molecular biology for the characterisation of the biosynthetic machinery and analytical chemistry for identification and isolation of natural products form the basis for the characterisation of novel compounds from actinobacteria (Figure 11).

In the past years, the focus of natural product discovery from actinomycetes has shifted from the extensively investigated soil-dwelling isolates towards underexplored habitats of rare actinomycetes from unusual ecosystems. This strategy has been shown to have an impact on the discovery platform for novel compounds with promising bioactivities, despite the fact that a large part of the reservoir of habitats still awaits exploration [301]. The full potential of the natural product "treasure trove" of environmental soil samples from exotic or conventional habitats is still underexplored, since present isolation procedures fail to access the uncultured microbial majority. Nevertheless, the "genomic revolution" has revealed that actinobacteria carry more biosynthetic gene clusters in their genomes than secondary metabolites found under standard laboratory conditions. This finding shows that harvesting novel strains for just a few compounds understates their immense potential. The simulation of environmental conditions to mimic essential aspects of the natural habitat extended the scope of microorganisms culturable under laboratory conditions [35]. Although technologies such as the iChip platform have shown great potential for increasing throughput and improving the isolation of previously uncultured bacteria, novel microbiology-based methods are still underrepresented in the natural product discovery pipeline. Genomics, a field that has seen tremendous development over the past 10 years, has in contrast rapidly triggered the establishment of several innovative methods to explore the diversity and distribution of BGCs encoding natural products at the genetic level [302]. To overcome the obstacles of conventional isolation procedures and access the uncultured microbial majority of rare actinomycetes, metagenomic approaches have been developed. With growing in-depth knowledge of biosynthetic gene organisation, affordable whole genome sequencing technologies and methods for predicting the resulting natural product structures, "silent" BGCs can be leveraged to yield novel compounds. BGCs from genetically non-manipulable or uncultured actinomycetes can be accessed via heterologous expression by capturing the BGC and refactoring the gene arrangement. The combination of methods from synthetic biology, biotechnology and genetics is thus set to expand Nature's chemical diversity through the production of "unnatural" natural products and the expression of naturally downregulated BGCs [303,304]. Despite the enormous impact of genomics, we must urgently remind ourselves that all sequence-based developments need to be supported by microbiology and analytical chemistry for the production and isolation of compounds. In the field of analytical chemistry, advances have been made in the development of bioinformatic tools for easier and faster dereplication.
Furthermore, applied analytical chemistry in the field of natural product research has to date focused on retro-biosynthetic logic to correlate secondary metabolites with their BGCs. Technologically, hyphenated techniques such as LC-NMR and separation techniques such as SFC have entered the stage more recently for the discovery of novel antibiotics but are not yet fully utilised in actinobacterial natural product research [247]. Furthermore, imaging mass spectrometry has proven its ability to directly provide deep insights into the connection between phenotypic traits and secondary metabolism, but upscaling remains a formidable challenge [242]. In this context, advances in microbiological cultivation strategies are required to utilise the full potential of this method for the discovery of antibiotics. This is just one of many examples where improvements in one of the three key disciplines in natural product research will have synergistic effects when efficiently combined with other methods. Only the combination of technologies and the exchange of knowledge between microbiology, molecular biology and analytical chemistry will help not only to scratch the surface of the untapped natural product reservoir but to comprehensively exploit its full potential.

**Figure 11.** Interplay of disciplines for the exploitation of the untapped natural product reservoir with major developments in the past 10 years covered by this review.

**Funding:** This research received no external funding.

**Acknowledgments:** We are thankful to Daniel Krug, Chengzhang Fu and Tadeja Lukezic for their scientific advice. J. Hug was supported by a PhD fellowship of the Boehringer Ingelheim Fonds.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Review*

### *Streptomyces* **Differentiation in Liquid Cultures as a Trigger of Secondary Metabolism**

#### **Ángel Manteca and Paula Yagüe \***

Área de Microbiología, Departamento de Biología Funcional IUOPA, Facultad de Medicina, Universidad de Oviedo, 33006 Oviedo, Spain; mantecaangel@uniovi.es

**\*** Correspondence: paula.yague@gmail.com; Tel.: +34-9-8510-3000 (ext. 5289)

Received: 26 February 2018; Accepted: 9 May 2018; Published: 14 May 2018

**Abstract:** *Streptomyces* is a diverse group of gram-positive microorganisms characterised by a complex developmental cycle. *Streptomycetes* produce a number of antibiotics and other bioactive compounds used in the clinic. Most screening campaigns looking for new bioactive molecules from actinomycetes have been performed empirically, e.g., without considering whether the bacteria are growing under the best developmental conditions for secondary metabolite production. These screening campaigns were extremely productive and discovered a number of new bioactive compounds during the so-called "golden age of antibiotics" (until the 1980s). However, at present, there is a worrying bottleneck in drug discovery, and new experimental approaches are needed to improve the screening of natural actinomycetes. *Streptomycetes* are still the most important natural source of antibiotics and other bioactive compounds. They harbour many cryptic secondary metabolite pathways not expressed under classical laboratory cultures. Here, we review the new strategies that are being explored to overcome current challenges in drug discovery. In particular, we focus on those aimed at improving the differentiation of the antibiotic-producing mycelium stage in the laboratory.

**Keywords:** *streptomyces*; screening; antibiotics; secondary metabolism; differentiation; elicitors; morphology; liquid cultures

#### **1. Introduction**

The *Streptomyces* genus includes an important group of biotechnological bacteria. They produce two-thirds of the antibiotics of medical and agricultural interest, several antitumor agents, antifungals, and a great number of eukaryotic cell differentiation effectors, such as apoptosis inducers and inhibitors [1]. Drug discovery from *streptomycetes* fell considerably after initial screenings where the most common compounds were discovered. Antibiotic resistance is increasing dramatically, and new antibiotics are urgently required in the clinic. Alternative methods, such as the exploration of chemical libraries and combinatorial chemistry, have provided limited yields. Screening from nature has resumed through methods such as exploring new environments, looking for elicitors, accessing the metagenome, etc.

One of the most important characteristics of *Streptomyces* is its complex life cycle, which is closely related to secondary metabolite production [2] (outlined in Figure 1). In solid sporulating cultures, development starts with spore germination and the rapid development of compartmentalised hyphae into the medium (early substrate mycelium or MI) [3]. After that, programmed cell death (PCD) occurs (red cellular segments in Figure 1) which triggers the differentiation of the multinucleated (MII) antibiotic-producing hyphae (late substrate mycelium, early MII) [3,4]. Then, the mycelium starts to grow into the air forming the aerial mycelium (late MII). At the end of the cycle, there is a second round of PCD, and most of the remaining viable hyphae undergo a process of compartmentalisation that culminates in the formation of unigenomic spores [5].

*Antibiotics* **2018**, *7*, 41

**Figure 1.** *Streptomyces* growth in solid cultures (**upper panels**) and liquid cultures (**lower panels**). In solid cultures (petri plates), spores germinate, developing a compartmentalised mycelium (early substrate mycelium, MI) with 1 μm average cross-membrane spacing [6]. Some of the MI cells undergo a first round of programmed cell death (PCD; red segments). The remaining viable segments start to grow as a multinucleated mycelium with sporadic septa (early MII, late substrate mycelium) [6]. The substrate mycelium undergoes a second round of PCD (red segments) and differentiates into a mycelium that starts to grow into the air (the medium/agar border is indicated by a brown line) (late MII, aerial mycelium). Part of the aerial hyphae form spore chains (black circles). In liquid cultures, there is germination, MI development, PCD (in the centre of the mycelial pellets) and MII differentiation (in the periphery of the pellets). In most species, there is no aerial mycelium formation or sporulation, and hyphae form pellets and clumps [2]. Secondary metabolites (outlined as yellow circles and blue stars) are produced by the MII hyphae.

Most *streptomycetes* do not sporulate in liquid cultures. Therefore, it was previously assumed that no differentiation occurred under these conditions. However, industrial antibiotic production is mostly performed in liquid cultures (flasks and bioreactors). It is now known that differentiation in liquid cultures is comparable to that observed in solid cultures (Figure 1). In liquid cultures, there is a first mycelium stage (MI), PCD and the differentiation of a secondary metabolite-producing mycelium (MII). However, in most *Streptomyces* strains, aerial mycelium formation and sporulation are blocked [6] (Figure 1). *S. coelicolor* proteomic and transcriptomic studies have shown that physiological differentiation in liquid and solid cultures is comparable [6,7]. MII expresses the genes and proteins involved in secondary metabolism in both solid and liquid cultures [6,7].

Surprisingly, *Streptomyces* differentiation as a trigger for antibiotic production remains almost unexplored. The absence of a developmental model describing differentiation in liquid cultures has hindered the understanding of the relationship between macroscopic morphology (pellet and clump formation) and differentiation. Pellet and clump formation has classically been correlated with secondary metabolite production, but the relationship between both processes remains obscure. Most authors have affirmed that pellets and clumps are fundamental for secondary metabolite production (e.g., retamycin in *S. olindensis* [8], nikkomycins in *S. tendae* [9], hybrid antibiotics in *S. lividans* [10]), while some authors have affirmed that pellet and clump formation reduces antibiotic production (e.g., nystatin in *S. noursei* [11], tylosin in *S. fradiae* [12]). More recently, our group demonstrated that one of the key events in the activation of secondary metabolite production in *Streptomyces* liquid cultures is the differentiation of MII (e.g., actinorhodin/undecylprodigiosin production in *S. coelicolor* [2,13], microbial transglutaminase production in *S. mobaraensis* [14], apigenin and luteolin production in *S. albus* [15]). The differentiation of this mycelium is conditioned by PCD of the vegetative hyphae (MI) [2], which, in liquid cultures, depends on the growth rate of the strain and hyphal aggregation (pellet/clump formation) [2,7,14–16]. However, secondary metabolism is subject to additional regulation (elicitors activate specific biosynthetic pathways) [17], and most *Streptomyces* strains do not display all their potential secondary metabolites under standard developmental laboratory conditions, even if they are differentiated at the MII stage [7].

Each *Streptomyces* strain can harbour up to 30 secondary metabolite pathways, but only a few of these are active in usual screening processes [18]. Activating these pathways in the lab will be crucial in screening for new secondary metabolites from actinomycetes. Here, we review the most important strategies that are being explored to activate cryptic pathways and/or to enhance secondary metabolite production.

#### **2. Screening for New Secondary Metabolites from** *Streptomycetes*

The search for new actinomycetes in unexplored niches or among strains that have not been previously cultivated is useful, but usually leads to the rediscovery of already known compounds [19]. New screening strategies are necessary to overcome the current challenges of discovering new bioactive compounds [19]. In 2013, Craney et al. [20] summarised the new strategies that are being used to enhance secondary metabolite production and activate cryptic pathways, dividing them into unselective and selective methods. Unselective methods are non-specific methods that are used to screen for new activities, whereas selective methods are biosynthetic cluster-specific methods that are used to improve the production of already known molecules [20].

Non-specific methods were largely used during "the golden age of antibiotics", and they are still useful. These methods include classical strategies, such as changing media components, increasing general precursors (metabolic engineering), inducing stress responses (with heat/ethanol/salt/acid shock, nutrient limitations) [21], and obtaining strains that overproduce secondary metabolites by random mutagenesis [22–24]. More novel non-specific methods include ribosomal engineering (the alteration of ribosomal proteins to activate cryptic secondary metabolites in *streptomycetes*) [20,25] and the use of small molecules as elicitors of secondary metabolism [20,26] (Table 1). Differentiation of the antibiotic producer mycelium (MII) as a non-specific method to activate antibiotic production remains almost unexplored. There has been no previous analysis of the frequency of *Streptomyces* strains that do not produce secondary metabolites because they are not differentiated at the MII stage in the laboratory.

Biosynthetic cluster-specific methods include self-resistance engineering (upregulation of self-resistance genes), regulatory engineering (overexpression of activators or elimination of repressors) and genome mining to search for new biosynthetic pathways [20] (Table 2). One of the most important biosynthetic cluster-specific methods is heterologous expression. Heterologous expression has been used to express *Streptomyces* industrial enzymes, such as laccases, in microorganisms with simpler developmental cycles than *Streptomyces*, such as *E. coli* [27]. However, the complex biosynthetic pathways of *Streptomyces* rarely can be expressed in simple expression hosts, such as *E. coli* or *Bacillus*. Thus, other *streptomycetes*, such as *S. lividans*, *S. albus*, *S. coelicolor* or *S. avermitilis*, are commonly used as expression hosts [28]. The activation of cryptic metabolites through the expression of the *Streptomyces coelicolor* pleiotropic regulator, *Afs*Q, in other *streptomycetes* [29] has been successfully achieved. Combinatorial biosynthesis, chemical modification of existing molecules, has been largely developed over the last 20 years, in particular, progress has been made in the last few years thanks to genome mining and synthetic biology [30–32]. Differentiation of *Streptomyces* MII was successfully used to enhance the production of various products [2,13–15] through its role as a trigger for antibiotic production (described in Section 2.3).

#### *2.1. Streptomyces Differentiation Strategies Based on Elicitors*

In the last few years, effort has been made to elucidate the mechanism by which some small molecules (elicitors) affect differentiation and secondary metabolite production in *Streptomyces* strains. Elicitors can be defined as diffusible signals that are able to induce cryptic pathways and/or differentiation in *Streptomyces* cultures [17]. Some elicitors act as signals for interspecies interaction [33]. Thus, subinhibitory concentrations of certain antibiotics produced by a given *Streptomyces* strain accelerate differentiation and antibiotic production in other *Streptomyces* strains through "pseudo" gamma-butyrolactone receptors [33]. Another good strategy is the use of random chemical probes (natural or synthetic) as elicitors (reviewed in [21]).

One of the most common strategies used to activate secondary metabolism and differentiation is mimicking the ecological environment through co-cultures of different microbes [17,34]. This methodology typically uses species that have symbiotic relationships with *Streptomyces* in nature [35,36] or pathogen partners that activate the production of antimicrobial compounds [37–39]. For instance, fungal elicitors (complex mix of cell walls and filtered cultures) positively affect the production of natamycin [40], bacterial and yeast elicitors improve valinomycin production [41], nutrients such as glucose and xylose repress the production of actinorhodin [42,43], and small molecules, such as GlcNAc or phosphate, can trigger differentiation and antibiotic production in *S. coelicolor* through the activation of *act*II-ORF4/*red*Z genes [44].

Pimentel-Elardo et al. [45] developed an activity-independent screening method based on the use of elicitors to prevent the rediscovery of the most active/abundant compounds. In addition, cheminformatics techniques are used to identify the putative biological activities of the identified compounds [45]. The use of elicitors increases the production of low-abundance compounds that went undetected in classical activity-dependent screening. The chemical elicitor "Cl-ARC" has been identified as being responsible for triggering several cryptic biosynthetic genes [45].

#### *2.2. Differentiation Strategies Based on Macroscopic Morphology*

#### 2.2.1. The Genetic Control of Aggregation and Macroscopic Morphology in Liquid Cultures

Large-scale antibiotic production is mostly performed in liquid cultures. It is almost unanimously accepted that the macroscopic morphology of the mycelium (pellets and clump formation) is correlated with the production of secondary metabolites. However, it was not until recently that the genes controlling pellet and clump formation have been characterised. The *S. coelicolor mat* gene cluster [46] and the *cslA*, *glxA*, *dtpA* genes [47–49] are responsible for mycelial aggregation and pellet formation. These genes could be a great tool for controlling the morphology in industrial fermentation.

The *Streptomyces* life cycle in liquid cultures starts with the germination of spores. Awakening from the dormant spore state depends on the level of cAMP in the cultures [50] and involves the small hydrophobic protein NepA [51]. The expression of several sigma factors involved in osmotic and oxidative stress (SigH, SigB, SigI, SigJ) undergoes remarkable changes during germination, indicating that germination evokes stress-like cell responses [52]. Several genes encoding proteins involved in lipid metabolism and membrane transport are overexpressed during germination [52]. The conservation of D-alanyl-D-alanine carboxypeptidase (SCO4439) contributes to the swelling phase of germination [53]. Cell wall hydrolases participate in germination [54]. The SsgA protein marks the germinative tube emission points [55]. Recently, it was described that during germination, spores aggregate due to extracellular glycans synthesized by the MatA and MatB proteins [46,56] and the CslA/GlxA/DtpA proteins [56]. These aggregates determine the macroscopic morphology (pellets and clumps) of the culture [56], which triggers PCD and the physiological differentiation of the antibiotic-producing mycelium, MII [2].

Another issue that influences secondary metabolite production is sporulation. Several *streptomycetes* are able to sporulate in liquid cultures [57], and some strains that normally do not sporulate are also able to sporulate in bioreactors due to the stress generated in the fermenter [13]. Sporulation stops metabolism, including secondary metabolite production. Consequently, in industrial fermentations and during screening for new secondary metabolites, it is important to avoid sporulation to increase and maintain secondary metabolism for as long as possible [13].

#### 2.2.2. Monitoring of *Streptomyces* Macroscopic Morphology and Differentiation in Liquid Cultures

Pellet and clump formation lead to differentiation and secondary metabolism [2]. Consequently, new methodologies to monitor macroscopic morphology have been developed. Laser diffraction has been used to measure pellet size [58]. Flow cytometry has been used to establish the pellet size distribution of culture populations [59,60]. Recently, a useful algorithm was developed as a plug-in for the open-source software ImageJ to characterize the morphology of filamentous microorganisms in liquid cultures [61]. Mathematical models have been developed to predict the behaviour of *Streptomyces* liquid cultures based on pellet/clump morphology [62,63].

Biophysical parameters (e.g., pH, viscosity, agitation, dissolved oxygen levels and surface tension, among others) directly affect morphology and differentiation [13,64]. These parameters must be considered when scaling up production to industrial conditions [65]. Interestingly, a recent study downscaled liquid cultures to the 100 μL scale in microtiter plates [66], reproducing the same range of production and morphology as large-scale bioreactors, making screening easier and facilitating further upscaling.

#### 2.2.3. Macroscopic Morphology Conditions, Programmed Cell Death and Second Mycelium Differentiation in Liquid Cultures

PCD is the key event that triggers the differentiation of the antibiotic producer, mycelium (MII), in liquid and solid cultures [2]. However, the specific signals derived from cell death are not yet known. The production of *N*-acetylglucosamine from peptidoglycan dismantling accelerates development and antibiotic production [67,68] and might be one of the signals released during PCD.

A simple methodology based on fluorometric measures of cultures stained with SYTO9 and propidium iodide was designed to quantify PCD in liquid cultures [69]. This method allows the efficiency of antibiotic production to be predicted based on the level of PCD [69].
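The idea of deriving a PCD estimate from a dual-stain reading can be illustrated with a minimal sketch. It assumes that the SYTO9 (green) signal reports all cells, that the propidium iodide (red) signal reports membrane-compromised cells, and that both readings are background-corrected against a blank. The function name, the blank values and the example readings are hypothetical, and the calibration actually used in [69] may differ.

```python
def dead_fraction(green, red, green_blank=0.0, red_blank=0.0):
    """Return the red (propidium iodide) share of the total corrected signal.

    Assumption: both channels are read in arbitrary fluorescence units and
    corrected by subtracting the reading of an unstained or cell-free blank.
    """
    g = max(green - green_blank, 0.0)
    r = max(red - red_blank, 0.0)
    total = g + r
    if total == 0:
        raise ValueError("no signal above background in either channel")
    return r / total


# Hypothetical time course: (hours, SYTO9 reading, PI reading).
readings = [(24, 900.0, 150.0), (48, 700.0, 400.0), (72, 500.0, 600.0)]
for hours, green, red in readings:
    fraction = dead_fraction(green, red, green_blank=100.0, red_blank=100.0)
    print(f"{hours} h: dead fraction = {fraction:.2f}")
```

A rising ratio over the time course would then be read as increasing PCD; whether a simple ratio or a calibrated standard curve is more appropriate depends on the instrument and the strain.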

Strains showing dispersed growth take a long time to undergo PCD, and sometimes PCD does not occur at all. Modifying the developmental conditions to enhance PCD and MII differentiation leads to an improvement in secondary metabolite production. This approach was recently applied to enhance flavonoid production in a strain of *Streptomyces albus* [15] and to enhance microbial transglutaminase production from *Streptomyces mobaraensis* [14]. The "PCD-MII" approach complements other approaches well; there is no secondary metabolite production without differentiation of MII, but there are biosynthetic pathways that, in addition to MII differentiation, need specific elicitors to become active [70].

#### *2.3. L-Forms*

An interesting alternative that would avoid the problems of mycelial growth in industry is the use of L-forms, which are individual cells without cell walls [71]. However, until now, the antibiotic levels reached by *Streptomyces* L-forms have been much lower than those reached by the regular form. Therefore, future research should explore whether L-forms could offer an industrial alternative.

#### *2.4. Other Strategies*

A big challenge in screening for new secondary metabolites is exploring non-cultivated bacteria. The scientific community is aware of the huge quantity of microorganisms that are not cultivated under laboratory conditions. Next Generation Sequencing revealed the big pharmacological potential of uncultured bacteria. Innovative culturing techniques, such as the isolation chip (iChip), are being used successfully in combination with co-cultures to grow previously uncultured bacteria [72]. The study of unexplored niches to look for new Actinomycetes is another strategy that enables the discovery of new species and compounds [73–75]. The combination of these two methods is a promising strategy to identify new compounds.

One of the newest strategies focusses on primary metabolism and vegetative growth. Very recently, work by Schniete et al. [76] showed how genetic redundancy within actinobacterial genomes allows functional specialization of two pyruvate kinases in *Streptomyces* under different life cycle stages and environmental conditions. Genetic redundancy within actinobacterial genomes was thus proposed as a key to understanding how the plasticity of these microorganisms enhances the production of clinically useful molecules. Furthermore, Cihak et al. [77] recently described the production of secondary metabolites during germination in *Streptomyces coelicolor*. The germination stage has been ignored in most secondary metabolite screening campaigns and constitutes a potential source of bioactive compounds to be explored [77].


**Table 1.** Non-specific methods and some successful examples of their enforcement. "Enhance" means an improvement in production; "cryptic" means activation of the expression of cryptic pathways.


**Table 2.** Biosynthetic cluster specific methods and some successful examples of their enforcement.

#### **3. Conclusions**

We face the great challenge of fighting antibiotic resistance, which is growing much faster than our capacity to find new antimicrobials and new strategies to address this problem. The *Streptomyces* genus is still a huge source of natural bioactive compounds, but we need to devise new strategies to avoid rediscovering compounds. There is no single methodology to trigger differentiation, activate cryptic secondary metabolism pathways and improve the discovery of new bioactive compounds. However, the multidisciplinary biosynthetic cluster specific and non-specific approaches discussed in this manuscript will be key to improving the screening for new secondary metabolites from *streptomycetes*.

**Author Contributions:** P.Y. planned the topic of the review and searched the information; A.M. and P.Y. wrote the manuscript.

**Acknowledgments:** We thank the Spanish "Ministerio de Economía y Competitividad" (MINECO; BIO2015-65709-R) for financial support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Review*

## **Unraveling Nutritional Regulation of Tacrolimus Biosynthesis in** *Streptomyces tsukubaensis* **through** *omic* **Approaches**

### **María Ordóñez-Robles 1,2, Fernando Santos-Beneit 2,3 and Juan F. Martín 1,\***


Received: 27 February 2018; Accepted: 26 April 2018; Published: 1 May 2018

**Abstract:** *Streptomyces tsukubaensis* stands out among actinomycetes for its ability to produce the immunosuppressant tacrolimus. Discovered about 30 years ago, this macrolide is widely used as an immunosuppressant in current clinical practice. Other potential applications in the treatment of cancer and as a neuroprotective agent have been proposed in recent years. In this review we introduce the discovery of *S. tsukubaensis* and tacrolimus, its biosynthetic pathway and gene cluster (*fkb*) regulation. We have focused this work on the *omic* studies performed in this species in order to understand tacrolimus production. Transcriptomics, proteomics and metabolomics have improved our knowledge about the *fkb* transcriptional regulation and have given important clues about nutritional regulation of tacrolimus production that can be applied to improve production yields. Finally, we address some points of *S. tsukubaensis* biology that deserve more attention.

**Keywords:** *Streptomyces tsukubaensis*; tacrolimus; FK506; *omics*

#### **1. Discovery of** *S. tsukubaensis* **and Tacrolimus Use in Current Clinics**

*Streptomyces tsukubaensis* and its secondary metabolite tacrolimus were discovered in 1984, during a screening performed by the Fujisawa Pharmaceutical Co. (which merged with Yamanouchi Pharmaceutical Co. in 2005 to form Astellas Pharma). *S. tsukubaensis* was isolated from a soil sample in the Tsukuba region (Japan) and tacrolimus was identified in its culture broths, becoming the first immunosuppressant discovered with a macrolide structure [1,2]. The strain, patented as *S. tsukubaensis* No. 9993, is currently known as *S. tsukubaensis* NRRL 18488 and is the parental strain of most of the strains used for the industrial production of tacrolimus.

Macrolides such as erythromycin are composed of 14–16 C-membered macrolactone rings to which one or more deoxysugars are attached. Tacrolimus, a 23-carbon macrolide (822 Da), was initially named compound FR900506 but later received other names such as FK506 or fujimycin. The name tacrolimus was established as an acronym of "Tsukuba Macrolide Immunosuppressant" [3]. The first reference to tacrolimus was made at the 11th International Congress of the Transplantation Society, held in Helsinki in 1986, one year before the first publications by Kino and coworkers. The first clinical trials, focused on hepatic transplantation, were carried out at the University of Pittsburgh in 1989. Two years later the first international congress on tacrolimus was held in that city [3]. Tacrolimus acts as a calcineurin inhibitor, showing a mechanism of action very similar to that of cyclosporine (Figure 1) [4]. When tacrolimus interacts with its cytosolic receptors, mainly FKBP12 [5], the calmodulin-dependent serine/threonine phosphatase activity of calcineurin is inhibited, resulting in the arrest of T cell proliferation [6]. The mechanism of action is conserved in human T cells and yeast and thus, tacrolimus also has antifungal activity [7,8]. This activity is useful for the qualitative detection of tacrolimus by bioassay against susceptible strains such as *Saccharomyces cerevisiae* TB23 [9].

**Figure 1.** Mechanism of action of tacrolimus (FK506). Tacrolimus interacts with cytosolic receptors such as FKBP12. The complex FKBP12-FK506 inhibits the calmodulin-dependent serine/threonine phosphatase activity of calcineurin. In this situation, calcineurin can no longer dephosphorylate transcriptional factors (e.g., NFAT). The dephosphorylated TFs are required for governing T cell proliferation. L: ligand; R: receptor; CM: calmodulin; CN: calcineurin; TF: transcription factor; P: phosphate group; FKBP-12: FK506 binding protein 12.

Since its approval by the FDA for use in hepatic transplantation in 1994, tacrolimus has also been applied to bone marrow, kidney and heart transplantation [10–12]. This macrolide is also used for the treatment of other diseases such as atopic dermatitis [13,14] and is applied to the stents implanted in coronary arteries [15]. Several works have been published about its use in immune diseases such as rheumatoid arthritis and intestinal inflammatory diseases [16,17]. Tacrolimus has shown antiviral activity against orthopoxvirus, HIV and feline immunodeficiency virus (FIV) [18–21] and acts as a hair growth stimulator [22]. Neuroprotective and neuroregenerative activities have also been reported [23–25], as well as its potential application in the treatment of cancer [26]. More recently, the efficacy of tacrolimus ointment in the treatment of allergic ocular diseases has been reported [27].

The efficacy of tacrolimus in the treatment of organ transplantation is the basis of its industrial importance. Tacrolimus is between 10 and 100 times more potent than cyclosporine and has been shown to be more effective in several clinical trials [28,29]. Tacrolimus generates important benefits for the pharmaceutical market; for example, the sales of tacrolimus under the commercial names "Prograf" and "Protopic" yielded a total of \$1727 million to Astellas Pharma in 2016 (data from http://www.pharmacompass.com).

#### **2. Biosynthetic Pathway and Gene Cluster**

The first studies on the tacrolimus biosynthetic pathway were performed by researchers from the pharmaceutical company Merck (USA) during the 1990s [30–33]. Tacrolimus is a polyketide synthesized by a hybrid type I polyketide synthase-non-ribosomal peptide synthetase (PKSI-NRPS) system encoded by the *fkb* cluster, which encompasses a minimum of 19 genes (Figure 2). To date, more than 15 tacrolimus-producing species have been reported [34], the most recent being *S. tsukubaensis* F601 [35]. There are two types of *fkb* clusters in the tacrolimus-producing strains [36]: (i) a short version comprising the genes *fkbQ*, *fkbN*, *fkbM*, *fkbD*, *fkbA*, *fkbP*, *fkbO*, *fkbB*, *fkbC*, *fkbL*, *fkbK*, *fkbJ*, *fkbI*, *fkbH*, *fkbG*, *allD*, *allR*, *allK* and *allA* (found in *Streptomyces tacrolimicus* and *Streptomyces kanamyceticus* KCTC 9225); and (ii) an extended version found in *S. tsukubaensis* NRRL 18488, *S. tsukubaensis* L19 and *Streptomyces* sp. KCTC 11604BP that includes 5 additional genes in the 5' region of the *fkbG* gene (*allMNPOS*/*tcs12345*) and one or two extra genes (depending on the species) in the 3' region (*tcs6*-*fkbR*/*tcs67*). Deletion of the *allMNPOS* genes in *Streptomyces* sp. KCTC 11604BP does not significantly affect tacrolimus production; thus, it is doubtful that they are involved in tacrolimus biosynthesis [37]. Indeed, their transcription levels are low, which supports this assumption [38,39].

**Figure 2.** Tacrolimus biosynthesis cluster (*fkb*). Genes present in both the short and extended version of the *fkb* cluster are depicted in black. Genes present only in the extended version are depicted in red. These groups also correspond to their FkbN transcriptional dependence (**Black**) or independence (**Red**). The transcriptional units identified to date are indicated by boxes.

The first step in tacrolimus biosynthesis is the formation of (4R, 5R)-4,5-dihydroxycyclohex-1-enecarboxylic acid (abbreviated DHCHC) from chorismate through the so-called chorismatase activity of FkbO (Figure 3) [40]. DHCHC acts as the starter unit for the subsequent formation of the carbon skeleton and corresponds to the cyclohexane ring in the final structure of tacrolimus. This ring is the most tolerant target for structural modifications that do not eliminate the immunosuppressant activity [41]. The polyketide synthases FkbA, FkbB and FkbC catalyze 10 elongation steps from DHCHC using as extender units malonyl-CoA (2 molecules), methylmalonyl-CoA (5 molecules), methoxymalonyl-ACP (2 molecules) and allylmalonyl-CoA (one molecule). The latter two extender units are unusual in the formation of polyketides and result in the methoxyl groups of C13 and C15 and the allyl group of C21, respectively [36,37,42,43]. The biosynthesis of methoxymalonyl-ACP from 1,3-bisphosphoglycerate depends on the enzymes encoded by the *fkbGHIJK* subcluster [42–45]. The incorporation of allylmalonyl-CoA is the sole difference between tacrolimus and ascomycin (FK520), in whose biosynthesis ethylmalonyl-CoA is used instead. The *all* subcluster is involved in the formation of allylmalonyl-CoA and encodes a polyketide synthase of unusual structure [46]. Nevertheless, ketoreductase and dehydratase activities encoded outside the *fkb* cluster might be involved in some steps of allylmalonyl-CoA formation and these activities could be shared with fatty acid synthases [37]. The tacrolimus cluster does not encode an ACP-CoA transacylase necessary for the final reaction leading to allylmalonyl-CoA [37], but the acyltransferase domain of the fourth module in FkbB (AT4FkbB) is able to transfer an allylmalonyl unit to the ACP domain [47].
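The extender-unit bookkeeping in this paragraph can be checked with a short tally. The sketch below only restates the counts given in the text (2 + 5 + 2 + 1 = 10 elongation steps) and models the tacrolimus/ascomycin difference as a single extender swap; the dictionary and function names are illustrative and do not come from any published tool.

```python
# Extender units used by FkbA, FkbB and FkbC, as stated in the text.
TACROLIMUS_EXTENDERS = {
    "malonyl-CoA": 2,
    "methylmalonyl-CoA": 5,
    "methoxymalonyl-ACP": 2,
    "allylmalonyl-CoA": 1,
}


def elongation_steps(extenders):
    """Each extender unit corresponds to one elongation step."""
    return sum(extenders.values())


def swap_extender(extenders, old, new):
    """Return a copy in which one extender type is replaced by another."""
    swapped = dict(extenders)
    swapped[new] = swapped.pop(old)
    return swapped


# Ascomycin (FK520) differs only in using ethylmalonyl-CoA
# where tacrolimus uses allylmalonyl-CoA.
ASCOMYCIN_EXTENDERS = swap_extender(
    TACROLIMUS_EXTENDERS, "allylmalonyl-CoA", "ethylmalonyl-CoA"
)

print(elongation_steps(TACROLIMUS_EXTENDERS))
print(elongation_steps(ASCOMYCIN_EXTENDERS))
```

Both tallies give the same number of elongation steps, which reflects that the two compounds share the same assembly line and differ only in one incorporated unit.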

For the cyclization of the macrolide, FkbL generates L-pipecolate from L-lysine [48], which is then incorporated into the carbon skeleton by the NRPS FkbP [30,49,50]. Finally, two modification steps are necessary to achieve the final molecule with biological activity: a methylation of the hydroxyl group located at C31 and an oxidation at C9. Both groups are important for the binding of tacrolimus to FKBP12 [51,52]. The methylation is catalyzed by the S-adenosylmethionine-dependent O-methyltransferase FkbM and the oxidation by the cytochrome P450-oxidoreductase FkbD [31,53]. Both activities are encoded in the same operon and can occur in any order [31,45,54]. Interestingly, the reaction catalyzed by FkbD (a double step oxidation involving 4 electron transfers and the formation of the alcoholic intermediate 9-hydroxy-FK506) was already known in terpenoid biosynthesis but was described for the first time in polyketide biosynthesis [45].

**Figure 3.** Scheme representing the assembly of the tacrolimus polyketide and the early and late biosynthetic steps. In the upper part the arrows represent the three PKS genes (*fkbA*, *fkbB*, *fkbC*) of the cluster. Note that the *fkbA* gene is physically separated from *fkbB* and *fkbC* genes in the *fkb* cluster (see Figure 2). The modules of the PKSs are boxed and indicated as M1 to M10. Domains in the modules are indicated by circles: ACP, acyl carrier protein; AT, acyltransferase; ER, enoyl reductase; CAS, CoA synthetase; KR, 3-oxoacyl (ACP) reductase; DH, 3-oxoacyl thioester dehydratase; KS, 3-oxoacyl (ACP) synthase. DHCHC: (4R, 5R)-4,5-dihydroxycyclohex-1-enecarboxylic acid. Biosynthetic and late modification steps, and the encoding genes for the starter (*fkbO*), elongation units (*fkbL*, *fkbP*) and late modification reactions (*fkbM*, *fkbD*). Based on data from Motamedi and Shafiee [30].

#### **3. Transcriptional Regulators and Recent Insights through Transcriptomic and RNAseq Studies**

The first sequence analyses of the *fkb* cluster revealed three potential regulators: *fkbN*, *fkbR* and *allN* (belonging to the LAL, LysR and AsnC families, respectively). FkbN is a large regulatory protein of the LAL family (Large ATP binding regulators of the LuxR family). The LAL regulators are large proteins (872–1159 amino acids) that contain a LuxR-type HTH DNA binding region near the C-terminal end of the protein and an ATP binding motif in the N-terminal end [55,56]. Similar FkbN-like genes have been found in several other macrolide gene clusters including RapH of the rapamycin producer *Streptomyces hygroscopicus* [57], PikD of the pikromycin producer *Streptomyces venezuelae* [58], GdmR1 and GdmR2 of the geldanamycin producer *Streptomyces hygroscopicus* [59], FkbN of the ascomycin producer *S. hygroscopicus* var. *ascomyceticus* [44], FscRI in the candicidin producer *Streptomyces griseus* [60,61], PimM of the pimaricin producer *Streptomyces natalensis* [62,63], NysR from the nystatin producer *Streptomyces noursei* [64], AmphRIV in the amphotericin B producer *Streptomyces nodosus* [65] and PteF in the filipin producer *Streptomyces avermitilis* [66,67].

The second regulatory protein FkbR belongs to the family of the LysR-type transcriptional regulators, also named LTTR, which are very common autoregulatory genes in bacteria [68]. In fact, they are widely distributed in *Streptomyces*: genome sequencing revealed about 40 LTTRs in *S. coelicolor* [69]. FkbR, as occurs with other members of the LTTR family, is a relatively small protein of less than 325 amino acids that is characterized by an HTH DNA binding motif in the C-terminal and by a ligand (co-inducer) binding sequence in the N-terminal region [70,71]. Other LTTRs acting as pathway-specific regulators include SCLAV\_p1262 of *S. clavuligerus* (77% identity), ThnI from *Streptomyces cattleya* (39% identity), AbaB from *Streptomyces antibioticus* or ClaR from *S. avermitilis* [72,73].

The third putative regulatory gene of the tacrolimus gene cluster is *allN*. This gene is located at the 5' end of the extended version of the tacrolimus gene cluster and encodes a protein that has similarity with regulatory proteins involved in nitrogen metabolism, particularly with regulators of the AsnC family [74]. This gene is included in a region that is involved in the formation of the precursor allylmalonyl-CoA (*all* genes) [37,46].

Functional analysis of the role of FkbN, FkbR and AllN in *S. tsukubaensis* was performed by gene disruption and complementation studies. Whilst the inactivation of *fkbN* resulted in the lack of tacrolimus production, disruption of *fkbR* reduced tacrolimus yields to 20% of that of the parental strain and the inactivation of *allN* did not affect tacrolimus production [36]. Thus, it was concluded that both *fkbN* and *fkbR* encode positive regulators whilst *allN* has no influence on tacrolimus production [36]. In addition, AllN (also named Tcs2) seems not to be involved in tacrolimus production in other strains such as *S. tsukubaensis* L19 [75]. Overexpression of *fkbN* or *fkbR* in the wild type strain using the *ermE*\* promoter increased the final yield of tacrolimus by 55% and 30%, respectively, in a culture medium optimized for tacrolimus production. These results agree with the observations published by Mo and coworkers on the effect of FkbN in *Streptomyces* sp. KCTC 11604BP [76].

There are important differences between FkbN and FkbR that we summarize here as follows: (1) *fkbN* is present in both the extended and the short version of the *fkb* cluster but *fkbR* is only present in the extended cluster version [37]; (2) FkbN always shows a positive effect on tacrolimus production whilst FkbR can have positive or negative effects [36,76,77]; (3) A complete lack of tacrolimus production is only produced by inactivation of *fkbN* (but not with that of *fkbR*) [36,38]; (4) transcription of *fkbR* is constant and low throughout the culture whilst that of *fkbN* increases before the onset of tacrolimus production and is maintained during the production phase (Figure 4) [36,38,75].

**Figure 4.** Transcriptional profiles of genes encoding transcriptional regulators of the *fkb* cluster. Transcription of *fkbN*, *fkbR* and *allN* in *S. tsukubaensis* NRRL 18488 grown in MGm-2.5 production media. As indicated in the graph, phosphate depletion occurs between 80 h and 89 h and tacrolimus is detected from 89 h. The cultures were performed in duplicated flasks. Error bars have been omitted to facilitate the visualization of the results.

#### *3.1. Characterization of fkb Cluster Transcriptional Subunits*

Early studies using the *rppA* chalcone synthase reporter system and qRT-PCR showed that the inactivation of *fkbR* or *fkbN* prevents transcription of certain genes in the *S. tsukubaensis fkb* cluster such as *fkbG* or *fkbB*, implying that some *fkb* genes are regulated by FkbN while others are not [36]. However, more recent transcriptomic studies with the same *fkbN* inactivated mutant have confirmed that FkbN controls the expression of most of the genes of the *fkb* cluster [38]. Two types of gene expression were observed in response to *fkbN* inactivation: (a) genes clearly induced by FkbN coinciding with the onset of tacrolimus biosynthesis (in the so-called "induction phase") and whose expression is significantly reduced in the *fkbN* mutant (i.e., *fkbABC*, *fkbGHIJK*, *fkbL*, *allAKRD*, *fkbO*, *fkbP*, *fkbD* and *fkbM*) and (b) genes poorly expressed throughout the culture and not affected by *fkbN* inactivation (i.e., *allMNPOS* and *fkbR*) (Figure 2). Thus, the complete transcriptional dependency of the *fkb* genes on FkbN was demonstrated, with the exception of *allMNPOS* and *fkbR* (only present in the extended versions of the *fkb* cluster), which are FkbN-independent.

The use of tiling probes covering the *fkb* cluster allowed the identification of 6 transcriptional units: *fkbR*, *tcs6-fkbQ-fkbN*, *fkbOPADM*, *fkbBCLKJIH*, *fkbG* and *allAKRD*. It was concluded that *fkbR* is transcribed as a leaderless mRNA and that *fkbN* forms an operon along with *tcs6* and *fkbQ* whose transcription depends on two different promoters, one FkbN-dependent and the other FkbN-independent [38]. These results are supported by the EMSAs performed with the FkbN-DNA binding domain in *S. tsukubaensis* L19 by Zhang and coworkers [75], who reported FkbN binding to the promoter regions of the same six transcriptional units and identified two new ones corresponding to *allNPOS* and *allM*. More recently, differential RNA-seq (dRNA-seq) transcriptional profiling has been performed in *S. tsukubaensis* by Bauer and coworkers [39], who identified 9 transcriptional units that are in good agreement with previous studies (Figure 2). The main finding is that *allOS* and *allNP* are transcribed as independent mRNAs [39].

*fkbR* seems to be transcribed as a leaderless mRNA and is not directly regulated by FkbN [38]. In fact, it is likely that FkbR regulates its own expression, although detailed information is not available. Recently, the binding of FkbR to the promoter regions of *tcs6-fkbQ-fkbN* and *fkbR* in *S. tsukubaensis* L19 has been reported [75].

#### *3.2. Genes Located Outside of the Tacrolimus Gene Cluster Regulated by FkbN*

It has been reported that cluster-situated regulators (CSR) can regulate genes located outside their own cluster [78,79] and, therefore, transcriptomic studies are a good tool to identify them. The transcriptomic analysis performed with the *fkbN* mutant by Ordóñez-Robles and coworkers [38] revealed potential genes located outside the *fkb* cluster that might be targets of FkbN such as *ppt1*, encoding a 4′-phosphopantetheinyl transferase that is known to be involved in CDA formation in *S. coelicolor* [80]. This gene showed an FkbN-dependent profile and a putative FkbN binding sequence [38]. In agreement with these results it was reported that the orthologue of *ppt1* is involved in tacrolimus production in *S. tsukubaensis* L19 [81] and later, it was observed that *ppt1* and *fkbN* share a common transcriptional response to glucose, glycerol and *N*-acetylglucosamine additions (see below). The study identified acyl-CoA dehydrogenase and methoxymalonate biosynthesis coding genes that were negatively affected by the *fkbN* inactivation and thus, might be involved in tacrolimus biosynthesis. On the contrary, some PKS coding genes located in a chromosomal region that has been predicted to encode a cluster for the production of a bafilomycin-like compound [82] were upregulated after *fkbN* inactivation, which might reflect competition for precursors between these two clusters for the biosynthesis of secondary metabolites.

Using the information theory approach of Schneider [83], a putative FkbN binding sequence would be composed of two 7 nt inverted repeats [38]. This sequence would be similar to that identified for binding of PimM in the genome of *S. natalensis* [63].
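To make the notion of a binding site built from two 7 nt inverted repeats more concrete, the following sketch scans a DNA string for a 7 nt arm followed, after a short spacer, by its reverse complement, and computes a per-position information content in bits for a set of aligned sites. The example sequences are invented for illustration and are not the FkbN or PimM sites; the maximum spacer length is an assumption, and the information measure omits the small-sample correction used in Schneider's full method.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=7, max_spacer=10):
    """List (start, spacer, left_arm, right_arm) for every arm that is
    followed within max_spacer nt by its own reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, spacer, left, seq[j:j + arm]))
    return hits


def information_content(sites):
    """Per-column information (bits) for equal-length aligned sites:
    2 minus the Shannon entropy of the base frequencies in that column."""
    n = len(sites)
    bits = []
    for column in zip(*sites):
        entropy = 0.0
        for base in "ACGT":
            f = column.count(base) / n
            if f > 0:
                entropy -= f * math.log2(f)
        bits.append(2.0 - entropy)
    return bits


# Invented example: a 7 nt arm, a 3 nt spacer and the reverse-complement arm.
example = "AAGACCTAGTTTCTAGGTCAA"
print(find_inverted_repeats(example))
print(information_content(["ACGT", "ACGA"]))
```

Fully conserved positions score 2 bits and positions with two equally frequent bases score 1 bit, which is the basis for drawing sequence logos of candidate operator sites.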

In-depth knowledge of the *fkb* cluster regulation is necessary to achieve higher tacrolimus production yields. In this sense, the identification of transcriptional start sites (TSS) is useful for the introduction of artificial promoters without affecting the structure of mRNAs. Bauer and coworkers [39] reported that 22% of the transcripts identified by dRNAseq are predicted to present long leader mRNAs (greater than 150 nt), which points out the importance of post-transcriptional regulation of the *fkb* cluster through the formation of RNA secondary structures [84]. In fact, the *allAKRD* operon was reported to be transcribed with a rather long untranslated 5' region (5'-UTR; 247 bp) that is predicted to form a secondary structure.
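The leader-length categories mentioned above can be expressed as a small classification rule. The sketch below computes the 5'-UTR length from a transcriptional start site and a start codon position and labels the transcript; the 150 nt cutoff comes from the text, whereas the coordinates and the definition of "leaderless" as a leader of 0 nt are assumptions made for illustration only.

```python
def utr_length(tss, start_codon, strand="+"):
    """Distance in nt from the TSS to the first base of the start codon."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    length = start_codon - tss if strand == "+" else tss - start_codon
    if length < 0:
        raise ValueError("TSS lies downstream of the start codon")
    return length


def classify_leader(length, long_cutoff=150, leaderless_max=0):
    """Label a transcript by its leader length.

    long_cutoff follows the 'greater than 150 nt' criterion in the text;
    leaderless_max is an assumed threshold, not taken from the source.
    """
    if length <= leaderless_max:
        return "leaderless"
    if length > long_cutoff:
        return "long leader"
    return "standard leader"


# Hypothetical coordinates giving the 247 nt leader reported for allAKRD.
length = utr_length(tss=1000, start_codon=1247, strand="+")
print(length, classify_leader(length))
```

Applied to a table of dRNA-seq start sites, such a rule would flag the transcripts whose leaders are long enough to be candidates for post-transcriptional regulation.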

#### **4. Classical Strategies to Increase Tacrolimus Production**

Despite the efficacy of tacrolimus in the treatment of organ transplantation, its use in clinical therapy is expensive. This is mainly due to the low production yields of the producer strains used but also to the formation of byproducts such as ascomycin (FK520) or FK525, which are structurally similar to tacrolimus but differ in the nature of some substituent groups [85]. The presence of byproducts in the culture broths hampers extraction and purification of tacrolimus; thus, different approaches involving the use of organic solvents and/or chromatography have been developed to increase tacrolimus purity [86]. As an example, ascomycin production can represent 20% of tacrolimus production in *S. tsukubaensis* NRRL 18488 and 8% in *Streptomyces clavuligerus* KCTC 10561BP [86,87]. The chemical synthesis of tacrolimus was described in the 1990s but it is not applied in practice due to its low efficiency and high costs [88,89].

In recent decades, research on tacrolimus production enhancement has mainly focused on culture media optimization and genetic engineering of the strains. For recent reviews on the improvement of tacrolimus biosynthesis through synthetic biology approaches see [90,91]. The optimization of culture media encompasses formulation of defined compositions, precursor supply and the addition of stressing agents. Defined media are necessary to perform nutritional studies in which the stimulating or inhibitory effect of a particular nutrient on growth and antibiotic production is tested. The first defined medium for the growth of *Streptomyces* sp. MA6858 (ATCC 55098) was formulated by Yoon and Choi [92]; later, Martínez-Castro and coworkers [93] developed two additional media, MGm-2.5 and ISPz. MGm-2.5 contains starch as the main carbon source and glutamate as carbon and nitrogen source, whilst ISPz, an optimization of ISP4 medium, contains glucose and corn dextrin as the main carbon sources. MGm-2.5 has been further used to perform transcriptomic analyses on the carbon and phosphate control of *S. tsukubaensis* [94,95]. This medium supports dispersed growth and high tacrolimus production yields. Moreover, this medium permits an estimate of the onset of tacrolimus production, since this process has been shown to take place when phosphate is depleted from the medium [93].

Considering that the availability of precursors is a limiting factor in the biosynthesis of secondary metabolites, precursor supply is a straightforward strategy to increase antibiotic yields [96]. A summary of the compounds that have been applied to increase tacrolimus production is shown in Table 1. At this point of the review and as a conclusion of all the mentioned work, it is interesting to note that (1) The effect of a precursor depends on its concentration; (2) The combination of positive additions does not always have an additive positive effect and (3) The positive effect can be exerted through growth promotion, production stimulation or both.


**Table 1.** Common precursors used for tacrolimus production enhancement in different *S. tsukubaensis* strains. The precursor, *S. tsukubaensis* strain used and bibliographic reference are indicated.



Nevertheless, the addition of precursors in industrial fermentations can be an inefficient strategy from an economic point of view (e.g., shikimate, chorismate and pipecolate are expensive [107]); thus, an alternative strategy is to increase the copy number of tacrolimus biosynthetic genes by genetic engineering. In this manner, the overexpression of genes coding for the synthesis of methylmalonyl-CoA, methoxymalonyl-ACP and allylmalonyl-CoA has been shown to have a positive impact on tacrolimus production yields [104,108].

Finally, the addition of stressing agents, such as dimethylsulfoxide (DMSO) or sodium thiosulfate, has been shown to stimulate polyketide production in different bacteria [109,110] as well as tacrolimus production in *S. tsukubaensis* NRRL 18488 [90].

#### **5. Omic Approaches in** *S. tsukubaensis* **and Their Application in Tacrolimus Production**

#### *5.1. Metabolomic and Proteomic Studies*

The inactivation or overexpression of a particular gene involved in a certain biosynthetic pathway can affect other metabolic pathways and also the growth of the microorganism. For this reason, global studies covering the whole transcriptome, proteome or metabolome are usually preferred. In *S. tsukubaensis*, several metabolomic studies have been performed in the last decade. Huang and coworkers [100,111] developed a genome-scale metabolic model (GSMM) for *S. tsukubaensis* D852 including 865 chemical reactions and 621 metabolites to predict targets for genetic manipulation. These models reconstruct the metabolism of the organism from the genome annotation, taking into account genes encoding enzymes and transporters. By this means it was predicted that modifications in primary metabolism pathways leading to the accumulation of erythrose-4-phosphate, α-ketoglutarate, fumarate, succinate, pyruvate, phosphoenolpyruvate, NADPH, chorismate and malonyl-CoA have a positive effect on tacrolimus production. This implies that both the pentose phosphate pathway and the TCA cycle are positively correlated with tacrolimus production. Regarding the biosynthetic cluster, the overexpression of genes involved in the formation of the starter unit DHCHC, pipecolate and in different modification reactions (*fkbO*, *fkbL*, *fkbP*, *fkbM* and *fkbD*; see Table 2) also has a positive effect. Interestingly, as mentioned before, the combination of positive mutations does not always have an additive effect; for example, the combined overexpression of *fkbL* and *fkbP* reduced biomass formation due to the use of lysine for tacrolimus production. More recently, a metabolomic approach has been reported in which lysine, shikimate, malonate, and citrate (the last three as sodium salts) were supplied to the culture media of *S. tsukubaensis* D852 [102].
In this study, the addition of compounds targeting different precursor pathways facilitated the comprehension of the metabolic switches that are positive for tacrolimus production, and the application of weighted correlation network analysis (WGCNA; [112]) allowed the identification of hub modules and key metabolites depending on the culture stage. For example, 48 h after feeding, pyruvate, phosphoenolpyruvate and methylmalonate showed a high degree of connectivity, whilst 72 h after feeding, shikimate and aspartate controlled tacrolimus production. Supporting previous results, it was reported that the pentose phosphate, shikimate and aspartate pathways are crucial for the biosynthesis of the immunosuppressant. Overexpression of *aroC* and *dapA* (involved in the shikimate pathway and lysine biosynthesis, respectively) increased production of the macrolide by 40% and 23%, respectively. A summary of the distinct gene modifications that have a positive impact on tacrolimus production is given in Table 2.
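The notion of "degree of connectivity" used in weighted correlation network analysis can be illustrated with a minimal sketch. It assumes the usual unsigned WGCNA adjacency, a_ij = |cor(x_i, x_j)|^β, and defines the connectivity of a metabolite as the sum of its adjacencies to all other metabolites. The soft-threshold power β = 6, the function names and the toy abundance profiles are illustrative assumptions, not values taken from the cited study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def connectivity(profiles, beta=6):
    """Sum of soft-thresholded adjacencies |r|**beta for each metabolite.

    profiles: dict mapping a metabolite name to its abundances across samples.
    beta: soft-threshold power (6 is a common choice for unsigned networks;
          here it is an assumption, not a parameter reported in the source).
    """
    names = list(profiles)
    return {
        i: sum(
            abs(pearson(profiles[i], profiles[j])) ** beta
            for j in names
            if j != i
        )
        for i in names
    }


# Made-up toy profiles (hypothetical names and numbers, for illustration only).
profiles = {
    "metabolite_a": [1.0, 2.0, 3.0, 4.0],
    "metabolite_b": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with a
    "metabolite_c": [1.0, -1.0, 1.0, -1.0],  # weakly correlated with a and b
}
k = connectivity(profiles)
hub_candidates = sorted(k, key=k.get, reverse=True)
```

Metabolites at the top of `hub_candidates` would be the ones described as highly connected; in a real analysis the power β is usually chosen from the data and modules are detected before hubs are ranked.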


**Table 2.** Genetic modifications predicted through metabolic modelling in *S. tsukubaensis* to improve tacrolimus production. The target gene, type of modification, strain and bibliographic reference are indicated.

The GSMM developed by Huang and coworkers [111] is a pseudo-steady metabolic model, that is to say, it assumes that there is no depletion or accumulation of intracellular metabolites. Dynamic flux balance analysis (DFBA) takes into consideration the fluctuations in metabolite concentrations and thus allows the study of the interaction between metabolism and environmental changes [114]. Wang C. and coworkers [113] developed a genome-scale DFBA (GS-DFBA) model for *S. tsukubaensis* NRRL 18488 which uncovered new targets for genetic manipulation (see Table 2) that resulted in increased tacrolimus production; i.e., inactivation of *gcdh* (glutaryl-CoA dehydrogenase) and overexpression of *tktB* (transketolase), *msdh* (methylmalonate semialdehyde dehydrogenase) and *ask* (aspartate kinase).
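The difference between the two modelling frameworks can be written compactly. What follows is the generic textbook formulation, not the specific models of [111] or [113]; all symbols are introduced here for illustration. In steady-state flux balance analysis, the flux vector $v$ is obtained from a linear program in which the stoichiometric matrix $S$ enforces the pseudo-steady-state assumption (no net accumulation or depletion of intracellular metabolites) and $c$ weights the objective, for example biomass or product formation:

$$
\max_{v}\; c^{\top} v \quad \text{subject to} \quad S\,v = 0, \qquad v^{\min} \le v \le v^{\max}
$$

In the dynamic variant, the same linear program is solved at successive time points, while the biomass $X$ and the extracellular concentrations $C_i$ evolve according to the growth rate $\mu$ and the exchange fluxes $v_i^{\mathrm{ex}}$, and the flux bounds are allowed to depend on the current concentrations:

$$
\frac{dX}{dt} = \mu\,X, \qquad \frac{dC_i}{dt} = v_i^{\mathrm{ex}}\,X, \qquad v^{\min}(C) \le v \le v^{\max}(C)
$$

This coupling is what lets a dynamic model follow changes in the medium, such as nutrient depletion, that a single steady-state solution cannot represent.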

The approach used by Xia and coworkers [99] consisted of growing *S. tsukubaensis* TJ-04 in two media of similar composition that result in different tacrolimus productivity. They analyzed the concentration of a wide range of metabolites and compared them between the two media to identify key metabolites that correlate positively with tacrolimus production. In good agreement with the results of Huang and coworkers [100,111], intermediates of the TCA cycle such as oxaloacetate, citrate, α-ketoglutarate and, especially, succinyl-CoA and acetyl-CoA, showed a positive correlation with tacrolimus production. In addition, the intracellular levels of pentose phosphate pathway intermediates were lower in the high production media, supporting the assumption that this pathway is positively correlated with tacrolimus production. Regarding metabolites from the tacrolimus biosynthetic pathway, methylmalonyl-CoA showed the best correlation.

More recently, Wang and coworkers [103] performed a comparative proteomic and metabolomic approach in *S. tsukubaensis* NRRL 18488 grown under soybean oil feeding. The positive effect of this carbon source on growth and on tacrolimus production has already been reported in other producing strains [97–101] and, as expected, soybean oil feeding increased tacrolimus production by 89%. This work has unraveled the effect of soybean oil on tacrolimus production, which mainly affects primary metabolism proteins (42%), redox proteins (12.5%), transcriptional regulators, signal transduction components and translation proteins (11%). The key metabolites associated with tacrolimus production correlate well with those identified previously by Xia and coworkers [99] and include malic acid, gluconic acid, citric acid, α-ketoglutarate, hexadecanoic acid, threonine, fumaric acid, succinic acid, proline, valine, oleic acid, trehalose, pyruvate, ornithine, 10-undecenoic acid, shikimic acid, mannose, and lactate. Several enzymes involved in the lower glycolytic pathway and the TCA cycle (e.g., triosephosphate isomerase, phosphoglycerate mutase, pyruvate kinase or citrate synthase) were overproduced under the soybean oil condition, and the rate-limiting enzyme of the pentose phosphate pathway, glucose-6-phosphate dehydrogenase, showed higher amounts in the fed condition, which supports the above-mentioned positive correlation of the pentose phosphate and TCA cycle pathways with the tacrolimus production process. Finally, enzymes related to fatty acid, shikimic acid, valine and isoleucine metabolism were also upregulated (valine and isoleucine can be transformed into the extender units methylmalonyl-CoA and propionyl-CoA).
Interestingly, higher amounts of the transcriptional regulators Crp and AfsQ1 were detected under the soybean oil feeding condition, pointing to their possible involvement in tacrolimus production regulation.

#### *5.2. Transcriptomic Studies on Phosphate Regulation of the fkb Cluster*

Understanding how a biosynthetic cluster is regulated is important to develop strategies to improve secondary metabolite production. Our group has studied the phosphate regulation of antibiotic production in different *Streptomyces* species in the last two decades, including *S. tsukubaensis* [94,115,116]. It is well known that high phosphate concentrations in the culture media downregulate antibiotic production [117]. This regulatory phenomenon is exerted, at least in part, through the two-component system PhoR-PhoP, which is formed by a sensor kinase and a response regulator, respectively [115,118]. When phosphate is depleted from the culture media, PhoR phosphorylates PhoP. The binding of phosphorylated PhoP (PhoP-P) to its target sequences (known as PHO boxes) can have a positive or negative transcriptional effect depending on the location of the PhoP-P binding site [118–120]. In *S. tsukubaensis*, the negative regulation of tacrolimus biosynthesis by phosphate was reported in 2013 [93] and later the PhoR-PhoP system was studied in detail [94]. In that work, transcriptomics was applied to identify genes that are transcriptionally activated after phosphate depletion. The study allowed the identification not only of common Pho members but also of potential new species-specific members, such as three overlapping genes encoding a two-component system and a small hydrophilic protein. In addition, a bioinformatic search for PHO boxes was developed [121]. Putative PHO boxes were identified in most of the genes responding to phosphate starvation, supporting the transcriptional results. A putative PHO box was identified in the promoter region of *fkbN* and also in primary metabolism genes that might be involved in tacrolimus precursor supply such as STSU\_30046, encoding an acetoacetate-CoA ligase [94].

#### 5.2.1. Transcriptomics of Carbon Catabolite Regulation of Tacrolimus Biosynthesis

A second regulatory mechanism governing secondary metabolite production is carbon repression. Similar to phosphate, the presence of ready-to-use carbon sources in the media reduces or blocks antibiotic production and this can happen at the transcriptional or at the posttranslational level [122,123]. The mechanisms involved in this nutritional regulation are not completely understood in streptomycetes. Unveiling them is of great interest because it would allow the use of easily assimilated carbon sources, which support faster growth in the culture broths, without hampering tacrolimus biosynthesis. Regarding this subject, our group observed that glucose and glycerol, when added as carbon sources at a concentration of 0.22 M at the first growth phase (and before phosphate depletion), arrest tacrolimus production in *S. tsukubaensis*, the glucose effect being stronger than that of glycerol [95]. Both glucose and glycerol additions resulted in a lack of transcriptional activation of the *fkb* cluster; thus, it was concluded that transcriptional repression plays a role in this regulatory mechanism. In addition, the effect of these carbon sources can be exerted at the intermediary metabolism level: glucose addition increased transcription of genes involved in glycolysis, pyruvate and oxaloacetate formation but downregulated genes involved in the TCA cycle. These results are consistent with the previous assumption that the TCA cycle is positively correlated with tacrolimus production whilst glycolytic metabolites show a negative correlation [99].

In the MGm-2.5 medium used in the work, transcription of *fkbN* increases in a two-step fashion before tacrolimus is detected in the broths [38]: a slight increase in mRNA levels occurs between 80 h and 89 h and then it is followed by a higher increase from 92 h to 100 h (Figure 4). The first step coincides with phosphate depletion, supporting the proposal that *fkbN* is under phosphate control [95] (Figure 4). Taking into account that *fkbN* transcription is not strongly self-regulated [38], it seems that a key transcriptional regulator, co-activator molecule or sigma factor might be absent in the presence of glucose or glycerol. Therefore, the identification of this additional factor would be useful to trigger tacrolimus production under carbon repressing conditions. Actually, key sigma factors (i.e., *hrdA* or *bldN*) and transcriptional regulators (i.e., *eshA*, *atrA*, *afsR*) were downregulated under glucose or glycerol addition conditions [95]. HrdA might control secondary metabolism genes [124], and EshA and AtrA are both involved in antibiotic production in *S. coelicolor* and *S. griseus* [125–128]; thus, it seems interesting to analyze the effect of their inactivation and overexpression on tacrolimus production. Finally, AfsR is a very interesting candidate for these studies since it is overexpressed in an *S. tsukubaensis* strain that overproduces tacrolimus [101].

#### 5.2.2. Transcriptomics of *N*-acetylglucosamine Addition in Tacrolimus Biosynthesis

A third example of the nutritional regulation of secondary metabolite production is that exerted by *N*-acetylglucosamine, the monomer of chitin. This compound shows a dual regulatory role, accelerating differentiation and antibiotic production under poor nutritional conditions and arresting them under rich nutritional conditions, which have been traditionally named "famine" and "feast" conditions, respectively [129,130]. We observed a negative effect of *N*-acetylglucosamine addition on tacrolimus production when *S. tsukubaensis* was grown in MGm-2.5 medium, which might be due, at least in part, to the transcriptional repression of *fkbN*, since we observed a significant decrease in its transcription soon after *N*-acetylglucosamine addition (Ordóñez-Robles et al., unpublished data). The transcriptional response to *N*-acetylglucosamine addition is very similar to that exerted by glucose, which is not surprising since both carbon sources share a common catabolic pathway from fructose-6-phosphate.

Overall, the application of transcriptomics to nutritional studies in *S. tsukubaensis* has unveiled potential candidates for the rational engineering of industrial strains. It has also improved our knowledge about other aspects of its physiology, such as the possible members of the PHO regulon in this species or the mechanisms operating in the presence of repressing carbon sources. These findings are valuable for detecting potential targets for bypassing the nutritional repression of secondary metabolism in *Streptomyces*.

#### **6. Conclusions and Future Prospective**

It has been more than 30 years since *S. tsukubaensis* and its secondary metabolite tacrolimus were discovered. Despite the importance of this immunosuppressant macrolide in current clinics, there are still many aspects to be elucidated about the transcriptional and nutritional regulation of tacrolimus biosynthesis, and further studies are necessary to improve the yield and reduce the costs of its industrial production. In this sense, the *omic* approaches constitute an important basis to understand the producer microorganism physiology from a genome- [131], proteome- and metabolome-wide point of view. Initial *omic* studies performed in *S. tsukubaensis* have given important clues such as the positive correlation of the pentose phosphate pathway and TCA cycle with tacrolimus production or the identification of targets for genetic manipulation. These types of studies can be applied not only to the overproduction of tacrolimus but also to the awakening of cryptic clusters [132]. In fact, similar to most streptomycetes, *S. tsukubaensis'* genome contains several clusters for the production of secondary metabolites which might encode useful compounds. One of the potential products encoded is predicted to be similar to bafilomycin [133] and two other clusters show homology to those for biosynthesis of nigericin and enduracidin [134,135]. Nevertheless, we must keep in mind the interpretation of the *omic* results in the framework of the strain and culture media used since there are important physiological differences depending on the strain and the culture conditions. Therefore, the comparison of different models can broaden our perspective of tacrolimus production and *S. tsukubaensis'* physiology.

There are still some interesting points to address in the study of the *fkb* cluster such as the role of the *allMNPOS* subcluster in the strains that contain it. Although not strictly required for tacrolimus production, the *all* subcluster might be involved in the generation of macrolide variants with useful properties. Thus, the overexpression of these genes under promoters regulated by FkbN seems an interesting study. In addition, the *ppt1* and *scoT* genes, which are affected by the inactivation of *fkbN*, might be potential targets for tacrolimus biosynthesis improvement. Considering the transcriptional regulation of *eshA* and *atrA* under tacrolimus producing and repressing conditions, both genes seem good candidates for genetic engineering of the strains.

The transcriptional regulation of *fkbN* is also interesting given that it is the main transcriptional activator of the *fkb* cluster. The identification of transcriptional regulators that bind to its promoter region is a good approach to identify new targets for genetic engineering of the strains that overexpress *fkbN* and therefore, to increase tacrolimus production. Finally, the post-transcriptional regulation of the *fkb* cluster deserves further attention. As reported by Bauer and coworkers [39], a high percentage of genes are transcribed with long leader sequences in *S. tsukubaensis* (i.e., *allAKRD*). Long 5'-UTRs might be involved in the formation of secondary structures that regulate transcription of the cistrons and might be potential targets for manipulation.

**Author Contributions:** Juan F. Martín wrote the sections on biosynthesis of tacrolimus and regulatory genes, corrected the text and supervised the final version. María Ordóñez-Robles wrote the other sections and Fernando Santos-Beneit corrected and improved the text.

**Acknowledgments:** We acknowledge Paloma Liras for helpful scientific discussion.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **The Cellular Mechanisms that Ensure an Efficient Secretion in** *Streptomyces*

#### **Sonia Gullón and Rafael P. Mellado \***

Departamento de Biotecnología Microbiana, Centro Nacional de Biotecnología (CNB-CSIC), c/Darwin 3, 28049 Madrid, Spain; sgullon@cnb.csic.es

**\*** Correspondence: rpmellado@cnb.csic.es; Tel.: +34-915-854-547

Received: 27 February 2018; Accepted: 11 April 2018; Published: 14 April 2018

**Abstract:** Gram-positive soil bacteria of the genus *Streptomyces* produce a large variety of secondary metabolites in addition to extracellular hydrolytic enzymes. From the industrial and commercial viewpoints, *S. lividans* has attracted particular interest as a host bacterium for the overproduction of homologous and heterologous hydrolytic enzymes, which has considerably increased scientific interest in the characterization of secretion routes in this bacterium. This review focuses on the secretion machinery in *S. lividans*.

**Keywords:** *Streptomyces lividans*; secretion pathways; secretory proteins; signal peptides

#### **1. Introduction**

In their natural soil environment, the streptomycetes bacteria are characterized by the formation of an aerial mycelium during their life cycle before the sporulation. Along their life cycle, the streptomycetes produce and secrete a variety of hydrolytic enzymes [1,2] and enzyme inhibitors as well as signaling molecules and antibiotics in order to ensure the continuation of their existence in competitive habitats.

Their capacity to produce extracellular homologous and heterologous proteins of industrial application has remarkably increased the interest and targeted research for increasing the knowledge about the functioning of the streptomycetes secretion machinery. Therefore, this information could be used to engineer efficient streptomycetes strains for the overproduction of commercially valuable proteins needed for biotechnological application.

*Streptomyces lividans* possesses a relaxed restriction-modification system that facilitates its transformation with exogenous DNA without degrading it. Moreover, it is a suitable host for extracellular protein overproduction because of its ability to secrete a variety of hydrolytic enzymes. Additionally, its genome sequence is known [3,4]. The use of *S. lividans* as a host to engineer the production of secretory proteins is well documented [5–7].

There are two pathways commonly used to secrete extracellular proteins in *S. lividans.* The Sec route is the major system used to release extracellular proteins through the cellular membrane [8]. A secondary system, also found in *S. lividans*, is the Tat route. The Tat-secreted proteins have the property of appearing fully folded in the culture supernatant [9]. Different types of proteins with predicted Sec or Tat signal peptides, together with the antibiotic undecylprodigiosin and other proteins lacking signal peptides, have been detected in vesicles present inside droplets found in *Streptomyces* grown on top agar plates. These vesicles have harmful effects on fungi and offer a new way to deliver proteins with potentially different applications, which deserves to be explored further [10]. The function of the secretion system known as Esx or Type VII described for the secretion of small size proteins in Gram-positive bacteria [11] is not found in streptomycetes [12].

The characterization of the *S. lividans* cellular response when challenged to overproduce vast amounts of particular secretory model proteins has also been researched in the recent past. This research has provided in-depth information and increased our knowledge of the bacterial factors that may affect the efficiency of secretory protein production by causing different types of stress in the bacterial cell.

#### **2. The Sec Pathway**

A characteristic signal peptide at the amino end of pre-secretory proteins is the signal recognized by intracellular chaperones to target these proteins to the membrane. A typical signal peptide consists of three regions: a short stretch of positively charged amino acids close to the N terminus (n-region), a much longer stretch of hydrophobic amino acids (h-region), and the region at the end of the signal peptide, which contains the stretch of amino acids that the signal peptidase will cleave (c-region). The *Escherichia coli* SecB chaperone prevents folding of the pre-secretory protein and, in cooperation with SecA, targets it to the translocase complex in the membrane [13]. This post-translational secretory protein transport mechanism is widely used for bacterial protein secretion via the Sec pathway. Although genes homologous to SecB are found in Gram-positive bacteria, they are not present in the *Bacillus subtilis* genome [14] or in the genomes of *S. coelicolor* [15] or *S. lividans* 66 [3]. This fact has triggered more research into the alternative means that these Gram-positive bacteria use to transport secretory proteins to the membrane.

The signal recognition particle (SRP) seems to be involved in a co-translational protein targeting mechanism conserved from bacteria to mammals [16–18]. In mammals, SRP interacts with the ribosome nascent chain complex (RNC) to translocate proteins across the endoplasmic reticulum membrane. The entire complex links with an SRP-receptor complex to be attached to the membrane [19].

The *E. coli* SRP system comprises a 4.5S RNA (scRNA), the fifty-four homolog protein (Ffh), which is homologous to the mammalian SRP54 protein, and the FtsY protein, which has a C-terminal domain 300 residues long that is similar to the alpha subunit of the mammalian SRP receptor [20]. The interaction with the nascent pre-protein chain takes place via the M domain located at the Ffh carboxyl end. The minimal conserved functional SRP consists of Ffh and the scRNA [21]. The signal for targeting FtsY to the membrane is contained in the A domain located at the FtsY amino end [22–24]. SRP is responsible for targeting *E. coli* integral membrane proteins [25,26], and the export of *E. coli* SecB-independent secretory proteins takes place without the postulated intervention of SRP [27]. The *E. coli* SRP-based and SecB-based targeting systems are substrate specific [28] and all the *E. coli* SRP components are essential for cell growth.

The *B. subtilis* SRP contains the scRNA, Ffh, and the histone-like protein HbsU [29]. The role of HbsU is still unknown. Although some studies indicated that the SRP could target *B. subtilis* secretory proteins via the Sec pathway [30], other authors have suggested that SecA could have a dual function in which it targets the newly made secretory protein to the membrane and pushes it through the translocase [30,31]. Therefore, there is a lack of fully convincing evidence showing whether, in the absence of SecB, the *B. subtilis* SRP could target both membrane and secretory proteins. The *S. lividans* SRP system consists of Ffh and an 82 nt long small size RNA (scRNA) [32]. The receptor protein FtsY forms part of the system and the three components are apparently expressed throughout cellular growth. No viable mutants in any of these three components have been obtained, and a possible essential role has been assigned to each of them. Experimental evidence obtained by co-immunoprecipitation studies has shown that *S. lividans* SRP is involved in targeting model secretory proteins [32,33]. The level of the SRP components does not seem to be a limiting factor for the overproduction of secretory proteins in *S. lividans*, according to the results obtained with overproduced model proteins [34]. The *S. lividans* FtsY hydrophobic N-terminal segment identifies the protein as a homolog of other integral FtsY-like membrane proteins from other Actinobacteria with similar hydrophobic profiles [35]. An *S. lividans* strain defective in the major type I signal peptidase (SipY; [34]) has provided a useful tool for temporarily blocking translocation, favoring the accumulation of pre-protein linked to SRP at the membrane, and has thus helped to detect the ability of SRP to escort the pre-protein to the translocase complex [33]. This SipY deficiency also favored detecting an in vivo interaction of the SRP with a soluble form of FtsY, which acts as a functional cytoplasmic SRP receptor [36].

In *E. coli*, SecY, SecE, and SecG form a stable heterotrimeric complex that gives rise to the channel that conducts the protein to be translocated [8], and the translocation of the *Streptomyces* secretory proteins is thought to take place as in *E. coli*. Streptomycetes appear to have only one gene for each of these Sec proteins in their genomes, including SecA [15]. In vitro experiments showed that *S. lividans* SecA was required for secretion of the Sec-dependent alpha-amylase [37], which suggests that it functions comparably to its *E. coli* equivalent. The SecG proteins encoded by *S. coelicolor, S. lividans* [38], and *B. subtilis* [39] are functionally homologous to that of *E. coli*. Deletion of the *E. coli* or *B. subtilis* SecG causes a cold-sensitive phenotype, while this seems not to occur in *S. lividans* [39,40]. Protein secretion is impaired in the SecG-deficient *S. lividans* strain, which delays the extracellular appearance of model secretory enzymes [38,41]. In *E. coli*, the heterotrimeric membrane complex formed by SecD, SecF, and YajC associates with the SecYEG channel, although YajC does not seem to be required for the complex to function [8]. The *S. coelicolor* genome contains two sets of SecD and SecF homologs required for an efficient secretion of some proteins [42]. Genes homologous to *yajC* have been identified in eleven *Streptomyces* genomes [43], even though a homolog does not seem to be present in the *S. lividans* 66 genome [3]. The functionality of these genes has not been determined experimentally. The YidC protein is essential in *E. coli* since it helps insert membrane proteins in a Sec-dependent [44] or a Sec-independent manner [45,46]. Two genes encoding potentially equivalent YidC proteins seem to form part of the *S. lividans* genome [3], and the YidC protein acts in *Streptomyces* in a Sec-dependent manner [47]. Figure 1 shows a scheme of how the Sec pathway works.

**Figure 1.** *Streptomyces lividans* major secretory route (Sec pathway). The secretory protein precursor is targeted to the membrane translocation complex by SRP in a GTP-dependent process that requires the help of the FtsY receptor. When the signal peptidases (exemplified by the major signal peptidase, SipY) act, the SRP components are released to be used again. The extracellular presence of incorrectly folded mature proteins triggers the expression of the CssRS two-component system, which, in turn, induces the synthesis of the three HtrA-like proteins that work cooperatively to degrade the incorrectly folded extracellular proteins. The acquisition of an active, correctly folded structure may need the action of enzymes involved in the formation of disulfide bonds (Dsbs) and/or peptidyl-prolyl cis-trans isomerases (FKBP-like).

#### **3. The Twin-Arginine (Tat) Pathway**

The Tat route is an export pathway found in the thylakoid membranes of plant chloroplasts and described in several bacteria [48]. Proteins using this route appear to be secreted across the cytoplasmic membrane in a folded state [48]. This feature makes the Tat pathway particularly interesting for the overproduction of secretory proteins in streptomycetes. The Tat pathway seems to consist of four proteins in *E. coli*: TatA, TatB, TatC, and TatE. The TatB-TatC complex binds to the signal peptide in an energy-independent manner to target the pre-protein to the membrane, where TatA polymerizes to form the channel for the extracellular protein to exit the cell. TatE has been shown in vivo and in vitro to interact with the translocase complex and is considered to be a regular element of the *E. coli* translocase [49]. In *B. subtilis* the TatB component seems to be absent and the Tat machinery is formed by three TatA subunits (TatAc, TatAd and TatAy) and two TatC subunits (TatCd and TatCy). In addition, two independently acting complexes with different substrate specificity are formed (TatAd-TatCd and TatAy-TatCy), while the function of TatAc remains unclear [50–53].

*Streptomyces lividans* contains three Tat components (TatA, TatB, and TatC) with two different *tatA* genes [54]. Recognition of the Tat pre-secretory proteins in *S. lividans* is carried out by the heterodimeric complex formed by TatA and TatB, which is thought to target the pre-proteins to the membrane [55]. This heterodimeric complex interacts with the membrane-embedded TatC for insertion of the pre-proteins. TatA then oligomerises to form the pore used by the folded secretory protein to leave the cell [55]. The signal peptides of eubacterial Tat proteins have characteristic features that differentiate them from those of Sec proteins. The n-region of the Tat signal peptide contains a conserved motif R-R-x-U-U, where R-R is the twin-arginine motif and x and U are usually polar and hydrophobic amino acids, respectively [56–58]. The h-region is less hydrophobic and the c-region may contain basic residues [59]. The algorithms designed to predict this type of signal peptide identified a large number of potential Tat proteins (145–189) in *S. coelicolor* [60,61]. However, when membrane proteins from TatC deficient strains were compared to those of the isogenic wild type strain, only 27 proteins were unequivocally defined as Tat-dependent [58]. The algorithms predicted up to 127 Tat proteins in *S. lividans* [62], but the final number has not yet been experimentally determined, even though the DypA peroxidase has been found to be Tat-dependent in this bacterium [63]. The annotated sequence of the *S. coelicolor* genome predicts more than 800 secretory proteins [15]. In comparison, the 27 assigned Tat proteins are a small number, which reflects the minor role played by the Tat route in secretion. However, these 27 Tat proteins still represent a significant number when compared to those found in other bacteria. This reinforces the interest in exploring the Tat route for the secretion of correctly folded proteins in streptomycetes. The few predicted *E. coli* Tat proteins seem to bind redox cofactors, which are important for respiratory metabolism [64], while only three of the 27 experimentally confirmed *S. coelicolor* Tat-dependent proteins may bind cofactors. This suggests that the streptomycetes Tat pathway can export a broader spectrum of proteins than that of other bacteria [58]. Figure 2 illustrates the *S. lividans* Tat secretion pathway.
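The R-R-x-U-U motif described above lends itself to a simple pattern search. The following is a minimal sketch only, not the prediction algorithms cited in the text: the residue classes chosen for "polar" and "hydrophobic", the N-terminal window length, the function name, and the example sequences are all assumptions made for illustration.

```python
import re

# Assumption: illustrative residue classes, not those of the cited predictors.
POLAR = "STNQDEHKR"
HYDROPHOBIC = "AVLIMFW"

# R-R-x-U-U: twin arginine, one polar residue, two hydrophobic residues.
TAT_MOTIF = re.compile(rf"RR[{POLAR}][{HYDROPHOBIC}]{{2}}")


def find_twin_arginine_motif(protein_seq, window=40):
    """Return (start, matched_text) for the first R-R-x-U-U hit found in the
    first `window` residues (a hypothetical cutoff), or None if absent."""
    n_terminal = protein_seq[:window].upper()
    match = TAT_MOTIF.search(n_terminal)
    return (match.start(), match.group()) if match else None


# Hypothetical sequences, for illustration only.
print(find_twin_arginine_motif("MSDNRRSFLKGAAALGA"))
print(find_twin_arginine_motif("MKKLLAVAAGS"))
```

A match of this kind only flags a candidate. As the text notes, far fewer proteins were confirmed experimentally as Tat-dependent than were predicted from sequence alone.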

**Figure 2.** *Streptomyces lividans* minor secretory route (Tat pathway). The nascent secretory pre-protein using the Tat route is thought to be targeted to the membrane by the heterodimeric TatA-TatB complex. The signal peptide is cleaved by the signal peptidases (exemplified by the major signal peptidase, SipY) before releasing the mature protein to the cell wall.

#### **4. Stresses Caused by Overproduction of Secretory Proteins in Streptomycetes**

The engineering of streptomycetes strains to overproduce homologous or heterologous secretory proteins requires more in-depth knowledge of the potential detrimental effects that this overproduction may cause to the cell. Therefore, it is essential to acquire the best possible understanding of the physiological limiting steps that could be encountered when engineering bacterial secretory protein factories.

#### *4.1. Temporal Translocase Blockage*

Mutations in the *S. lividans* translocase component SecG or in the major signal peptidase SipY lead to a temporal translocation blockage in which the extracellular presence of several secretory factors is considerably reduced in both mutants, as determined by proteomics analyses and consistent with the transcriptional profiles in each case [65]. The accumulation of unprocessed pre-proteins in the translocase complex generates a membrane stress in the SipY and SecG deficient mutants. Transcriptional analyses, in agreement with the proteomics results, showed that a larger number of genes encoding secretory proteins were down-regulated in the SipY deficient strain than in the SecG one. This suggests a more dramatic effect of the SipY mutation than that of the SecG one. The formation of the aerial mycelium involves the action of extracellular proteins [12,66]. Temporal blockage of pre-protein processing would likely severely limit efficient protein secretion, which may lead to defective aerial mycelium formation and deficient sporulation (the so-called bald phenotype). Comparison of the transcriptomic profiles of the wild type and mutant strains identified a different set of bald-related genes down-regulated in each mutant strain, and morphological analysis of the mycelial filaments under the scanning electron microscope showed that the effect was more dramatic in the SipY mutant than in the SecG one. The differences in the expression of these bald-related genes, together with the differences between the extracellular proteins of the mutant strains and those of the wild type revealed by tandem mass spectrometry analysis, constitute a characteristic profile of the "translocation stress" caused in the bacterial cell when the natural translocation process is impaired [65].

#### *4.2. Induction of the Stringent Response*

Bacteria frequently use a variety of sensor kinase genes contained in their genomes to respond to changes in the environment. The two-component environmental sensor systems are well represented in the streptomycetes genomes. Among the sixty-seven *S. coelicolor* two-component systems, a two-gene operon has been identified [67] that shares homology with the *B. subtilis degS-degU* operon, which regulates synthesis of some secretory proteins and the level of antibiotic production. This temporarily modulates the secondary metabolism of the bacterium [68,69]. Transcriptomic and tandem mass spectrometry analyses of the strain defective in the two-component operon, as well as of the strain harboring the regulator gene in high copy number, allowed us to conclude that, when present in a high copy number, the regulator gene enhances antibiotic production and sporulation at early stages of growth and increases the expression of genes encoding secretory proteins to such an extent that the cellular amino acid pool is depleted. This precursor deficiency triggers a stringent response in the bacterium, which mainly down-regulates ribosomal genes [67]. Propagation of the regulator gene in high copy number could be useful for the overproduction of homologous or heterologous extracellular proteins in *Streptomyces*, but, at the same time, it clearly points out the danger of provoking amino acid depletion, which may affect cell viability.

Lipoproteins are membrane-translocated proteins that remain associated with the cytoplasmic membrane. Bacterial lipoproteins are known to be involved in a number of essential cellular processes since most of the solute binding proteins are lipoproteins. They also take part in cell envelope biogenesis and in signal transduction pathways. In Gram-positive bacteria, a fair number of peptidyl-prolyl isomerases are lipoproteins responsible for the correct folding of extracellular proteins [70]. The pre-lipoproteins have a type II signal peptide that is specifically cleaved by a type II signal peptidase (Lsp), and an essential sequence motif must be present for the protein to be processed correctly. After translocation has taken place, the lipoprotein diacylglycerol transferase (Lgt) links a lipid molecule to the conserved cysteine residue at which Lsp will cleave the signal peptide. Proteins homologous to Lgt have been found in the *S. coelicolor* and *S. lividans* genomes [3,71]. Proteins homologous to the N-acyl transferase Lnt, which adds more lipids to the conserved cysteine residue in Gram-negative bacteria, have also been described in streptomycetes strains. However, their deletion seems not to cause any effect in *S. scabies* [72]. Most bacterial lipoproteins are Sec-dependent, but some use the Tat system instead [71].

The transcriptomic study of an Lsp-deficient *S. lividans* strain revealed that the accumulation of unprocessed lipoproteins blocks the translocase. The consequent negative effect on the appearance of extracellular proteins delays sporulation and triggers the bacterial translocation stress response [73]. This phenotype is also observed when the cells are defective in the major type I signal peptidase, SipY, or the translocase SecG protein. Additionally, Lsp deficiency caused depletion of solute binding lipoproteins, which caused the depletion of nutrients inside the cell and triggered a stringent response [73]. This mimicked the effect caused by overproduction of secretory proteins when the regulator gene of the two-component operon described above was propagated in a high copy number [67]. These results are in accord with the key roles played by bacterial lipoproteins, which will have to be taken into account in the optimization of *Streptomyces* strains for the overproduction of secretory proteins. A Venn diagram [74] illustrates the relative degree of coincidence among the genes transcriptionally affected in the SipY, SecG, or Lsp deficient strains (see Figure 3). Thus, 52.45% of the total genes regulated in the SecG deficient cells coincide with 42.10% of the total genes regulated in the SipY deficient cells. In Lsp deficient cells, 22.0% and 14.0% of the regulated genes coincide with 14.47% and 11.47% of the genes regulated by the SipY and SecG deficiencies, respectively. Most of the affected genes were involved in nitrogen/amino acid metabolism or morphological differentiation, or encoded possible secreted proteins, and were downregulated in most cases (Table S1). Therefore, a selected subset of these genes could eventually be used to monitor efficient secretory protein production in streptomycetes.
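The paired percentages quoted above follow from expressing one shared gene set relative to two totals of different size. A minimal sketch of that calculation is given below. The function name and the gene identifiers are hypothetical and chosen for illustration; they are not the genes listed in Table S1.

```python
def overlap_percentages(genes_a, genes_b):
    """Return the shared genes as a percentage of each set: (pct_of_a, pct_of_b)."""
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        raise ValueError("both gene sets must be non-empty")
    shared = a & b
    return 100 * len(shared) / len(a), 100 * len(shared) / len(b)


# Hypothetical identifiers, for illustration only.
secg_affected = {"g1", "g2", "g3", "g4"}
sipy_affected = {"g2", "g3", "g4", "g5", "g6"}

pct_of_secg, pct_of_sipy = overlap_percentages(secg_affected, sipy_affected)
print(pct_of_secg, pct_of_sipy)  # 75.0 60.0
```

The same three shared genes account for a larger fraction of the smaller set. By the same reasoning, the figures in the text (52.45% of the SecG set equal to 42.10% of the SipY set) imply that the SipY deficiency affected more genes overall than the SecG deficiency.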

**Figure 3.** Venn diagram summarizing the relative degree of coincidence among the genes transcriptionally affected in the *Streptomyces lividans* SipY, SecG, or Lsp deficient strains. For comparative purposes, the set of genes involved in the stringent response in the Lsp deficient strains have been subtracted.

#### **5. Overproduction of Model Secretory Proteins in** *S. lividans*

Transcriptomic analyses of *S. lividans* strains overproducing secretory proteins via the Sec route (alpha-amylase) or the Tat route (agarase) revealed different cellular responses [75]. Overproduction of the Tat model protein elicited the characteristic down-regulation of genes involved in the stringent response, while the same set of genes was upregulated when the Sec model protein was overproduced. The stringent response may cause cellular death and its potential induction has to be taken into account when engineering extracellular protein production in streptomycetes. Therefore, despite the attractiveness of using the Tat secretory route to obtain a potentially correctly folded extracellular protein, the Sec pathway could be the preferred route for secretory protein production in *Streptomyces* strains.

Additionally, some proteins appear to be able to use both routes. When the Tat model protein agarase was overproduced in *S. lividans*, a small amount of it was apparently secreted at the early phase of growth, at a time when the Tat pathway is not functional, as determined by co-immunoprecipitation studies using antibodies against agarase and the SRP components. This suggests that the protein could eventually be secreted via the Sec route [32]. A similar situation has been described for *E. coli*, in which Tat signal peptides could target proteins to the Sec pathway [76] and the first two transmembrane domains of a Tat-membrane protein could be recognized by the Sec pathway. Sec and Tat routes seem to cooperate for the integration of a particular membrane protein [47]. Some membrane proteins with extracytoplasmic domains apparently need the cooperation of the Tat and Sec pathways [77].

The construction of chimeric precursor proteins in which the signal peptides of the Sec-dependent alpha-amylase and the Tat-dependent agarase were interchanged showed that agarase was able to use the Sec route when led by the alpha-amylase signal peptide. However, the alpha-amylase did not succeed in using the Tat route when led by the agarase signal peptide [41]. The inability of alpha-amylase to use the Tat signal peptide suggests that parts of the protein other than the signal peptide contribute to secretion in streptomycetes, as described in *E. coli* [78]. The structure of the proteins may have evolved to select a particular secretory route. Most of the *S. coelicolor* proteins unequivocally identified as Tat-dependent [58] do not have predictable disulphide bonds, while the major Sec proteins have at least one disulphide bond [65]. Structurally less complex secretory proteins may be engineered to use either secretory route, while structurally complex proteins would have to be processed via the Sec pathway. Therefore, the Sec route may be the one to choose to overproduce extracellular proteins in streptomycetes.

#### **6. Extracellular Stability of Secretory Proteins**

Overproduction of secretory proteins in streptomycetes, as in other bacteria, could likely result in the extracellular accumulation of misfolded secreted proteins. In practical terms, they are inactive contaminants of the correctly folded, functionally active secreted proteins present in the total population of the overproduced ones. Therefore, previous identification and characterization of these quality factors are required in order to optimize extracellular protein production via the Sec route.

The *S. lividans* CssRS two-component system regulates the degradation of incorrectly-folded proteins that accumulate outside the cell, a process known as the secretion stress response. This two-component system is known to induce the synthesis of specific proteases to degrade the incorrectly-folded proteins and it has been described in *B. subtilis* [79] and *S. lividans* [80].

Overproduction of the *B. amyloliquefaciens* alpha-amylase in *B. subtilis* results in phosphorylation of the CssR regulator, which activates the expression of the genes *htrA* and *htrB* encoding two HtrA-like proteases [81,82]. An equivalent CssRS two-component system has been described in *S. lividans* [81]. When activated by the presence of incorrectly-folded alpha-amylase outside the bacterial cell, the regulator induced the expression of three different HtrA-like proteases encoding genes, *htrA1, htrA2*, and *htrB* ([83]; Figure 1). The CssR operon is a main element of the bacterial secretion stress response [80,84].

Genes encoding HtrA-like proteins are present in bacterial genomes and a significant number of bacteria have more than one HtrA-like protease [85]. HtrA proteins normally assemble into complex oligomers mainly via their PDZ domains in the carboxyl terminal part of the protein involved in substrate recognition, regulation of protease activity, and protein-protein interaction [86].

The *S. lividans* HtrA-like proteases seem to act in a cooperative manner since the alpha-amylase activity detected in each of the individual deficient strains was severely reduced when compared to the wild type one. A functional model for the combined action of the *S. lividans*' three proteases has been proposed ([83]; Figure 1).

Different modes of action have been described for HtrA-like proteins in different bacteria and it is clear that, whichever way they work, the final outcome is the degradation of the incorrectly-folded proteins present outside the cell that result from Sec-dependent secretion. Theoretically, secretory proteins using the Tat pathway should not need to activate the synthesis of HtrA-like proteins since these are secreted in an active conformation (see Figure 2).

Degradation of the incorrectly-folded secretory proteins accumulated outside the cell becomes necessary to avoid their possible interference with essential bacterial cell processes and it is an effective way to correct that potentially harmful situation. An additional efficient way to deal with this accumulation is to provide other means of assisting the correct folding of Sec-dependent proteins, such as the action of the peptidyl-prolyl isomerases (FKBP in Figure 1) and/or thiol-disulfide oxidoreductases (Dsbs in Figure 1) equivalent to those described in many bacteria. A lipoprotein with peptidyl-prolyl isomerase activity likely involved in the folding of secretory proteins has been identified in *S. coelicolor* [71]. The mode of action of all these streptomycetes folding proteins still remains to be fully elucidated.

Exploring the *Streptomyces* response to the different stresses that could potentially take place when a protein is overproduced has helped identify potential drawbacks in secretory protein production, such as those described above, and provides a means to avoid them. Despite the potential existence of these stress responses, the amount of accumulated knowledge on how streptomycetes protein secretion takes place and the potential use of the genes listed in Table S1 to monitor the scale-up of secretory production point to *S. lividans* as a suitable candidate for engineering the overproduction of extracellular proteins at the industrial level.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-6382/7/2/33/s1, Table S1: Genes commonly modulated by the translocase blockage.

**Acknowledgments:** This work was funded by the Spanish Ministry of Economy, Industry, and Competitivity/ European Regional Development Fund grant BIO2015-71504-R (AEI/FEDER, UE).

**Author Contributions:** Sonia Gullón and Rafael P. Mellado conceived and wrote the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **Complex Regulatory Networks Governing Production of the Glycopeptide A40926**

**Rosa Alduina 1,\*, Margherita Sosio 2,3 and Stefano Donadio 2,3**


Received: 9 March 2018; Accepted: 3 April 2018; Published: 5 April 2018

**Abstract:** Glycopeptides (GPAs) are an important class of antibiotics, with vancomycin and teicoplanin being used in the last 40 years as drugs of last resort to treat infections caused by Gram-positive pathogens, including methicillin-resistant *Staphylococcus aureus*. A few new GPAs have since reached the market. One of them is dalbavancin, a derivative of A40926 produced by the actinomycete *Nonomuraea* sp. ATCC 39727, recently classified as *N. gerenzanensis*. This review summarizes what we currently know on the multilevel regulatory processes governing production of the glycopeptide A40926 and the different approaches used to increase antibiotic yields. Some nutrients, e.g., valine, L-glutamine and maltodextrin, and some endogenous proteins, e.g., Dbv3, Dbv4 and RpoBR, have a positive role on A40926 biosynthesis, while other factors, e.g., phosphate, ammonium and Dbv23, have a negative effect. Overall, the results available so far point to a complex regulatory network controlling A40926 in the native producing strain.

**Keywords:** glycopeptide antibiotics; *dbv* cluster; regulatory genes; StrR; LAL; LuxR solo; dalbavancin; A40926

#### **1. The Glycopeptides**

The glycopeptides are a class of antibiotics with a complex chemical structure and relatively high molecular weight. Since 1953, about 50 glycopeptide antibiotics (GPAs) have been isolated [1], and several of these have been approved for clinical use. These include vancomycin, produced by *Amycolatopsis orientalis* and marketed in 1958, and teicoplanin, produced by *Actinoplanes teichomyceticus* and marketed in 1987. The second-generation glycopeptides telavancin, derived from vancomycin, dalbavancin, derived from A40926, and oritavancin, derived from chloroeremomycin, were introduced onto the market in 2009, 2014 and 2015, respectively. All glycopeptides are used to treat persistent infections by Gram-positive multi-resistant pathogens [2]. The second-generation glycopeptides are nearly 4- to 8-fold more effective than vancomycin against Gram-positive pathogens, and are also active against vancomycin-intermediate or vancomycin-resistant strains of *Staphylococcus* and *Enterococcus* spp. [3]. While dalbavancin impedes the late steps of cell wall biosynthesis principally by blocking transglycosylase activity, oritavancin and telavancin bind to the bacterial membrane by the lipophilic side chain linked to their disaccharide moiety, disturbing membrane integrity and leading to bacteriolysis [3].

Chemically, glycopeptides are a class of molecules constituted by a heptapeptide core consisting of both proteinogenic and non-proteinogenic amino acids, such as 3,5-dihydroxyphenylglycine (Dpg) and 4-hydroxyphenylglycine (Hpg). A heptapeptide is produced by a non-ribosomal peptide synthetase (NRPS) and, while tethered to the large multi-functional enzyme, the peptide scaffold is made rigid through oxidative cross-linking of the electron-rich aromatic side chains by P450s and chlorinated [4–6]. Further tailoring steps may include one or more glycosylations, methylation, sulfation and modification of the added sugar(s) by acylation and acetylation.

Glycopeptide producers are widespread among distantly related genera of actinomycetes [1]: vancomycin, balhimycin and ristocetin were isolated from distinct species of the genus *Amycolatopsis*, belonging to the family *Pseudonocardiaceae*; teicoplanin and UK-68597 are produced by members of the genus *Actinoplanes*, family *Micromonosporaceae*; A40926 is from the genus *Nonomuraea*, family *Streptosporangiaceae*; and pekiskomycin and A47934 are from the genus *Streptomyces*, family *Streptomycetaceae* [7]. Thus, production of GPAs is widespread among actinomycetes, as shown by the relatively high frequency at which glycopeptide producers can be detected in environmental samples after applying appropriate selection procedures [8].

The medical interest and importance of these molecules has prompted the analysis of the genes required for their synthesis. Different glycopeptide biosynthetic gene clusters have been reported [9]; combining the information obtained from these clusters, a function has been assigned to most genes involved in glycopeptide formation by in vivo gene disruption in the producing strain(s) and by biochemical studies of the overproduced enzymes. While these results have analyzed different pathways, the emerging overall picture has contributed to deciphering most of the biosynthetic steps and the timing of the events in the biosynthesis of all GPAs [4–6,8,10].

#### **2. Development of Dalbavancin**

Dalbavancin is a second-generation glycopeptide derived from A40926 with improved antibacterial activity over teicoplanin, the most closely related marketed GPA. The enhanced pharmacodynamic properties of the molecule and its lipophilic anchoring to the bacterial cell membrane confer more potent in vitro and in vivo activity than teicoplanin. The most distinctive feature of dalbavancin is a significantly extended half-life in plasma, which allows once-a-week dosing by intravenous injection. The drug has been approved for treating complicated acute bacterial skin and skin structure infections. Its synthesis involves the deacetylation of the final biosynthetic intermediate A40926 (a process achieved during recovery from the fermentation broth), protection of the carboxyl group present in the aminosugar, conversion of the C-terminal carboxyl group into a (3-dimethylamino)-1-propylamide, and final deprotection of the aminosugar carboxyl group [11]. The main components of the A40926 complex differ mainly in the acyl chain attached to the sugar, with B0 and B1 as the major representatives, characterized respectively by an iso-C12:0 and an n-C12:0 acyl moiety bound to the aminoglucuronic acid moiety [12]. The structures of A40926 and of dalbavancin are shown in Figure 1.

**Figure 1.** Chemical structures of *O*-acetyl A40926 and of dalbavancin. Only the component B0 is shown for simplicity. The chemical modification present in dalbavancin is indicated in red type.

Dalbavancin obtained market authorization in 2014 in the USA and the following year in Europe. This was a noteworthy success in view of the intricate history related to its development, which started back in the early 1990s and involved at least six different legal entities, as recently summarized [13].

In this review, we have organized the text into three separate sections: the first concerns the improvement of antibiotic yield by modifying the media components; the second describes the biosynthetic gene cluster and its transcriptional organization (Figure 2), along with the biosynthetic steps (Figure 3); and the last section deals with the cluster specific regulatory genes.

**Figure 2.** Genetic organization of the *dbv* cluster. The thin black arrows indicate experimentally determined operons. Red triangles indicate experimentally determined Dbv4 binding sites, with the corresponding transcripts as red thick arrows; the thin green arrows represent the transcriptional units controlled by Dbv3. The *dbv* genes are grouped by functional category as indicated. See also Table 1.

**Table 1.** Transcriptional units and biosynthetic roles of the corresponding proteins.


**Figure 3.** Simplified model of *O*-acetyl A40926 biosynthesis. Note that the heptapeptide is drawn right (N-terminus) to left (C-terminus), consistent with Figure 1. Cross-links are indicated by blue (C–O–C) or red (C–C) arcs. Sugars are represented as blue hexagons. Refer to Figure 2 and Table 1 for details.

#### **3. Improvement of A40926 Production**

Improvement of glycopeptide production has very likely been achieved through several rounds of mutagenesis and screening, leading to the current industrial strains producing vancomycin, teicoplanin, chloroeremomycin and A40926. However, most of this work has not surfaced in the scientific literature, and we will limit ourselves to published reports on the A40926 process.

Initial work established the influence of growth conditions on A40926 production by *Nonomuraea* sp. ATCC 39727, recently classified as *N. gerenzanensis* [14]. In a chemically defined medium, low initial concentrations of phosphate and ammonium led to increased A40926 production, while glucose limitation did not (Figure 4). In particular, the level of residual ammonium and phosphate strongly influenced A40926 production rates and final titers, but not the initiation of production [15]. In a similar medium, A40926 production was repressed by calcium, but supported when L-glutamine or L-asparagine were added as nitrogen sources instead of ammonium salts (Figure 4) [16]. Since the catabolic products of branched chain amino acids represent biosynthetic precursors for the formation of the branched chain acyl moieties of A40926 [17], studies were undertaken on the influence of valine supplementation. Addition of 1 to 3 g/L of L-valine to complex media improved both the relative and absolute production of the B0 congener, with a decrease of the B1 component in the A40926 complex [18]. A40926 yields were also found to be controlled by the stringent response in both complex and chemically defined media (Figure 4) [19].

It has also been recently reported that a *Nonomuraea* strain producing high levels of A40926 in an optimized production medium was isolated after UV mutagenesis. This mutant strain was used to study the effect of carbon and nitrogen sources and of different ions on antibiotic productivity; addition of the scarcely assimilated carbon source maltodextrin and the nitrogen source soybean meal strongly affected A40926 production, which reached 1 g/L in a 10-L fermenter. Furthermore, Cu<sup>2+</sup> stimulated A40926 biosynthesis while Co<sup>2+</sup> showed an inhibitory effect. As shown for valine, even L-leucine addition led to an increased production of total A40926 and changed the complex toward the B0 compound (Figure 4) [20]. While the shift in complex composition after amino acid addition can be easily rationalized, there are currently no clues as to why certain carbon sources and metal ions stimulate or inhibit growth and/or A40926 production.

**Figure 4.** Nutrients, biosynthetic products and proteins regulating A40926 production in *Nonomuraea gerenzanensis*.

#### **4. The** *dbv* **Gene Cluster: Main Features**

The characterization of the gene cluster necessary for A40926 biosynthesis [21] laid the foundation for understanding the regulatory mechanisms working in the producer strain [22,23]. The *dbv* gene cluster is constituted by 37 protein coding sequences involved in antibiotic biosynthesis, regulation, immunity, and export [21] (Figure 2).

In particular, Dbv1, Dbv2, Dbv5, Dbv30-34 and Dbv37 are involved in biosynthesis of the two non proteinogenic amino acids Hpg and Dpg, while Dbv16-17 and Dbv25-26 constitute the NRPS that joins the amino acids Hpg, Tyr, Dpg, Hpg, Hpg, Tyr and Dpg in a ribosome-independent manner. The A40926 aryl groups are linked by three ether links and one C–C link through the action of Dbv11-14 P450s, while the single halogenase Dbv10 chlorinates Dpg-3 and Tyr-6. By analogy with other glycopeptides, halogenation should occur on an NRPS-bound substrate [5], while Tyr beta hydroxylation [24] might also involve interaction with an NRPS-bound substrate or intermediate. Additional modifications require the action of: Dbv27, for *N*-methylation of the terminal Hpg-1 residue; Dbv9, Dbv21, Dbv8 and Dbv29, for addition of *N*-acetyl glucosamine, deacetylation and acylation with long chain fatty acids, and sugar oxidation, respectively [25–27]; and Dbv20 and Dbv23, for mannosylation of Dpg-7 and its *O*-acetylation [28]. The different functions are illustrated in Figure 2 and summarized in Table 1, and a simplified model of *O*-acetyl A40926 biosynthesis is depicted in Figure 3.

The last biosynthetic step is possibly represented by acetylation at position 6 of the mannose moiety carried out by Dbv23 [28,29]. A strain deleted in *dbv23* produced only glycopeptides lacking the O-linked acetyl residue. Interestingly, antibiotic production in a complex medium by the mutant strain occurred at twice the levels of the wild type. The low amount of glycopeptide produced by the wild-type strain might be dependent upon an inhibitory effect exerted by the acetylated compound, the final pathway intermediate. Consistently, spiking the production medium with 1 μg/mL of the acetylated glycopeptide inhibited total glycopeptide production in the mutant strain, while the deacetylated glycopeptide had no effect [28]. It is thus tempting to speculate that A40926 production is regulated by its end product, ensuring that A40926 does not occur during growth of the strain. This might occur through a two-component signal transduction process, in which a specific receptor could activate a response regulator and repress A40926 biosynthesis. This might be relevant in industrial processes, in which a seed culture is eventually used to inoculate the production medium. Any A40926 produced in the seed culture might be sufficient to inhibit A40926 production when the strain is inoculated in the production medium. This mechanism might be related to the inherent sensitivity of the strain to its own product, as described below.

Glycopeptides bind to the D-Ala-D-Ala portion of lipid II and thus inhibit the transpeptidation and transglycosylation reactions, thereby blocking peptidoglycan polymerization. The first and best characterized mechanism for glycopeptide resistance was established in enterococci, where glycopeptide action is avoided by deploying a modified target through a complex process that requires at least three biosynthetic genes (*vanHAX*) and a regulatory circuit (reviewed in [30–32]). Glycopeptide resistance in actinomycetes can also involve reprogramming of the peptidoglycan precursor by the action of VanHAX-related enzymes, as, for example, in *Amycolatopsis balhimycina* [33,34]. Instead, *Nonomuraea gerenzanensis* lacks the typical *vanHAX* cassette and the *dbv* cluster encodes the carboxypeptidase Dbv7, which has been shown to provide a modest but measurable resistance effect in the wild-type strain and in a heterologous background [35]. It should be noted that glycopeptide resistance in actinomycetes is still far from being completely understood, with there being a subtle interplay between glycopeptide resistance and glycopeptide tolerance [36]. Finally, the ABC transporters Dbv18, Dbv19, and Dbv24 and ion-dependent transmembrane transporter Dbv35 may contribute to glycopeptide resistance through active export from the cell, as observed for the Dbv24 homolog in the balhimycin producer [37].

The transcriptional organization of the *dbv* cluster was elucidated by RT-PCR targeting desired regions of the gene cluster [22]. The results, illustrated in Figure 2, denote a complex transcriptional organization, with at least 14 promoters, the two-gene operons *dbv1*-*dbv2*, *dbv19*-*dbv18*, *dbv21*-*dbv20*, *dbv23*-*dbv22*, the larger operons *dbv5*-*dbv7*, *dbv24-dbv28*, and *dbv30*-*dbv35*, and the largest operon *dbv17*-*dbv8*. Apparently, *dbv3*, *dbv4*, *dbv29*, *dbv36* and *dbv37* are transcribed as monocistronic units. The results are summarized in Table 1, which also lists the functions of the corresponding proteins. Real-time RT-PCR showed that a promoter is present upstream to *dbv14*, directing expression of the *dbv14-dbv8* operon through a leaderless transcript. However, a longer operon is likely to be transcribed from upstream promoter(s), since RT-PCR analysis showed the existence of a transcript spanning *dbv15* and *dbv14*. Some of the associated regulatory networks controlling A40926 biosynthesis are described below.

#### **5. Cluster-Specific Regulatory Genes**

The *dbv* cluster contains two regulatory genes, *dbv3* and *dbv4*, and the members of a putative two-component system, *dbv6* and *dbv22* [21–23]. Over a decade ago, a comparative analysis of the five glycopeptide gene clusters then available (those for chloroeremomycin, balhimycin, A47934, A40926 and teicoplanin) revealed that a StrR-like protein (i.e., Dbv4) was present in all clusters [38]. We previously demonstrated cross-binding among StrR-like regulators from glycopeptide clusters; specifically, Bbr (from the balhimycin cluster) can bind to the *dbv30* upstream region, while Dbv4 binds to the regions upstream of *bbr* and *oxyA* in the balhimycin cluster [22]. The target regions of Bbr and Dbv4 contain the highly conserved palindromic consensus sequence GTCCAR(N)<sub>17</sub>TTGGAC. This sequence was considered to be the Dbv4 binding site and was found in two regions of the *dbv* cluster and in five regions of the balhimycin cluster [39]. Consistently, this conserved palindrome is part of a conserved intergenic region present in the five glycopeptide clusters mentioned above [38].
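As an illustration of how such a consensus can be searched for in sequence data, the following minimal Python sketch scans a DNA string for GTCCAR(N)17TTGGAC, reading R as A or G and N as any base (standard IUPAC ambiguity codes). The function name and the test sequence are hypothetical and are not taken from the cited studies; only one strand is scanned, and a match says nothing about whether Dbv4 actually binds there.

```python
import re

# Consensus from the text: GTCCAR(N)17TTGGAC.
# IUPAC codes: R = A or G, N = any base. A lookahead is used so that
# overlapping occurrences would also be reported.
DBV4_MOTIF = re.compile(r"(?=(GTCCA[AG][ACGT]{17}TTGGAC))")


def find_dbv4_like_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched sequence) for each motif occurrence."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in DBV4_MOTIF.finditer(seq)]


# Hypothetical test sequence (NOT a real dbv upstream region):
# 4 flanking bases + GTCCAG + 17-base spacer + TTGGAC + 4 flanking bases.
example = "ccgt" + "GTCCAG" + "ACGTACGTACGTACGTA" + "TTGGAC" + "ttaa"
sites = find_dbv4_like_sites(example)
for start, match in sites:
    print(f"motif at position {start}: {match}")
```

In practice, the reverse-complement strand would be scanned as well, and some mismatches might be tolerated; both are omitted here for brevity.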

In addition to the common regulation of oxygenase gene transcription by a Dbv4-type regulator, diverse regulatory schemes are apparently used in the other biosynthetic gene clusters. In the teicoplanin cluster, the Dbv4-like protein also positively regulates transcription of the operon involved in Dpg biosynthesis, and the Dbv4 target sequence was found upstream of this operon, suggesting Dbv4-type-dependent regulation. In contrast, while Bbr binds its own upstream region [39] and thus functions as an autoregulatory protein, Dbv4 does not. Similarly, since the conserved palindrome is apparently missing from the region upstream of the corresponding genes in the A47934 and teicoplanin clusters, the Dbv4-like regulator is not expected to control its own expression in these clusters.

A40926 production is repressed by high initial concentrations of phosphate, and this repression was demonstrated to occur through Dbv4 [22]: phosphate depletion induces *dbv4* transcription in a defined medium, allowing Dbv4 to enhance expression of the operons *dbv14*-*dbv8* and *dbv30*-*dbv35*. However, phosphate did not influence the expression of most analyzed *dbv* genes [22]. The biosynthesis of many diverse secondary metabolites is controlled by phosphate [40]. Phosphate control of antibiotic biosynthesis in *Streptomyces coelicolor* and *S. lividans* is dependent upon the two-component system PhoR-PhoP [41], and PhoP was found to bind promoters of phosphate-regulated genes in *S. coelicolor* [42]. However, we were unable to identify Pho boxes in the region upstream of *dbv4* [22], suggesting divergence in phosphate control of antibiotic biosynthesis in different actinomycetes.

Another regulator from the *dbv* gene cluster has been experimentally characterized: Dbv3, a LuxR solo regulator belonging to the large ATP-binding regulators of the LuxR protein family. Dbv3 positively regulates A40926 production, since the Δ*dbv3* strain does not produce antibiotic and shows reduced transcription levels of *dbv4* and of many other *dbv* genes [23]. Thus, both the LuxR- and StrR-like regulators act as activators of A40926 biosynthesis.

The experimental evidence obtained for different glycopeptide pathways indicates that the StrR-like regulators Bbr, Tei15 and Dbv4 regulate balhimycin, teicoplanin and A40926 biosynthesis, respectively [22,39,43], whereas the LuxR-like regulators Dbv3 and Tei16 positively regulate A40926 and teicoplanin biosynthesis, respectively [23,43]. The balhimycin cluster does not encode a LuxR-like regulator. In A40926 biosynthesis, Dbv3 positively regulates Hpg biosynthesis, heptapeptide backbone biosynthesis, mannosylation, hexose oxidation and export. In addition, Dbv3 was found to hierarchically control *dbv4* transcription in a cascade-like regulatory mechanism, so that Dpg biosynthesis and transcription of the *dbv14*-*dbv8* operon are also under indirect control of Dbv3. Moreover, Dbv4 and Dbv3 expression seems to be differently modulated, since transcription of *dbv4* and Dbv4 target genes was found to be repressed by phosphate, while the Dbv3 target genes were not [22]. It should be noted that in teicoplanin biosynthesis, the expression of at least 17 genes is directly governed by Tei15, the Dbv4-like regulator, which directly controls transcription of *tei16*, the *luxR*-type regulator [43]. The targets of Tei16 have not been reported yet.

Notwithstanding the absence of obvious targets for yield improvements by gene knockouts (e.g., repressor genes), genetic manipulation of selected *dbv* genes has led to increased yields of A40926. Knockout of the acetyltransferase *dbv23* (see above) or overexpression of Dbv3 resulted in higher (2-fold) A40926 production than in the wild type strain in rich medium, providing useful examples of knowledge-based strain improvement [23,28]. Analysis of the additional regulators encoded by the *dbv* cluster, the sensor kinase Dbv22 and the response regulator Dbv6, has established their role in the regulation of A40926 and provided additional strategies for rational intervention.

#### **6. Future Perspectives**

This review summarizes the main achievements in understanding A40926 biosynthesis in *N. gerenzanensis* in relation to other glycopeptide producers and model *Streptomyces* strains. While many studies have addressed antibiotic production in model streptomycetes, like *S. coelicolor*, we continuously learn new mechanisms and pathways as we extend these analyses to industrially relevant antibiotics and, especially, to actinomycetes other than *Streptomyces* spp. In this respect, strains belonging to the genus *Nonomuraea* represent complex systems, with limited genetic tools available. Current results suggest an interplay between nutrients, resistance determinants and the end product. Even if many factors and proteins have been found to control A40926 biosynthesis (Figure 4), further studies are necessary to fill the many gaps present in our understanding of the strain's physiology and of the interplay between A40926 production and resistance before this information can be applied for A40926 yield improvement.

The recent availability of the *N. gerenzanensis* genome sequence [44] and of a large insert library [45] represents an important asset for further work on the complex but intriguing regulatory network of the A40926-producing strain.

**Acknowledgments:** Most of the research described in this report was supported by Fondo per il finanziamento delle attività base di ricerca 2017 (FFABR 2017) to Rosa Alduina.

**Author Contributions:** Rosa Alduina, Margherita Sosio and Stefano Donadio wrote, read and approved the final manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **Novel Aspects of Polynucleotide Phosphorylase Function in** *Streptomyces*

#### **George H. Jones**

Department of Biology, Emory University, Atlanta, GA 30322, USA; george.h.jones@emory.edu

Received: 14 February 2018; Accepted: 16 March 2018; Published: 18 March 2018

**Abstract:** Polynucleotide phosphorylase (PNPase) is a 3′–5′-exoribonuclease that is found in most bacteria and in some eukaryotic organelles. The enzyme plays a key role in RNA decay in these systems. PNPase structure and function have been studied extensively in *Escherichia coli*, but there are several important aspects of PNPase function in *Streptomyces* that differ from what is observed in *E. coli* and other bacterial genera. This review highlights several of those differences: (1) the organization and expression of the PNPase gene in *Streptomyces*; (2) the possible function of PNPase as an RNA 3′-polyribonucleotide polymerase in *Streptomyces*; (3) the function of PNPase as both an exoribonuclease and as an RNA 3′-polyribonucleotide polymerase in *Streptomyces*; (4) the function of (p)ppGpp as a PNPase effector in *Streptomyces*. The review concludes with a consideration of a number of unanswered questions regarding the function of *Streptomyces* PNPase, which can be examined experimentally.

**Keywords:** polynucleotide phosphorylase; *Streptomyces*; ribonuclease; regulation; promoter; RNA decay; polyadenylation; (p)ppGpp; antibiotic

#### **1. Introduction**

Polynucleotide phosphorylase (PNPase, EC 2.7.7.8) was the first enzyme shown to synthesize polyribonucleotides [1], and for some time, it was thought to be the bacterial RNA polymerase. The enzyme was subsequently characterized in *Escherichia coli* and other bacteria, and was shown to catalyze the following reaction:

$$(\mathrm{p}^{5'}\mathrm{N}^{3'}\mathrm{OH})_{\mathrm{X}} + \mathrm{P_i} \rightleftharpoons (\mathrm{p}^{5'}\mathrm{N}^{3'}\mathrm{OH})_{\mathrm{X}-1} + \mathrm{pp}^{5'}\mathrm{N}$$

where N is any of the four bases found in RNA [2,3]. As written, the reaction depicts the phosphorolytic degradation of RNA chains, and this activity appears to reflect the major function of PNPase in vivo. The reaction is reversible, however, and PNPase will synthesize polyribonucleotide chains, using nucleoside diphosphates (NDPs), rather than triphosphates, as substrates. The polymerizing activity of PNPase played an important role in the synthesis of polyribonucleotides used to unravel the genetic code [4,5].

PNPase is found in all bacteria examined to date, except *Mycoplasma*, and is also present in eukaryotic organelles [6]. The enzyme has not been identified in Archaea [7]. In *E. coli*, PNPase and RNase II are the major 3′-exonucleases involved in RNA degradation [8]. In addition to its degradative function, PNPase plays a role in the bacterial response to environmental stresses, such as cold shock [9–11], and is involved in biofilm formation [12,13] and virulence determination [14,15]. The activity of the enzyme is also modulated by a number of small molecule effectors, at least in vitro [16–19].

*Streptomyces* are Gram-positive, soil-dwelling bacteria, notable for their ability to form spores and for their capacity to produce antibiotics [20,21]. Nearly 70% of all antibiotics used in clinical and veterinary medicine worldwide are synthesized as natural products by members of the genus [22]. Of particular relevance to this review, a number of biochemical and genetic features of *Streptomyces* PNPase distinguish it from its counterparts in other bacteria. In what follows, the functions of PNPase in *Streptomyces* will be explored. The reader is referred to several excellent reviews as sources of additional information on PNPase from *E. coli* and other bacteria [2,3,23,24].

#### **2. Organization and Expression of the PNPase Gene in** *Streptomyces*

In *E. coli*, and other organisms that have been studied, the PNPase gene, *pnp*, is a part of an operon that also includes *rpsO*, the gene for ribosomal protein S15 [9,25,26]. That operon is transcribed from two promoters in *E. coli*, designated P*rpsO* and P*pnp* [25,27]. P*rpsO* is situated upstream of the *rpsO* gene, and P*pnp* is located in the intergenic region between the two genes. Transcription from P*rpsO* ends at a rho-independent terminator and produces the *rpsO* transcript, but transcription through this terminator occurs with significant frequency and produces a readthrough transcript, containing both *rpsO* and *pnp*. Transcription from the intergenic promoter, P*pnp*, produces a transcript containing *pnp* only [25,27]. In addition to the *rpsO* terminator, the *rpsO–pnp* intergenic region contains a second stem–loop that functions as a processing site for the double strand specific endoribonuclease, RNase III. RNase III processing plays an important role in *pnp* expression [25,28–30].

Our interest in the mechanisms of RNA decay in *Streptomyces* led to an examination of the transcriptional organization of the *rpsO–pnp* operon in *Streptomyces coelicolor*, the paradigm for biological studies in the genus. To our surprise, primer extension analysis of RNAs isolated from a parental strain of *S. coelicolor* and from an RNase III null mutant revealed not two, but four extension products, suggesting the presence of four promoters within the *rpsO–pnp* operon [31]. A visual inspection of 162 putative *Streptomyces* promoters from Strohl [32] and Yamazaki et al. [33], and a group of synthetic promoters generated by Seghezzi et al. [34], indicated that all four of the 5′-ends identified by primer extension in our studies were preceded by sequences similar to the −10 and −35 regions of characterized streptomycete promoters. Promoter probe cloning of DNA fragments containing these putative promoter sequences verified that the *rpsO–pnp* operon of *S. coelicolor* is transcribed from four promoters, two situated upstream of *rpsO* (P*rpsO*A and P*rpsO*B) and two situated in the intergenic region, upstream of *pnp* (P*pnp*A and P*pnp*B, Figure 1).

**Figure 1.** Schematic representation of the *Streptomyces coelicolor rpsO–pnp* operon. P*rpsO*A, B and P*pnp*A, B represent the upstream and intergenic promoters found in *S. coelicolor*, respectively. The ball-and-stick structures immediately following *rpsO* and *pnp* represent rho-independent transcription terminators. The ball-and-stick structure just upstream of *pnp* represents the intergenic hairpin which is cleaved by RNase III. The diagram is not drawn to scale.

Of particular interest in this analysis was the observation that the four promoters were temporally regulated. That is, the activity of the four promoters varied with time after inoculation of liquid cultures and the variations were promoter-specific, as shown in Figure 2. In terms of maximal activity, P*pnp*B was most active followed by P*rpsO*B, P*pnp*A, and P*rpsO*A.

**Figure 2.** (**A**) Growth of the *S. coelicolor* strains containing promoter probe constructs. Growth was measured as the increase in optical density at 450 nm. The arrows in the figure indicate the onset of the production of two of the secondary metabolites synthesized by *S. coelicolor*, undecylprodigiosin (red) and actinorhodin (act). (**B**) Catechol dioxygenase (CATO2ase) activity of mycelial extracts of *S. coelicolor* derivatives containing the putative *rpsO*A and *rpsO*B promoters, cloned in the promoter probe vector pIPP2 [35]. Mycelium was harvested at the indicated times, disrupted by sonication, and following centrifugation, supernatants were assayed for catechol dioxygenase, as described previously [31,35]. The catechol dioxygenase gene is the reporter in the promoter probe vector [35]. (**C**) CATO2ase activities of extracts of strains containing the putative *pnp*A and *pnp*B promoters. The results shown are the averages of duplicate assays from two independent experiments ± SEM. This figure is reprinted from *Gene*, 536, Patricia Bralley, Marcha L. Gatewood, George H. Jones, Transcription of the *rpsO–pnp* operon of *Streptomyces coelicolor* involves four temporally regulated, stress responsive promoters. 177–185, Copyright (2014), with permission from Elsevier [31].

A major mechanism for the modulation of promoter activity in bacterial systems involves the use of alternative sigma factors by RNA polymerase. The *S. coelicolor* genome encodes over sixty alternative sigma factors, many of which play roles in differentiation and responses to stress (reviewed in [36]). It was of considerable interest to determine whether the four *rpsO–pnp* promoters might require alternative sigma factors for transcription in *S. coelicolor*.

To this end, we obtained null mutants for a number of sigma factors, including σ<sup>B</sup>, σ<sup>H</sup>, and σ<sup>L</sup> [36], and their corresponding parental strains. We transferred the *rpsO–pnp* promoter probe constructs to each strain and measured promoter activity. The results obtained for the σ<sup>H</sup> and σ<sup>L</sup> mutants were quite similar to those for the parental strain of *S. coelicolor*, that is, the same pattern of temporal regulation was observed in both of these sigma factor mutants as in the parental strain (cf. Figure 2). In marked contrast, the P*rpsO*A, P*rpsO*B, and P*pnp*B promoters were completely inactive in the σ<sup>B</sup> mutant. P*pnp*A, however, was as active in the σ<sup>B</sup> mutant as in the parental strain, and showed a similar pattern of temporal expression [31]. This result indicates that P*rpsO*A and B and P*pnp*B are dependent on σ<sup>B</sup> for activity, and suggests that these promoters are transcribed by an RNA polymerase holoenzyme containing σ<sup>B</sup>.

PNPase is a cold shock protein in many bacteria, and it has been shown that cold shock leads to an increase in PNPase levels in *E. coli* and other organisms [9–11]. It was of interest to determine whether PNPase levels increased in cold shock in *S. coelicolor*, and whether any such increase reflected changes in the activity of the *rpsO–pnp* promoters. As shown in Figure 3C, PNPase activity increased significantly (two-fold) in *S. coelicolor* over three hours of cold shock at 10 °C. This increase in activity was accompanied by an increase in the activities of all four of the *rpsO–pnp* promoters (Figure 3A,B) as compared with their activities at the normal growth temperature, 30 °C.

**Figure 3.** Cold shock responses of *S. coelicolor*. Derivatives containing the *rpsO–pnp* promoter probe constructs were grown at 30 °C, and half of each culture was then shifted to 10 °C. Mycelium was harvested at the indicated times, disrupted by sonication, and following centrifugation, supernatants were assayed for promoter activity, as described [31,35]. Panel **C** shows the results of PNPase polymerization assays. In Panels **A** and **B**, PNPase promoter activities are expressed relative to the activity measured at 30 °C at zero time, immediately before the shift to 10 °C. The results shown are the averages of duplicate assays from two independent experiments ± SEM. In the first experiment, PNPase levels were measured in *S. coelicolor* containing P*rpsO*A and in the second, PNPase levels were measured in the derivative containing P*pnp*B. This figure is reprinted from *Gene*, 536, Patricia Bralley, Marcha L. Gatewood, George H. Jones, Transcription of the *rpsO–pnp* operon of *Streptomyces coelicolor* involves four temporally regulated, stress responsive promoters. 177–185, Copyright (2014), with permission from Elsevier [31].

Thus, PNPase is a cold shock protein in *Streptomyces*, and the cold shock response involves changes in the activities of the promoters responsible for transcription of the *rpsO–pnp* operon [31].

#### **3. PNPase Function as an RNA 3′-Polyribonucleotide Polymerase in** *Streptomyces*

As is the case in eukaryotes, bacterial RNAs have oligo- and polyribonucleotide tails at their 3′-ends, and these tails are added post-transcriptionally (reviewed in [37]). In *E. coli*, the primary enzyme responsible for the synthesis of these tails is poly(A) polymerase I (PAP I), and the tails are composed primarily of A residues [38]. In bacteria, poly(A) tails function to facilitate RNA degradation, as the major 3′-exoribonucleases, viz. RNase II and PNPase in *E. coli*, digest these tails processively in vitro and in vivo [8,39].

It was shown some years ago by Mohanty and Kushner that an *E. coli* mutant lacking PAP I still added tails to the 3′-ends of its RNAs. However, these tails were not composed exclusively of A residues; the tails contained G, C, and U residues as well, i.e., they were heteropolymeric [40]. Mohanty and Kushner demonstrated that the enzyme responsible for the synthesis of these heteropolymeric tails was none other than PNPase [40]. Thus, even in the absence of PAP I, PNPase can add 3′-tails to facilitate the degradation of cellular RNAs.

*Streptomyces* do not contain PAP I. Yet the 3′-ends of streptomycete RNAs do possess tails. Moreover, the tails are heteropolymeric in composition, like those synthesized by PNPase in *E. coli* in the absence of PAP I [41,42]. These observations, and the report that PNPase functions as the RNA 3′-polyribonucleotide polymerase in plant chloroplasts and in cyanobacteria [43,44], led to the hypothesis that PNPase played the same role in *Streptomyces* [45]. The straightforward way to test this hypothesis would be to create a *pnp* null mutant, e.g., in *S. coelicolor*, and to determine whether the mutant was still capable of adding 3′-tails to its RNAs. However, attempts to disrupt *pnp* in *S. coelicolor* and in the sister species, *Streptomyces antibioticus*, were only successful when a second copy of *pnp* was added to the genome. In other words, gene disruption attempts revealed that, unlike the situation in *E. coli* and in the Bacilli, *pnp* is an essential gene in *Streptomyces* [42].

A model for the function of PNPase as an RNA 3′-polyribonucleotide polymerase is presented in detail below.

#### **4. Function of PNPase as Both an Exoribonuclease and as an RNA 3′-Polyribonucleotide Polymerase**

PNPase activity is highly processive and the enzyme is impeded by stem–loop structures [46]. Streptomycete genomes are GC rich, so that enzymes involved in RNA decay may have evolved to degrade the RNAs derived from these genomes efficiently. A possible strategy for facilitating this degradation was suggested by the observation that PNPase appears to utilize its polymerizing activity to add 3′-tails to streptomycete RNAs [42,45]. It seemed possible that the enzyme might add such tails during phosphorolysis to create single stranded 3′-ends that would then function as the substrates for that phosphorolysis. If this were the case, it might be expected that nucleoside diphosphates, the substrates for polymerization, would stimulate phosphorolysis. To test this hypothesis, two model PNPase substrates were constructed from the sequence of the *rpsO–pnp* operon of *S. coelicolor*. Both substrates contained the *rpsO–pnp* terminator and the intergenic hairpin. Thus, both model substrates contained secondary structure that would be expected to impede phosphorolysis by PNPase. One substrate, designated 5601, also possessed a single stranded 3′-tail, 33 bases in length, while the other substrate, 5650, terminated at the base of the intergenic hairpin, and did not have a single stranded tail [47].

The phosphorolysis of these two substrates was studied in the absence and presence of a mixture of all four nucleoside diphosphates (NDPs) with the interesting result, predicted by our hypothesis, that the NDPs, normally the substrates for the polymerizing activity of PNPase, did stimulate RNA degradation by phosphorolysis. Figure 4 shows the results of phosphorolysis of the 5650 transcript (labeled RP3 in the figure).

Analysis of the results shown in Figure 4 revealed that NDPs at 20–30 μM in phosphorolysis mixtures stimulated that reaction by 2–3 fold as compared with controls, but only when the structured RNA, 5650 (RP3), was used as the substrate [47]. NDPs had no effect on the phosphorolysis of the 5601 substrate, which possesses the single stranded tail. Kinetic analyses showed that NDPs affected the *K*<sub>m</sub> for phosphorolysis. Thus, the *K*<sub>m</sub> value for phosphorolysis of the 5650 substrate in the absence of NDPs was 3.1 μM. This value decreased to 0.65 μM in the presence of all four NDPs at 20 μM. This latter *K*<sub>m</sub> was almost identical to that obtained in the absence of NDPs for the 5601 substrate, which has a single stranded 3′-tail (0.62 μM). NDPs did not further decrease the *K*<sub>m</sub> for the 5601 substrate. It is noteworthy as well that NDPs had no effect on the phosphorolysis of either substrate by *E. coli* PNPase, and that the *E. coli* enzyme was intrinsically less active with the structured substrate than was its counterpart from *S. coelicolor* [47].
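To give a feel for what the reported shift in *K*<sub>m</sub> means for reaction rates, the sketch below evaluates the Michaelis–Menten expression v/V<sub>max</sub> = [S]/(*K*<sub>m</sub> + [S]) with the two *K*<sub>m</sub> values quoted above for the 5650 substrate. It rests on assumptions that are not stated in the source: simple Michaelis–Menten behavior, an unchanged V<sub>max</sub> in the presence of NDPs, and arbitrarily chosen substrate concentrations. It should be read as an illustration of the arithmetic, not as a reanalysis of the published data.

```python
def fractional_rate(substrate_um: float, km_um: float) -> float:
    """Michaelis-Menten rate as a fraction of Vmax: v/Vmax = [S] / (Km + [S])."""
    return substrate_um / (km_um + substrate_um)


# Km values for the structured 5650 substrate, as quoted in the text (µM).
KM_5650_NO_NDP = 3.1
KM_5650_WITH_NDP = 0.65


def stimulation(substrate_um: float) -> float:
    """Fold change in rate caused by the Km shift.

    ASSUMPTION: Vmax is the same with and without NDPs.
    """
    with_ndp = fractional_rate(substrate_um, KM_5650_WITH_NDP)
    without_ndp = fractional_rate(substrate_um, KM_5650_NO_NDP)
    return with_ndp / without_ndp


# Hypothetical substrate concentrations (µM), chosen only for illustration.
for s in (0.1, 1.0, 10.0):
    print(f"[S] = {s:5.1f} µM -> fold stimulation = {stimulation(s):.2f}")
```

Under these assumptions, the fold stimulation is largest when the substrate concentration is well below *K*<sub>m</sub> and approaches 1 as the substrate becomes saturating, which is why the size of the effect depends on the assay conditions.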

**Figure 4.** Effects of nucleoside diphosphates on the phosphorolysis of the 5650 transcript. Phosphorolysis reactions were performed as described in [47], and reaction products were separated by gel electrophoresis. The top panel shows the results obtained with *S. coelicolor* PNPase and the bottom panel results using *E. coli* PNPase. Reactions were conducted in the presence of increasing concentrations of a mixture of ADP, CDP, UDP, and GDP (nucleoside diphosphates (NDPs)) as indicated. RP3 is the 5650 transcript, and RP4 represents the product obtained by complete digestion of the intergenic hairpin in RP3 by PNPase. Note that as PNPase is highly processive [48], no intermediates with mobilities between those of RP3 and RP4 were observed. Copyright © American Society for Microbiology (*J. Bacteriol.* 190, 2008, 98–106, DOI:10.1128/JB.00327-07) [47].

Our model for the explanation of the foregoing observations is shown in Figure 5 [49].

We posit that the stem–loops of structured substrates, like 5650, block PNPase action. The addition of short 3′-tails during phosphorolysis, or the presence of naturally occurring tails on substrates like 5601, allows for the breathing of stems and thus permits PNPase, which might otherwise stall at the stem–loops in structured substrates, to continue phosphorolysis through those structures. As indicated above, this mechanism may represent an evolutionary adaptation occasioned by the high GC content of streptomycete genomes and their transcripts.

It must be noted, however, that the evidence for the function of PNPase as a 3′-polyribonucleotide polymerase in *Streptomyces* is indirect. In vivo evidence for this function remains to be uncovered.

**Figure 5.** Model for the effects of NDPs on the activity of *S. coelicolor* PNPase. The model posits that *S. coelicolor* PNPase (PacMan symbol) is able to phosphorolyze 5650 and other structured substrates to a limited extent in the absence of NDPs, as indicated by the dashed X. In the presence of NDPs, PNPase synthesizes unstructured 3′-tails in vivo, and these tails then provide an anchor for the enzyme, thus facilitating the digestion of structured substrates. Copyright © American Society for Microbiology (*J. Bacteriol.* 195, 2013, 5151–5159, DOI:10.1128/JB.00936-13) [49].

#### **5. (p)ppGpp as a PNPase Effector**

Highly phosphorylated guanine nucleotides, (p)ppGpp, guanosine pentaphosphate, and guanosine tetraphosphate, are alarmones that play a number of roles in the regulation of bacterial metabolism (reviewed in [50,51]). (p)ppGpp is synthesized by the product of the *relA* gene in *E. coli* [52], and that gene is found in *S. coelicolor* [53,54], *S. antibioticus* [55], and other streptomycetes [56]. In *Streptomyces*, ppGpp plays an important role in the regulation of antibiotic synthesis [53–56].

PNPase from *S. antibioticus* was shown to synthesize pppGpp in vitro [57,58]. While this activity may be an in vitro artifact, as no other PNPases are known to possess it, the observation suggested a possible relationship between (p)ppGpp and RNA decay in *Streptomyces*. To begin to examine this relationship, the effects of (p)ppGpp on polymerization and phosphorolysis by *S. coelicolor*, *S. antibioticus*, and *E. coli* PNPases were measured in the absence and presence of (p)ppGpp [59]. As shown in Figure 6, both guanosine penta- and tetraphosphates inhibited the activity of *S. coelicolor* PNPase, in both phosphorolysis and polymerization, though ppGpp was a more potent inhibitor than pppGpp.

Essentially identical results were obtained for *S. antibioticus* PNPase (not shown). By contrast, neither ppGpp nor pppGpp was an effective inhibitor of the activity of *E. coli* PNPase. Indeed, at concentrations up to 1 mM, pppGpp actually stimulated the polymerizing activity of the *E. coli* enzyme.

**Figure 6.** Effects of (p)ppGpp on the activity of PNPase. Polymerization and phosphorolysis reactions were performed in the absence and presence of guanosine tetraphosphate (ppGpp) or guanosine pentaphosphate (pppGpp), using purified PNPase from *S. coelicolor* and *E. coli* [59]. Results are expressed relative to the activities measured in the absence of (p)ppGpp, set arbitrarily to 100%.

In the same study, the effects of (p)ppGpp on the stability of bulk mRNA in *S. coelicolor* were examined [59]. It was initially observed that the half-life of bulk mRNA increased by 1.8-fold in stationary phase cultures as compared with exponential phase. That this increase might be related to the effects of (p)ppGpp was suggested by studies with an *S. coelicolor relA* mutant, and a strain containing an inducible *relA* gene. While the half-life of bulk mRNA was longer in the *relA* mutant than in the parental strain (e.g., 8.9 min vs 3.2 min in exponential phase), the half-life decreased slightly in the *relA* mutant in stationary phase (to 7.2 min). In the strain containing an inducible *relA* gene, producing increased levels of (p)ppGpp, induction occasioned a ca. two-fold increase in the half-life of bulk mRNA, from 6.6 to 11.8 min. Taken together, these observations suggest that (p)ppGpp may stabilize mRNAs in stationary phase *S. coelicolor* cells, as compared with cells growing exponentially.
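The half-lives quoted above can be converted into decay rate constants if bulk mRNA decay is treated as a first-order process, an assumption made here purely for illustration. In that case k = ln 2 / t<sub>1/2</sub>, and the fraction of mRNA remaining after time t is 0.5<sup>t/t<sub>1/2</sub></sup>. The dictionary labels in the sketch below are shorthand introduced for this example; the numerical values are those given in the text.

```python
import math


def decay_constant(half_life_min: float) -> float:
    """First-order decay constant k (per minute) from a half-life in minutes."""
    return math.log(2) / half_life_min


def fraction_remaining(half_life_min: float, t_min: float) -> float:
    """Fraction of the initial mRNA left after t_min minutes of first-order decay."""
    return 0.5 ** (t_min / half_life_min)


# Bulk mRNA half-lives (minutes) as quoted in the text; labels are shorthand.
HALF_LIVES_MIN = {
    "parental, exponential": 3.2,
    "relA mutant, exponential": 8.9,
    "relA mutant, stationary": 7.2,
    "inducible relA, before induction": 6.6,
    "inducible relA, after induction": 11.8,
}

for label, t_half in HALF_LIVES_MIN.items():
    k = decay_constant(t_half)
    left = fraction_remaining(t_half, 10.0)  # 10 min is an arbitrary time point
    print(f"{label:34s} k = {k:.3f} per min, remaining after 10 min = {left:.2f}")

# Fold increase in half-life on relA induction ("ca. two-fold" in the text).
fold_induction = (HALF_LIVES_MIN["inducible relA, after induction"]
                  / HALF_LIVES_MIN["inducible relA, before induction"])
print(f"fold increase on induction = {fold_induction:.2f}")
```

Because k is inversely proportional to the half-life, the fold increase in half-life on induction equals the fold decrease in the decay constant.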

Why and how might this stabilization occur? It is well established that although levels of RNA and protein synthesis decrease dramatically as *Streptomyces* cultures move from the exponential to the stationary phase of growth, a basal level of synthesis is maintained throughout stationary phase [60,61]. This basal level of macromolecular synthesis is presumably required to produce enzymes and other proteins involved in the synthesis of the secondary metabolites these organisms produce in stationary phase. Stabilization of the transcripts for these proteins would represent one strategy the organisms could employ to ensure the persistence of macromolecular synthesis to support secondary metabolite production. It is known that (p)ppGpp is present in significant amounts, even in stationary phase streptomycete cultures [62,63]. Thus, the inhibition of PNPase by (p)ppGpp might represent a strategy used by *Streptomyces* to stabilize essential mRNAs during stationary phase. It would be interesting to determine whether (p)ppGpp inhibits the activity of other exo- and endonucleases and while such analyses have yet to be performed, it is noteworthy that ppGpp inhibits PNPase from another actinomycete, *Nonomuraea* sp. [64].

#### **6. Conclusions and Unanswered Questions**

It is apparent from the brief analysis above that PNPase is a multitalented enzyme that plays a critical role in the metabolic activities of bacterial cells. Despite the wealth of information that has been accumulated about PNPase, a number of important biological questions remain unanswered, particularly as they relate to the functions of *Streptomyces* PNPase.

First, why is the *rpsO–pnp* operon of *S. coelicolor* transcribed from four promoters? The answer to this question may relate to the fact that the operon contains a ribosomal protein gene, as well as the gene for PNPase. It is possible that the promoters are not only critical to the regulation of *pnp* expression, but that they also play an important role in ribosome biogenesis via their regulation of the levels of the *rpsO* transcript. Mutation of the four promoter sequences may provide insight into these possibilities.

Second, what is the significance, if there is any, of the fact that the *S. coelicolor rpsO–pnp* operon produces six transcripts (two *rpsO* transcripts from P*rpsO*A and B, two readthrough transcripts from the same two promoters, and two *pnp* transcripts from P*pnp*A and B)? It should be noted that Northern blot analysis of the transcripts derived from the *S. coelicolor rpsO–pnp* operon did not reveal the presence of six separate transcripts [65]. It is possible that the different transcripts are not sufficiently different in size to have been resolved on the Northern blotting gels. Another intriguing possibility is that the longer transcripts, obtained from the upstream promoter in each case, might be processed at their 5′-ends by RNase J, which possesses 5′–3′-exoribonuclease activity. RNase J has recently been characterized in *S. coelicolor* [66,67].

Third, as described above, PNPase is a cold shock protein in *Streptomyces*. It is relevant to ask whether PNPase responds to other environmental stresses, such as heat, oxidative stress, metal ion stress, etc. It would be of interest, in particular, to examine the effects of various types of stress on the activities of the *rpsO–pnp* promoters. Mutational analyses again might reveal important aspects of promoter function in stress conditions in *Streptomyces*.

Fourth, a number of small effector molecules modulate the activity of *E. coli* PNPase, e.g., ATP, citrate, and cyclic-diGMP [16–19]. It has been proposed that these effectors connect RNA decay to other metabolic pathways in bacterial cells. It would be interesting to determine whether these effectors also affect the activity of *Streptomyces* PNPase. It is noteworthy, in this regard, that in silico molecular docking studies suggest that citrate, which inhibits the activity of *E. coli* PNPase, will bind to PNPase from *S. antibioticus* [18].

Finally, in *E. coli* and other organisms, PNPase is part of a larger macromolecular complex generally referred to as the degradosome [68,69]. In *E. coli*, the components of the degradosome are organized around a scaffold provided by the single-strand-specific endoribonuclease, RNase E [70]. RNase E is present in *S. coelicolor*, and has been shown to interact with PNPase in vivo [71]. However, unlike the situation in *E. coli*, the identities of other proteins that might be involved in the degradative machine are unknown in *Streptomyces*.

It is fervently hoped that these and other important questions related to PNPase structure and function will continue to attract interest and the experimentation needed to answer them.

**Acknowledgments:** Much of the research described in this report was supported by grants MCB-0133520 and MCB-0817177 from the U.S. National Science Foundation to the author.

**Conflicts of Interest:** The author declares no conflicts of interest.

#### **References**


© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Antibiotics* Editorial Office E-mail: antibiotics@mdpi.com www.mdpi.com/journal/antibiotics
