#### **1. Introduction**

During our ongoing investigations into the natural products of Australian marine and terrestrial microbes, we have encountered many new and unusual cyclic and acyclic peptides and depsipeptides, including the antimalarial glyco-cyclohexadepsipeptide-polyketide mollemycin A from a north Queensland marine sediment-derived *Streptomyces* sp. CMB-M0244 [1]; the antitubercular cyclohexapeptide wollamides A–B from a north Queensland desert soil-derived *Streptomyces* sp. MST-115088 [2]; the acyclic peptaibol nonapeptide trichodermamides A–E from a Queensland termite nest-derived fungus *Trichoderma virens* CMB-TN16 [3]; the nitro-depsitetrapeptide-diketopiperazine waspergillamide A from a Queensland mud dauber wasp-derived *Aspergillus* sp. CMB-W031 [4]; the lipocyclopentapeptide scopularides A–H from Queensland mullet gastrointestinal tract-derived *Scopulariopsis* spp. CMB-F458 and CMB-F115, and *Beauveria* sp. CMB-F585 [5]; and *N*-methylated acyclic undeca- and dodecapeptide talaropeptides A–D [6], and the cycloheptapeptide hydroxamate talarolide A [7], from a Queensland marine tunicate-derived fungus *Talaromyces* sp. CMB-TU011. In the latter case, we took advantage of altering cultivation conditions, with a YES broth cultivation of *Talaromyces* sp. CMB-TU011 yielding the talaropeptides [6] and an M1-saline agar cultivation yielding talarolide A [7].

**Citation:** Salim, A.A.; Hussein, W.M.; Dewapriya, P.; Hoang, H.N.; Zhou, Y.; Samarasekera, K.; Khalil, Z.G.; Fairlie, D.P.; Capon, R.J. Talarolides Revisited: Cyclic Heptapeptides from an Australian Marine Tunicate-Associated Fungus, *Talaromyces* sp. CMB-TU011. *Mar. Drugs* **2023**, *21*, 487. https://doi.org/10.3390/md21090487

Academic Editor: Dehai Li

Received: 11 August 2023; Revised: 6 September 2023; Accepted: 6 September 2023; Published: 11 September 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Notwithstanding that traditional spectroscopic and chemical approaches are generally very effective at assigning structures inclusive of absolute configurations to cyclic peptides, our 2017 account of talarolide A proved challenging, with the proposed structure **1a** inconsistent with a subsequent total synthesis by Brimble et al. [8]. In an effort to address this anomaly, this report describes the application of an innovative miniaturized cultivation profiling methodology (MATRIX) [9] to optimize the production and enable the isolation and characterization of talarolides A–D (**1**–**4**). With access to larger quantities of talarolide A, we were able to secure superior NMR data, which, together with refinements to the Marfey analysis methodology, as well as partial and total syntheses, allowed us to propose a revised structure **1** for talarolide A and to assign structures to the new analogues talarolides B–D (**2**–**4**) as shown (Figure 1).

**Figure 1.** Structures for talarolide A, incorrect (**1a**) [7] and revised (**1**), and new analogues for talarolides B–D (**2**–**4**) from *Talaromyces* sp. CMB-TU011. Highlights (light blue, green, and yellow) show the difference between the incorrect structure (**1a**) and revised structure (**1**) of talarolide A. Pink highlight in structures **2**–**4** shows the amino acid variation compared to **1**.

#### **2. Results and Discussion**

Since our initial 2017 report on talarolide A [7], we have augmented our microbial biodiscovery efforts by implementing a miniaturized 24-well plate microbioreactor approach to support more comprehensive cultivation profiling (MATRIX) [9], to better optimize production and provide higher yields. Furthermore, we have integrated our MATRIX approach with a chemical profiling strategy employing in situ extraction followed by HPLC-DAD-ESI(+)MS and UPLC-DAD-QTOF-MS/MS analyses, with the latter visualized as a Global Natural Products Social (GNPS) [10] molecular network, to better detect and prioritize target chemistry (i.e., new from known and rare from common). Applying MATRIX cultivation profiling to *Talaromyces* sp. CMB-TU011 involved 24-well plate cultivations using eleven different media (Table S1) under three conditions (solid agar, and static and shaken broth) (Figure 2B) at 26.5 °C, over 10 days. Following incubation, the resulting 36 individual wells, together with uninoculated media controls, were extracted in situ with EtOAc, and the resulting extracts subjected to chemical profiling. While visualization of the HPLC-DAD-ESI(+)MS data using single ion extraction (SIE, *m*/*z* 718) detected **1** in most extracts, production levels were highly variable, with maximum yields observed on M1-salt, ISP-4, PDA, and PYG solid agar, and with static and shaken broth conditions returning far lower yields (Figure S1). Significantly, a GNPS analysis of the MATRIX extracts revealed a talarolide molecular family (sodiated adducts) incorporating **1** (*m*/*z* 740), and nodes for the deoxy analogue **2** (*m*/*z* 724), lower homologue isomers **3** and **4** (*m*/*z* 726), and an unidentified minor analogue (*m*/*z* 752) (Figure 2A).
Based on these analyses, a scaled up (×200 plate) 20-day solid phase ISP-4 agar cultivation of CMB-TU011 was extracted and fractionated by solvent partitioning and gel and reversed phase chromatography, to yield talarolides A–D (**1**–**4**) (Figures 2C and S2). An account of the structure elucidation of **1**–**4** (including structure revision of **1**) is summarized below.
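The nominal *m*/*z* values quoted above for the single ion extraction (protonated **1**) and for the sodiated GNPS nodes can be cross-checked against the molecular formulae reported for **1**–**4**. The following is a minimal sketch only: the formulae are taken from the text, the isotope masses are standard monoisotopic values, and the names `monoisotopic_mass` and `adduct_mz` are illustrative helpers rather than part of any software used in this work.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Mass added on forming a singly charged cation (atom mass minus one electron).
PROTON = 1.00727647      # [M+H]+
NA_CATION = 22.98922070  # [M+Na]+

# Molecular formulae as reported in the text (3 and 4 are isomers).
FORMULAE = {
    "1": {"C": 35, "H": 55, "N": 7, "O": 9},
    "2": {"C": 35, "H": 55, "N": 7, "O": 8},
    "3/4": {"C": 34, "H": 53, "N": 7, "O": 9},
}


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in an element-count dict."""
    return sum(MONO[element] * count for element, count in formula.items())


def adduct_mz(formula, adduct_mass):
    """m/z of a singly charged adduct ion for the given neutral formula."""
    return monoisotopic_mass(formula) + adduct_mass


if __name__ == "__main__":
    for name, formula in FORMULAE.items():
        print(
            f"{name}: [M+H]+ {adduct_mz(formula, PROTON):.4f}, "
            f"[M+Na]+ {adduct_mz(formula, NA_CATION):.4f}"
        )
```

Rounded to the nearest integer, the calculated values reproduce the nominal masses quoted in the text: 718 for protonated **1**, and 740, 724, and 726 for the sodiated adducts of **1**, **2**, and **3**/**4**, respectively.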

HRESI(+)MS analysis of **1** revealed a molecular formula (C35H55N7O9, Δmmu +2.5) requiring 12 double bond equivalents (DBEs), consistent with our earlier 2017 account of talarolide A [7]. Marfey's analysis of **1** returned *N*-Me-L-Tyr, D-*allo*-Ile, *N*-Me-D-Leu, L-Ala, D-Ala, and *N*-Me-D-Ala (Figure S34). While this analysis differed from our earlier assessment of talarolide A (i.e., *N*-Me-L-Ala rather than *N*-Me-D-Ala), on revisiting and repeating our earlier analytical HPLC protocols it became apparent that the relative retention times of Marfey's D-FDAA (or L-FDAA) derivatives of *N*-Me-L-Ala and *N*-Me-D-Ala were very similar, so much so that replicate analyses could experience a reversal in elution times, likely due to subtle variations in eluant composition (i.e., pH) over time. To address this lack of reliability, in this current report, we rely on new analytical HPLC conditions optimized for the unambiguous resolution of Marfey's derivatives of *N*-Me-L-Ala and *N*-Me-D-Ala (Figures 3 and S38). Likewise, we also developed and applied new, superior analytical HPLC conditions optimized for the differentiation of Marfey's derivatives of Leu, Ile, and *allo*-Ile (Figures 4 and S39). With the identity and absolute configuration of the amino acid residues in **1** assigned, we next turned our attention to the amino acid sequence. In our earlier structure elucidation of talarolide A, assignment of the planar sequence of amino acid residues relied on an incomplete set of HMBC correlations and interpretation of the MS/MS fragmentation patterns (the latter challenging for cyclic peptides). Fortunately, the re-isolation of **1** enabled the acquisition of superior NMR (DMSO-*d*6) data (Tables 1, 2 and S2, and Figures 5 and S3–S8), which allowed for a comprehensive set of HMBC correlations and unambiguous assembly of the amino acid sequence, as shown. 
To assign the regiochemistry of the L-Ala and D-Ala residues in **1**, we relied on our earlier 2D C3 Marfey's analysis [11] where talarolide A was subjected to partial hydrolysis, derivatization, and chromatographic fractionation to yield the dipeptide D-FDAA-D-*allo*-Ile-D-Ala, with the D-Ala configuration confirmed by a subsequent round of hydrolysis and Marfey's analysis [7]. Thus, the revised structure for talarolide A (**1**) is as shown. Of particular interest is the unprecedented *N*-OH-Gly residue and its ability to engage in an extensive network of ROESY interactions (and H-bonding) across the cyclic peptide ring (Figure 5, dashed pink), which presumably also facilitates the observed long-range ROESY linkages (Figure 5, dashed green).
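The double bond equivalents (DBEs) quoted for **1** follow directly from its molecular formula. As a minimal sketch, assuming the standard formula DBE = C − (H + X)/2 + N/2 + 1 (oxygen does not contribute), the count can be reproduced as follows. The function name `double_bond_equivalents` is illustrative only.

```python
def double_bond_equivalents(c, h, n, halogens=0):
    """Degrees of unsaturation for a CcHhNnOoXx formula.

    Oxygen (and other divalent atoms) do not change the count,
    so it is not a parameter here.
    """
    return c - (h + halogens) / 2 + n / 2 + 1


if __name__ == "__main__":
    # Talarolide A (1): C35H55N7O9, formula as reported in the text.
    print(double_bond_equivalents(c=35, h=55, n=7))
    # Talarolides C and D (3, 4): C34H53N7O9.
    print(double_bond_equivalents(c=34, h=53, n=7))
```

For a cyclic heptapeptide containing one tyrosine, a value of 12 is accounted for by seven amide carbonyls, four DBEs for the aromatic ring, and one for the macrocycle.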

**Figure 2.** (**A**) GNPS molecular network of *Talaromyces* sp. CMB-TU011 in a selection of five media, with an expansion of the talarolide molecular family. Node segment size correlates with relative yield/metabolite/media; (**B**) images of 24-well plate MATRIX cultivation in 11 different media under three conditions: (i) agar, (ii) static broth, (iii) shaken broth; (**C**) HPLC-DAD-MS chromatograms of CMB-TU011 EtOAc extract obtained from ISP-4 agar cultivation, with single ion extractions showing **1**–**4** (\* this peak is not a talarolide analogue).

**Figure 3.** Optimized HPLC conditions for the resolution of Marfey's derivatives L-FDAA-*N*-Me-L-Ala (pink) and L-FDAA-*N*-Me-D-Ala (light blue). (**A**) synthetic L-FDAA-*N*-Me-L-Ala; (**B**) synthetic L-FDAA-*N*-Me-D-Ala; (**C**) L-FDAA-*N*-Me-L-Ala derived from talarolide A (**1**); (**D**) synthetic L-FDAA-*N*-Me-D-Ala co-injected with L-FDAA-*N*-Me-L-Ala derived from talarolide A (**1**).

**Figure 4.** Optimized HPLC conditions for the resolution of Marfey's derivatives L-FDAA-D-*allo*-Ile (red), L-FDAA-D-Ile (blue) and L-FDAA-D-Leu (green). (**A**) synthetic L-FDAA-D-*allo*-Ile; (**B**) synthetic L-FDAA-D-Ile; (**C**) synthetic L-FDAA-D-Leu; (**D**) L-FDAA-D-*allo*-Ile derived from talarolide A (**1**); (**E**) synthetic L-FDAA-D-Ile co-injected with L-FDAA-D-*allo*-Ile derived from talarolide A (**1**).

**Figure 5.** Selected 2D NMR (DMSO-*d*6) correlations for talarolide A (**1**).

HRESI(+)MS analysis of **2** revealed a molecular formula (C35H55N7O8, Δmmu +0.4) consistent with a deoxy analogue of **1**. Indeed, Marfey's analysis of **2** returned *N*-Me-L-Tyr, D-*allo*-Ile, *N*-Me-D-Leu, L-Ala, D-Ala, and *N*-Me-D-Ala (Figure S35), while the NMR (DMSO-*d*6) data for **2** (Tables 1, 2 and S3, and Figures 6 and S11–S16) revealed chemical shifts and diagnostic correlations that permitted assignment of the same planar amino acid sequence as **1**, where the *N*-OH-Gly in **1** had been replaced by a Gly residue in **2**. Partial hydrolysis of **2**, derivatization with L-FDAA, and UPLC-DAD-MS analysis detected a dipeptide that co-eluted with an authentic synthetic sample of L-FDAA-D-*allo*-Ile-D-Ala, but not with synthetic L-FDAA-D-*allo*-Ile-L-Ala (Figure S41), confirming a D-Ala and L-Ala regiochemistry in **2** in common with that independently established for **1**. To further confirm this assignment, we carried out a successful solid phase peptide synthesis of **2** (Scheme 1), with the synthetic sample proving to be identical to natural talarolide B (Figures S50–S52), including co-elution on HPLC (Figure S48).

HRESI(+)MS analysis of **3** revealed a molecular formula (C34H53N7O9, Δmmu +3.0) suggestive of a lower homologue (-CH2) of **1**, with Marfey's analysis returning *N*-Me-L-Tyr, *N*-Me-D-Ala, D-Ala, L-Ala, D-Val, and *N*-Me-D-Leu (Figure S36). As with **1**, the *N*-OH-Gly residue in **3** was not detectable via Marfey's analysis, although its presence was evident in the NMR (DMSO-*d*6) data (Tables 1, 2 and S4, and Figures 6 and S19–S24). Diagnostic 2D NMR correlations (Figure 6) permitted assignment of a planar amino acid sequence comparable to **1**, but where the D-*allo*-Ile in **1** was replaced by D-Val in **3**. The regiochemistry of the D-Ala and L-Ala residues in **3** was assigned on the basis of biogenetic comparison to **1** and **2**, with the structure for talarolide C (**3**) assigned as shown.


**Table 1.** 1H NMR (DMSO-*d*6) data for talarolides A–D (**1**–**4**).

<sup>a–c</sup> resonances with the same superscript within a column are overlapping; <sup>d</sup> signal is obscured by DMSO, detected by HSQC; <sup>e</sup> occurs as an equilibrating mixture of major and minor conformers, with the major conformer tabulated.


**Table 2.** 13C NMR (DMSO-*d*6) data for talarolides A–D (**1**–**4**).

<sup>a–c</sup> resonances with the same superscript within a column are interchangeable.

HRESI(+)MS analysis of **4** revealed a molecular formula (C34H53N7O9, Δmmu +3.0) suggestive of an alternate lower homologue (-CH2) of **1**, with Marfey's analysis returning *N*-Me-L-Tyr, D-Ala, L-Ala, D-*allo*-Ile, and *N*-Me-D-Leu (Figure S37). As with **1**, the *N*-OH-Gly residue in **4** was not detectable via Marfey's analysis, although its presence was evident in the NMR (DMSO-*d*6) data (Tables 1, 2 and S5, and Figures 6 and S27–S32). Diagnostic 2D NMR correlations revealed a planar amino acid sequence comparable to **1**, but where the *N*-Me-L-Ala in **1** was replaced by an L-Ala in **4**. The regiochemistry of the D-Ala and L-Ala residues in **4** was assigned on the basis of biogenetic comparison to **1** and **2**, with the structure for talarolide D (**4**) assigned as shown.

**Figure 6.** Selected 2D NMR (DMSO-*d*6) correlations for talarolides B–D (**2**–**4**).

Of note, both the *N*-OH cyclic peptides **3** and **4** exhibit the same extensive pattern of ROESY correlations (Figures S10, S18 and S26) associated with the *N*-OH moiety evident in **1**, suggesting that all three adopt a common stable conformation dominated by hydrogen bonding to the *N*-OH. Not only is such conformational stabilization inaccessible to the cyclic peptide **2**, but the NMR data for **2** reveal two equilibrating conformations (Figure S53), supporting the hypothesis that *N*-hydroxylation can have a pronounced effect on cyclic peptide conformation and stabilization.

In an effort to understand this latter phenomenon, we calculated a solution structure for **1** in DMSO-*d*6 at 298 K from 2D ROESY NMR spectra, using 41 ROE distance restraints, three backbone ϕ-dihedral angle restraints derived from <sup>3</sup>*J*<sub>NH-CHα</sub>, one *cis*-amide restraint between *N*-Me-L-Ala<sup>6</sup> and *N*-Me-L-Tyr<sup>7</sup>, and one hydrogen bond restraint between *N*-OH-Gly<sup>1</sup> and the D-Ala<sup>5</sup> carbonyl oxygen (Figure 7). This hydrogen bond restraint was supported by the low temperature coefficient for the *N*-OH-Gly<sup>1</sup> in variable temperature (VT) 1H NMR experiments (Figure 8). Structures were calculated in XPLOR-NIH using a dynamic simulated annealing protocol in a geometric force field, and energy minimized using the CHARMM force field [12,13]. The 10 lowest energy structures for talarolide A (**1**) had no distance (≥0.2 Å) or dihedral angle (≥2°) violations and were rigid, convergent structures (average pairwise Cα RMSD 0.18 Å) (Figure 7). The structure for **1** supported observations made in the VT NMR experiments, with the *N*-OH-Gly<sup>1</sup> to D-Ala<sup>5</sup> carbonyl oxygen hydrogen bond and the *cis*-amide bond between *N*-Me-L-Ala<sup>6</sup> and *N*-Me-L-Tyr<sup>7</sup> forming a non-classical alpha turn centered at D-Ala<sup>5</sup>-*N*-Me-L-Ala<sup>6</sup>-*N*-Me-L-Tyr<sup>7</sup>, and with L-Ala<sup>2</sup> and *N*-Me-D-Leu<sup>3</sup> forming a distorted beta turn. The D-*allo*-Ile<sup>4</sup> amide proton projects toward the interior of the structure and is shielded from solvent, while the D-Ala<sup>5</sup> amide proton is in close proximity to the *N*-OH-Gly<sup>1</sup> carbonyl oxygen, suggestive of a hydrogen bond, and is also less accessible to solvent. The opposite side of the molecule features an exposed L-Ala<sup>2</sup> amide proton, making it more accessible to solvent. From these observations, it can be concluded that the presence of the *N*-OH-Gly provides access to a hydrogen bond that defines the overall conformation of the cyclic peptide. It is intriguing to speculate whether this effect is unique to the talarolide scaffold, with its mix of L and D amino acid residues, or whether it is a more general phenomenon. If the latter, it is possible that *N*-hydroxylation could prove to be a valuable molecular tool for accessing new peptide chemical space.

**Scheme 1.** Top: General outline of the solid phase peptide synthesis (SPPS) of talarolide B (**2**). (**i**) Fmoc-Gly-OH coupling to 2-CTC resin, (**ii**–**vii**) sequential peptide chain elongation of Fmoc amino acids, (**viii**) cleavage of linear protected peptide from resin, (**ix**) cyclization of linear protected peptide and (**x**) deprotection to yield **2**. Bottom: Experimental details for SPPS of **2**: (**i**) Fmoc-Gly-OH coupling to 2-CTC resin in the presence of DIPEA (2 h), (**ii**) elongation of peptide sequence through a coupling cycle: Fmoc deprotection with 20% piperidine in DMF (twice, 5 and 10 min), and a 5 min DMF flow-wash followed by coupling with preactivated Fmoc-amino acid (3.2 eq.) over 2 × 30 min, or 2 × 3 h for coupling of Fmoc-amino acids to sterically hindered *N*-Me-amino acids, (**iii**) cleavage of linear protected peptide from resin using 20% HFIP/DCM (3 × 20 min), (**iv**) cyclization of linear protected peptide using HATU, HOBT, and collidine (14 h), followed by deprotection of *N*-Me-L-Tyr using 90% formic acid (40 min) to give **2** (16 mg, 23% overall yield).

**Figure 7.** Backbone superimposition of the 10 lowest energy NMR calculated structures for **1** in DMSO-*d*6 at 298 K showing hydrogen bonding between *N*-OH-Gly<sup>1</sup> and the carbonyl in D-Ala<sup>5</sup> (dashed line) and a *cis*-amide bond between *N*-Me-L-Ala<sup>6</sup> and *N*-Me-L-Tyr<sup>7</sup> forming a non-classical alpha turn. The D-*allo*-Ile<sup>4</sup> amide is projected inward and shielded from solvent, while L-Ala<sup>2</sup> is solvent exposed. Non-polar hydrogens are omitted for clarity, with backbone carbon atoms (green), sidechain carbon atoms (grey), oxygen atoms (red), nitrogen atoms (blue), and hydrogen atoms (cyan).

**Figure 8.** Temperature dependence of the amide NH and OH NMR (DMSO-*d*6) chemical shifts for **1**. Line slopes indicate temperature coefficients (Δδ/T) for each residue. Circle: *N*-OH-Gly<sup>1</sup> (Δδ/T = 1.4 ppb/K); triangle: L-Ala<sup>2</sup>-NH (Δδ/T = 3.9 ppb/K); black square: D-Ala<sup>5</sup>-NH (Δδ/T = 3.5 ppb/K); open square: D-*allo*-Ile<sup>4</sup>-NH (Δδ/T = 0.3 ppb/K). Small temperature coefficients (Δδ/T) for *N*-OH-Gly<sup>1</sup> and D-*allo*-Ile<sup>4</sup>-NH indicate hydrogen bonding or solvent shielding [14].
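A temperature coefficient of the kind reported in Figure 8 is the slope of a straight-line fit of chemical shift against temperature. The sketch below illustrates this with an ordinary least-squares fit. The temperature and shift series are hypothetical values constructed for illustration (they are not the measured data), the function name `temperature_coefficient` is an assumption, and the magnitudes in `REPORTED` are copied from the Figure 8 caption.

```python
def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of chemical shift vs temperature, in ppb/K."""
    n = len(temps_k)
    mean_t = sum(temps_k) / n
    mean_d = sum(shifts_ppm) / n
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(temps_k, shifts_ppm))
    sxx = sum((t - mean_t) ** 2 for t in temps_k)
    return 1000.0 * sxy / sxx  # convert ppm/K to ppb/K


# Hypothetical series: a resonance moving upfield by 1.4 ppb per kelvin.
temps = [298, 303, 308, 313, 318]
shifts = [9.500 - 0.0014 * (t - 298) for t in temps]

# Magnitudes reported in the Figure 8 caption (ppb/K).
REPORTED = {
    "N-OH-Gly1": 1.4,
    "L-Ala2-NH": 3.9,
    "D-Ala5-NH": 3.5,
    "D-allo-Ile4-NH": 0.3,
}

# Rank protons from least to most temperature sensitive.
ranked = sorted(REPORTED, key=REPORTED.get)

if __name__ == "__main__":
    print(f"fitted slope: {temperature_coefficient(temps, shifts):.2f} ppb/K")
    print("least to most temperature sensitive:", ranked)
```

The fitted slope is negative because the hypothetical resonance moves upfield on warming, whereas the caption quotes magnitudes. Ranking the reported values places D-*allo*-Ile<sup>4</sup>-NH and *N*-OH-Gly<sup>1</sup> as the least temperature-sensitive protons, in line with the interpretation given in the caption.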
