#### *4.3. Phylogenetic Analysis*

Genomic DNA was extracted using a rapid Chelex-100 method as described previously [89]. The 16S rRNA gene was amplified and sequenced using the universal primers 27F (5'-AGAGTTTGATCMTGGCTCAG-3') and 1492R (5'-GGTTACCTTGTTACGACTT-3'). The PCR reaction mixture (50 μL) contained 25 μL 2× supermix (TransGen Biotech, Beijing, China), 1 μL of each primer (10 mM, Sangon Biotech, Shanghai, China), 1.5 μL DNA, and 21.5 μL ddH2O. The reaction conditions were as follows: 95 °C for 3 min; 30 cycles of denaturation at 94 °C for 1 min, annealing at 60 °C for 1 min, and extension at 72 °C for 1 min; followed by a final extension at 72 °C for 10 min. The amplified products were sent to Shanghai Shenggong Company for sequencing. The sequences were compared by BLAST against the NCBI GenBank database (http://www.ncbi.nlm.nih.gov/, accessed on 11 November 2021) and the EzBioCloud database [90] to determine their similarity to type strains. Multiple alignments were generated using the Clustal\_X tool in MEGA version 7.0 [91]. A neighbor-joining phylogenetic tree was constructed under Kimura's two-parameter model [92]. Bootstrap analysis with 1000 replications was performed in MEGA version 7.0, and the tree was visualized with the Interactive Tree of Life (iTOL) web service [93].
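For readers unfamiliar with Kimura's two-parameter model, the pairwise distance it assigns to two aligned sequences can be sketched as follows. This is only an illustration of the standard formula, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions; it is not the MEGA implementation. The function name `k2p_distance` and the choice to skip gaps and ambiguous bases site by site are assumptions made for this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption of this sketch: sites containing a gap or an ambiguous
    base in either sequence are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such distances over all sequence pairs is the input that a neighbor-joining algorithm clusters into a tree.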

#### *4.4. Extracts Preparation and Bioactivity Assay*

Based on the analysis of phenotypic and phylogenetic characteristics, 179 strains were selected from the 521 isolated actinomycete strains to examine their antibacterial potential. The strains were inoculated into 100 mL ISP2 broth in 500 mL conical flasks and cultured for 7 days in a shaking incubator at 180 rpm and 28 °C. A total of 300 mL (3 × 100 mL) culture broth of each strain was pooled and centrifuged at 4200 rpm for 20 min to remove the mycelium. The supernatants were extracted three times with ethyl acetate (1:1, *v*/*v*). The organic layers were combined and evaporated to obtain crude extracts. The crude extracts were dissolved in 3 mL methanol and used for the antibacterial assay by the paper disk diffusion method.

The methanol sample (30 μL) was applied to a paper disk (6 mm diameter). Methanol (30 μL) and levofloxacin solution (10 μL, 1 mg/mL) were used as the negative and positive controls, respectively. After drying in a biosafety hood, the paper disks were transferred to agar plates seeded with pathogenic bacteria and incubated at 37 °C for 24–48 h. The antibacterial activity was evaluated by measuring the diameters of the inhibition zones with a vernier caliper. Six sets of indicator bacteria were used for the antibacterial assay: *Enterococcus* sp. (ATCC 33186 and 310682), *Staphylococcus aureus* (ATCC 29213 and ATCC 33591), *Klebsiella pneumoniae* (ATCC 10031 and ATCC 700603), *Acinetobacter baumannii* (2799 and ATCC 19606), *Pseudomonas aeruginosa* (ATCC 27853 and 2774), and *Escherichia coli* (ATCC 25922 and ATCC 35218). Their drug susceptibility was confirmed by the Beijing Key Laboratory of Antimicrobial Agents, Institute of Medicinal Biotechnology. Each set consisted of two strains: one drug-sensitive strain (listed first) and one drug-resistant strain (listed second). Isolate 310682 was resistant to vancomycin, and isolate 2774 was resistant to aminoglycosides and carbapenems. The indicator bacteria were obtained either from the American Type Culture Collection (ATCC) or as clinical isolates from hospitals in China, and they were deposited in the Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College.

#### *4.5. PCA and OPLS-DA Analysis*

Twenty-three strains with zones of inhibition larger than or equal to 10 mm against MRSA were selected for dereplication and microbial strain prioritization studies using UPLC-HRMS-PCA and UPLC-HRMS-OPLS-DA. Three biological replicates were prepared for each actinomycete strain. Each strain was cultured in triplicate for 7 days in ISP2 broth medium (3 × 100 mL) as described above. A 15 mL portion of the supernatant from each 100 mL culture broth was extracted three times (3 × 15 mL) with ethyl acetate and then dried under vacuum to obtain the crude extract. The dried crude extracts were weighed and dissolved in methanol to yield a stock solution with a concentration of 2 mg/mL. The ISP2 broth medium was used as a blank medium control. After centrifugation at 14,000 rpm for 10 min, the supernatant of the stock solution was diluted 4-fold with methanol to yield the test solution (0.05 mg/mL). A quality control (QC) sample was prepared by mixing an equal volume of each test solution (including the blank medium control). All test solutions were stored at 4 °C before analysis.

UPLC-HRMS/MS experiments were carried out on a Waters ACQUITY UPLC I-Class system coupled to a Waters Xevo G2-XS Q-TOF mass spectrometer (Waters, Manchester, UK). A Waters ACQUITY UPLC BEH C18 column (2.1 × 100 mm, 1.7 μm) maintained at 25 °C was used, and the PDA scan range was 200–800 nm. The binary mobile phase consisted of solvent A (water containing 0.1% formic acid) and solvent B (acetonitrile). The gradient elution program was as follows: 0–1 min, 10% B; 1–18 min, 10–95% B; 18–20 min, 95% B; 20–22 min, 10% B. The flow rate was 0.3 mL/min. The injection volume was 2 μL, and the QC sample was analyzed after every six injections to evaluate system stability.
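The gradient program can be read as a piecewise-linear function of time. The sketch below is only an illustration for checking the solvent composition at a given time point; the names `BREAKPOINTS` and `percent_b` are hypothetical, and the return to 10% B at 20 min is assumed to be an instantaneous step, which the text does not specify.

```python
# (time in min, % solvent B) breakpoints taken from the gradient program.
BREAKPOINTS = [(0.0, 10.0), (1.0, 10.0), (18.0, 95.0), (20.0, 95.0)]
RUN_END_MIN = 22.0
REEQUILIBRATION_B = 10.0  # 20-22 min, assumed to be an instantaneous step


def percent_b(t_min: float) -> float:
    """Return the programmed % of solvent B at time t_min (minutes)."""
    if not 0.0 <= t_min <= RUN_END_MIN:
        raise ValueError("time is outside the 0-22 min run")
    if t_min > BREAKPOINTS[-1][0]:
        return REEQUILIBRATION_B
    # Linear interpolation within the segment that contains t_min.
    for (t0, b0), (t1, b1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise RuntimeError("unreachable: breakpoints cover 0-20 min")
```

For example, `percent_b(9.5)` falls halfway along the 1–18 min ramp and evaluates to 52.5% B.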

The ESI source parameters in the positive mode were set as follows: capillary, 2 kV; sampling cone, 40 V; source offset, 80 V; source temperature, 100 °C; desolvation temperature, 250 °C; cone gas, 50 L/h; and desolvation gas, 600 L/h. The desolvation and cone gases were nitrogen, and the collision gas was argon. The MS<sup>E</sup> (data-independent acquisition) data were acquired in continuum format over a mass range of 100–2000 Da in both the low-energy (function 1) and high-energy (function 2) scan functions. For function 1, the collision energy was 2 V. For function 2, a collision energy ramp of 40–80 V was used. The scan time was 0.10 s. Mass accuracy was maintained by using a lock spray with leucine–enkephalin ([M+H]+ = 556.2771 Da) at a concentration of 2 ng/mL and a flow rate of 5 μL/min as the reference. The run sequence started with a blank solvent, then a blank medium, followed by the samples. Instrument control and data acquisition were performed with MassLynx V4.1 software (Waters, Milford, CT, USA).

The acquired raw data from 87 samples, including 69 test samples, six blank samples (blank solvent and blank medium), and 12 QC injections, were all imported into Progenesis QI 3.0 software (Waters, Milford, CT, USA) for chromatographic peak alignment, experimental design setup, peak picking, normalization, deconvolution, compound identification, and compound review. The imported 87 runs were aligned on the basis of an automatically selected QC sample. The retention time range for peak picking was set to 0–20 min, and the limits and sensitivity were left at their defaults. The adduct ion forms [M+H-H2O]+, [M+H]+, [M+NH4]+, [M+Na]+, [M+K]+, [2M+H]+, and [2M+Na]+ were added to deconvolute the spectral data. After automatic processing of all compounds in all samples, a data matrix containing sample code, RT, *m/z*, and normalized abundance was generated. In the blank samples, features with the most abundance and an abundance 20 times less than the 69 test samples were hidden manually to remove medium and blank effects for cleaner data [20,71]. The obtained data were exported into the extended statistics module EZinfo 3.0 (Umetrics, Umeå, Sweden) for PCA and OPLS-DA analyses. The significantly differential retention time-observed mass (RT-*m/z*) or retention time-neutral mass (RT-*m/z*) pairs in the loadings plot and S-plot were selected and imported back into Progenesis QI for compound identification. After filtering by ANOVA *p*-value ≤ 0.05, *q* value ≤ 0.05, and max fold change ≥ 2, the filtered pairs were identified using the search method in Progenesis QI (search parameters: precursor tolerance 10 ppm and theoretical fragment tolerance 10 ppm). The Natural Product Atlas v19\_12 and StreptomeDB v3.0 databases were used as in-house libraries for the dereplication of the differential metabolites in the samples.
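The statistical filter and the 10 ppm precursor tolerance described above amount to simple threshold tests. The sketch below shows one way to express them outside Progenesis QI; it is not the software's own code. The `Feature` class, its field names, and the two example features are hypothetical and serve only to make the thresholds concrete.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One RT-m/z pair with its statistics (hypothetical field names)."""
    rt_min: float
    mz: float
    anova_p: float
    q_value: float
    max_fold_change: float


def passes_filters(f: Feature, p_max: float = 0.05,
                   q_max: float = 0.05, fc_min: float = 2.0) -> bool:
    """Apply the thresholds: p <= 0.05, q <= 0.05, max fold change >= 2."""
    return (f.anova_p <= p_max
            and f.q_value <= q_max
            and f.max_fold_change >= fc_min)


def ppm_window(mz: float, tol_ppm: float = 10.0) -> tuple[float, float]:
    """Return the (low, high) m/z bounds for a tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


# Made-up example features, not data from this study.
features = [
    Feature(rt_min=10.2, mz=877.35, anova_p=0.001, q_value=0.004, max_fold_change=35.0),
    Feature(rt_min=3.1, mz=301.14, anova_p=0.20, q_value=0.31, max_fold_change=1.4),
]
selected = [f for f in features if passes_filters(f)]
```

At *m/z* 1000, a 10 ppm tolerance corresponds to a window of ±0.01 Da, so `ppm_window(1000.0)` spans roughly 999.99 to 1000.01.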

#### *4.6. Molecular Network Analysis*

The UPLC and ESI source parameters were the same as described above. DDA was also performed in positive ion mode. The full MS survey scan was acquired for 0.2 s over the range of 100–2000 Da, and MS/MS scans covered a mass range of 50–2000 Da with the same scan time. The five most intense ions were chosen for MS/MS fragmentation. The collision energy ramp was set to 20–40 V for low-mass collision energy (LM CE) and 60–80 V for high-mass collision energy (HM CE). Automatic switching to MS/MS mode was enabled when the TIC intensity rose above 10,000 counts and switched off when 0.4 s had elapsed or the TIC intensity reached 1,000,000 counts. A tolerance window of ±3.0 Da was set in the deisotope peak detection mode. Dynamic peak exclusion was enabled, so that a precursor was excluded for 3.0 s after acquisition. The fixed peak exclusion list was as follows: *m/z* 205.0877, 255.1581, 279.1591, 301.1425, 371.3183, 579.2933. Raw data files obtained from the DDA acquisition were converted to 32-bit mzXML format with MSConvert [94] and then uploaded to the GNPS web platform (http://gnps.ucsd.edu, accessed on 11 November 2021) for dereplication and molecular network construction.

A molecular network was created using the online workflow on the GNPS website (https://ccms-ucsd.github.io/GNPSDocumentation/, accessed on 11 November 2021). The precursor ion mass tolerance was set to 0.1 Da and the MS/MS fragment ion tolerance to 0.1 Da. A network was then created in which edges were filtered to have a cosine score above 0.6 and more than 4 matched peaks. Furthermore, an edge between two nodes was kept in the network only if each node appeared in the other's top 10 most similar nodes. Finally, the maximum size of a molecular family was set to 100, and the lowest-scoring edges were removed from molecular families until the family size was below this threshold. The spectra in the network were then searched against the GNPS spectral libraries. The library spectra were filtered in the same manner as the input data. All matches kept between network spectra and library spectra were required to have scores above 0.6 and at least 3 matching peaks. The generated molecular network was visualized using Cytoscape 3.7.1 [95].
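The edge-filtering rules in this paragraph can be sketched on a list of pre-computed spectral pairs. The code below is not the GNPS implementation: the function names, the edge tuple layout `(node_a, node_b, cosine, matched_peaks)`, the order in which the rules are applied, and the reading of the size limit as "at most 100 nodes" are all assumptions of this sketch.

```python
from collections import defaultdict


def filter_edges(edges, min_cosine=0.6, min_matched=4, top_k=10):
    """Keep edges above both thresholds that are mutual top-k neighbours.

    `edges` is a list of (node_a, node_b, cosine, matched_peaks) tuples.
    """
    kept = [e for e in edges if e[2] > min_cosine and e[3] > min_matched]

    neighbours = defaultdict(list)
    for a, b, cosine, _ in kept:
        neighbours[a].append((cosine, b))
        neighbours[b].append((cosine, a))
    top = {
        node: {m for _, m in sorted(lst, key=lambda x: x[0], reverse=True)[:top_k]}
        for node, lst in neighbours.items()
    }
    return [e for e in kept if e[1] in top[e[0]] and e[0] in top[e[1]]]


def components(edges):
    """Return the connected components (sets of nodes) of the edge list."""
    adjacency = defaultdict(set)
    for a, b, *_ in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adjacency[node] - comp)
        seen |= comp
        result.append(comp)
    return result


def limit_family_size(edges, max_size=100):
    """Drop the lowest-scoring edge of each oversized family until all fit."""
    edges = list(edges)
    while True:
        oversized = [c for c in components(edges) if len(c) > max_size]
        if not oversized:
            return edges
        for comp in oversized:
            weakest = min((e for e in edges if e[0] in comp), key=lambda e: e[2])
            edges.remove(weakest)
```

Calling `limit_family_size(filter_edges(edges))` applies the three rules in sequence. Nodes that lose all of their edges become singletons and no longer appear in the returned edge list.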

#### *4.7. Scale-Up Fermentation, Extraction, and Purification of Natural Products*

*Streptomyces* sp. M22 was grown and maintained on an ISP2 agar plate at 28 °C for 7–10 days. The spores of the strain were inoculated into 500 mL Erlenmeyer flasks containing 100 mL of ISP2 medium and grown at 28 °C and 180 rpm for 2 days as seed cultures. Then, each seed culture (100 mL) was inoculated into an autoclaved 5 L Erlenmeyer flask containing 1 L ISP2 medium. The flasks were incubated at 28 °C for 7 days on a rotary shaker (180 rpm). The total 18 L (18 × 1 L) of fermentation broth was centrifuged at 4300 rpm for 20 min, and the supernatant was extracted three times with ethyl acetate (18 L each time) to give an organic extract. After three rounds of fermentation, the combined organic extract (5.5 g) was subjected to MPLC column chromatography eluted with MeOH-H2O (10:90, 30:70, 50:50, 70:30, 90:10, 100:0, *v*/*v*) to obtain six subfractions (Fr.01–Fr.06) based on LC-MS analysis. Fraction 05 was further separated on Sephadex LH-20 (CH2Cl2-MeOH, 1:1, *v*/*v*) to yield five subfractions (Fr.05a–Fr.05e). Fr.05b was subjected to semi-preparative HPLC (ACN-H2O, 32:68, *v*/*v*, 0–5 min; 32:68–52:48, *v*/*v*, 5–40 min; 3.0 mL/min) to yield gutingimycin B (**16**, 6.0 mg), gutingimycin (**12**, 15.6 mg), and semi-pure trioxacarcin G. The semi-pure trioxacarcin G was further fractionated by semi-preparative HPLC using MeOH-H2O (55:45, *v*/*v*) to yield pure trioxacarcin G (**20**, 8.0 mg).

Gutingimycin B (**16**): yellow amorphous powder. [*α*]<sup>25</sup><sub>D</sub> −49.4° (*c* 0.02, ACN); UV (MeOH) λmax (log ε) 274 (4.83), 408 (4.23) nm; IR *ν*max: 3366, 2932, 2853, 1689, 1628, 1386, 1223, 1089, 999 cm<sup>−1</sup>; 1H NMR (CDCl3, 600 MHz) and 13C NMR (CDCl3, 150 MHz), see Table 4; HRESIMS: *m/z* 1030.3778 [M+H]+ (calcd for C47H60N5O21, 1030.3781).

Trioxacarcin G (**20**): yellow amorphous powder. [*α*]<sup>25</sup><sub>D</sub> −140.0° (*c* 0.02, ACN); UV (MeOH) λmax (log ε) 232 (4.19), 271 (4.25), 400 (3.71) nm; IR *ν*max: 3439, 2933, 2851, 1730, 1623, 1384, 1233, 1085, 998 cm<sup>−1</sup>; 1H NMR (CDCl3, 600 MHz) and 13C NMR (CDCl3, 150 MHz), see Table 4; HRESIMS: *m/z* 914.3644 [M+NH4]+ (calcd for C42H60NO21, 914.3658).
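The calculated HRESIMS values for the two ion formulas can be reproduced by summing monoisotopic atomic masses. The sketch below assumes standard monoisotopic masses for C, H, N, and O and, as a further assumption, does not subtract the electron mass; under those assumptions it reproduces both calculated values to four decimals. The function names `formula_mass` and `ppm_error` are illustrative.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def formula_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C47H60N5O21'.

    Assumption of this sketch: no electron-mass correction is applied
    for the positive charge.
    """
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(observed: float, calculated: float) -> float:
    """Mass error of an observed m/z relative to the calculated value, in ppm."""
    return (observed - calculated) / calculated * 1e6


calcd_16 = formula_mass("C47H60N5O21")   # [M+H]+ of gutingimycin B
calcd_20 = formula_mass("C42H60NO21")    # [M+NH4]+ of trioxacarcin G
error_16 = ppm_error(1030.3778, calcd_16)
error_20 = ppm_error(914.3644, calcd_20)
```

With these inputs, `calcd_16` rounds to 1030.3781 and `calcd_20` to 914.3658, and both observed values lie within 2 ppm of the calculated masses.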
