#### *2.1. Phylogenetic Analysis*

Twenty-five strains, 14 Antarctic and 11 Arctic, were selected from a larger collection of Polar marine sediment bacteria consisting mainly of Actinobacteria [11], based on taxonomy, isolation location (depths ranging from 388 to 4730 m) and the previously known metabolic profiles of the strains. The selected strains belonged to seven rare actinomycete genera: *Pseudonocardia* (eight strains), *Micrococcus* (seven strains), *Rhodococcus* (three strains), *Microbacterium* (three strains), *Kocuria* (one strain), *Agrococcus* (one strain) and *Dietzia* (one strain). This genus-level delineation was well supported, with bootstrap values over 67%, and all strains clustered with their respective reference sequences. Additionally, one strain of the phylum Proteobacteria, belonging to the genus *Halomonas*, was included in the study (Figure 1, Table S1). In terms of core depth (Figure 1), the two *Agrococcus*, one *Kocuria* and eight *Pseudonocardia* strains were isolated only from the deepest sediment cores, at depths greater than 4000 m, and the one *Halomonas* and two *Microbacterium* strains were isolated only from core depths of 1000–4000 m, whereas no pattern was observed for the genera *Micrococcus* and *Rhodococcus*. Although these observations are interesting, larger strain numbers would be required to draw statistically supported conclusions.

**Figure 1.** Maximum likelihood tree based on 16S rRNA gene sequences of 25 strains isolated from Antarctic (blue) and Arctic (green) sediment samples. Strain numbers are followed by a symbol indicating the depth at which the sediment samples were collected: - 300–1000 m, • 1000–4000 m, ♦ > 4000 m and - N/A (information missing). The accession number is shown in brackets following the strain name.

#### *2.2. Genome Mining*

Genome assembly was carried out using SPAdes and, owing to the large number of contigs obtained, MeDuSa was used for genome scaffolding, with reference strains showing >95% similarity based on 16S rRNA gene sequence data (Table S2). No reference strains with >95% sequence similarity could be identified for the *Pseudonocardia* strains; therefore, these strains were excluded from genome mining to avoid possible discrepancies.
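The strain bookkeeping behind this exclusion can be checked with a short tally. The sketch below is only an arithmetic illustration: it uses the per-genus counts listed in Section 2.1 and assumes that the single *Halomonas* strain is counted among the 25 selected strains. The variable names are illustrative and not part of the study's workflow.

```python
# Per-genus strain counts as listed in Section 2.1.
# Assumption: the single Halomonas strain is included in the total of 25.
strains_per_genus = {
    "Pseudonocardia": 8,
    "Micrococcus": 7,
    "Rhodococcus": 3,
    "Microbacterium": 3,
    "Kocuria": 1,
    "Agrococcus": 1,
    "Dietzia": 1,
    "Halomonas": 1,  # the only Proteobacteria strain
}

total_strains = sum(strains_per_genus.values())

# Pseudonocardia strains lacked a reference with >95% 16S similarity,
# so they are dropped before genome mining.
mined = {g: n for g, n in strains_per_genus.items() if g != "Pseudonocardia"}
mined_strains = sum(mined.values())

print(f"selected: {total_strains}, carried into genome mining: {mined_strains}")
```

Under these assumptions, the tally gives 25 selected strains and 17 strains carried forward into genome mining.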

Genome mining of the seventeen Polar strains, comprising the rare Actinobacteria and the Proteobacteria strain (*Pseudonocardia* strains excluded), revealed a total of 133 BGCs, including NRPS, PKS, terpene and RiPP classes. Interestingly, 37% of the total BGCs showed no homology to any BGC within the MIBiG database and a further 30% showed homology ranging from 2% to 63% to BGCs of known antibiotics. The biosynthetic diversity per strain is shown in the Circos diagram (Figure 2). The width of each band indicates the number of BGCs within each natural product (NP) class; as expected, the number of BGCs was positively correlated with genome size. The lowest numbers of BGCs were observed for small genomes such as those of *Micrococcus* (2.4–2.7 Mbp) and *Agrococcus* (3.1 Mbp), both of which had five BGCs, whereas the three *Rhodococcus* strains (5.4–6.7 Mbp) had the largest numbers (17–22 BGCs). Moreover, strains belonging to the same genus showed BGCs of the same NP classes (Table S3). The ectoine pathway was observed in all genomes except those of the *Kocuria* (KRD140) and the *Microbacterium* (KRD174) strains. A similar pattern was observed for the terpene BGC, which was present in the genomes of all Polar isolates except the *Halomonas* strain (KRD171) (Table S3). NRPS was the most abundant NP class, but these BGCs were not evenly distributed among the strains: the smaller genomes, such as those of *Micrococcus* and *Microbacterium*, did not contain any NRPS BGCs, although at least one NRPS-like fragment was identified in their genomes. On the other hand, larger genomes such as those of the *Rhodococcus* strains contained a high number of NRPS BGCs (up to 10). Identification of siderophores based on bioinformatic analysis can often be challenging, as many siderophores are produced through NRPS pathways and antiSMASH therefore identifies them as NRPS rather than as siderophores [8,43]. Indeed, antiSMASH identified two NRPS clusters, in one *Halomonas* strain (KRD171) and one *Rhodococcus* strain (KRD197), that have 53% and 63% gene homology to the serobactin and heterobactin siderophore pathways, respectively. Only 10 BGCs belonging to the PKS family were observed; of these, five were identified as Type I PKS, one as Type II PKS, three as Type III PKS and one as a heterocyst glycolipid synthase-like PKS (hglE-KS).
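To put the reported percentages into absolute terms, the following sketch converts them into approximate BGC counts and checks that the PKS subtypes add up to the stated total. Because the percentages in the text are rounded, the derived counts are approximations rather than values reported by the study; the variable names are illustrative.

```python
total_bgcs = 133  # total BGCs reported for the 17 strains

# Percentages in the text are rounded, so these counts are approximate.
no_mibig_homology = round(0.37 * total_bgcs)      # no homology to any MIBiG BGC
homology_to_known = round(0.30 * total_bgcs)      # 2-63% homology to known antibiotics
remaining = total_bgcs - no_mibig_homology - homology_to_known

# PKS subtypes as listed in the text.
pks_subtypes = {
    "Type I": 5,
    "Type II": 1,
    "Type III": 3,
    "heterocyst glycolipid synthase-like": 1,
}
pks_total = sum(pks_subtypes.values())

print(f"~{no_mibig_homology} BGCs without MIBiG homology")
print(f"~{homology_to_known} BGCs with 2-63% homology to known antibiotics")
print(f"~{remaining} remaining BGCs")
print(f"{pks_total} PKS BGCs in total")
```

This gives roughly 49 BGCs with no homology to the database, roughly 40 with low to moderate homology, and confirms that the four PKS subtypes sum to the 10 PKS BGCs stated above.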

**Figure 2.** BGC diversity by NP class and strain taxonomy across 17 Polar strains. The band colour depicts taxonomy at the genus level: *Micrococcus* spp. (green), *Kocuria* sp. (purple), *Rhodococcus* spp. (orange), *Halomonas* sp. (dark blue), *Microbacterium* spp. (light blue), *Agrococcus* sp. (teal) and *Dietzia* sp. (grey). Each coloured band can be traced from the organism (right half of the circle) to the types of BGCs found in that genome (left half of the circle). The width of the band represents the number of BGCs of that NP class. The outer rings on the left of the diagram show the number of BGCs of each type found in each microbial genome. BGCs are colour coded based on the NP classification.

The BGCs of the 17 Polar strains were further analysed using BiG-SCAPE, which resulted in 80 GCFs, 46% of which were singletons. As expected, BGCs from strains belonging to the same genus clustered in the same GCF. For example, the ectoine BGCs present in the eight *Micrococcus* strains clustered in one GCF and the ectoine BGCs of the three *Rhodococcus* strains formed an additional GCF (green circles in Figure 3). The three *Rhodococcus* strains had a terpene BGC which showed low homology (6%) to the SF2575 BGC from the soil-derived *Streptomyces* sp. SF2575 [44] (blue circle in Figure 3). Although the homology with the known biosynthetic pathway is low, the fact that this BGC is shared by all three *Rhodococcus* strains suggests that the strains may produce the same metabolite(s), or metabolite(s) similar to the tetracycline antibiotic SF2575 [45]. Additionally, the three *Rhodococcus* strains (KRD12, KRD175, KRD197) contained an NRPS BGC with low homology (11%) to the chloramphenicol BGC from *Streptomyces venezuelae* ATCC 10712 [46]. Interestingly, only the gene clusters of KRD175 and KRD162 were grouped in the same family, whereas the corresponding BGC of KRD197 appeared as a singleton (red circles in Figure 3), even when BiG-SCAPE was run with a high cutoff (0.7). Further investigation of the antiSMASH data showed that the predicted metabolites for the NRPS genes of interest could be cyclic lipopeptides, which often exhibit antibiotic properties [47].
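The effect of a distance cutoff on family assignment can be illustrated with a toy example. The sketch below is not BiG-SCAPE's implementation: it simply links any two BGCs whose pairwise distance is at or below the cutoff and reports the resulting connected components. The function `group_by_cutoff`, the placeholder BGC names and all distance values are hypothetical and chosen only to show how one related BGC can remain a singleton at a cutoff of 0.7.

```python
from itertools import combinations


def group_by_cutoff(items, distance, cutoff):
    """Group items into connected components, linking pairs with distance <= cutoff."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(items, 2):
        if distance[frozenset((a, b))] <= cutoff:
            parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for x in items:
        groups.setdefault(find(x), []).append(x)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical placeholder BGCs and distances (not values from this study).
bgcs = ["bgc_A", "bgc_B", "bgc_C"]
distance = {
    frozenset(("bgc_A", "bgc_B")): 0.40,
    frozenset(("bgc_A", "bgc_C")): 0.85,
    frozenset(("bgc_B", "bgc_C")): 0.90,
}

families = group_by_cutoff(bgcs, distance, cutoff=0.7)
print(families)
```

With these made-up distances, `bgc_A` and `bgc_B` fall into one family while `bgc_C` stays on its own, mirroring the situation in which two of the *Rhodococcus* NRPS BGCs were grouped and the third was reported as a singleton.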

**Figure 3.** BiG-SCAPE analysis of 17 strains. The 133 BGCs (53 NRPS/NRPS-like, 20 terpene, 10 PKS, 3 RiPP and 47 others, including BGCs such as ectoine, siderophore and betalactone) were clustered into 80 GCFs. Examples of GCFs of interest are highlighted with coloured circles: red represents the *Rhodococcus* spp. NRPS BGC corresponding to a potentially new cyclic lipopeptide, green represents the ectoine pathway found in all strains (here highlighted for the *Micrococcus* and *Microbacterium* spp.) and blue represents the terpene BGC found in all three *Rhodococcus* strains with low homology to the BGC of the known antibiotic SF2575.
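As a quick consistency check on the numbers in this caption and the accompanying text, the class counts can be summed and the singleton percentage converted into an approximate number of families. This is a minimal arithmetic sketch; the singleton count is derived from the rounded 46% given in the text and is therefore approximate.

```python
# BGC counts per class as given in the Figure 3 caption.
class_counts = {
    "NRPS/NRPS-like": 53,
    "Terpene": 20,
    "PKS": 10,
    "RiPP": 3,
    "Other": 47,
}
total_bgcs = sum(class_counts.values())

n_gcfs = 80
# 46% of GCFs were reported as singletons (rounded, so the count is approximate).
approx_singleton_gcfs = round(0.46 * n_gcfs)
mean_bgcs_per_gcf = total_bgcs / n_gcfs

print(f"total BGCs: {total_bgcs}")
print(f"approximate singleton GCFs: {approx_singleton_gcfs}")
print(f"mean BGCs per GCF: {mean_bgcs_per_gcf:.2f}")
```

The class counts sum to the 133 BGCs reported, about 37 of the 80 GCFs are singletons, and each GCF contains on average fewer than two BGCs.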
