#### **1. Introduction**

Ubiquitous in soils and marine sediments, bacteriovorous myxobacteria display organized social behaviors and predation strategies [1–4]. Perhaps intrinsic to their role as predators, myxobacteria are a critical source of diverse secondary metabolites that exhibit unique modes of action across a broad range of biological activities [5]. Distinct from other bacterial sources, the vast majority of the 60 species within the order Myxococcales produce natural products [5,6]. This abundance of secondary metabolite-producing representatives has established myxobacteria as a prolific resource for drug discovery efforts, perhaps second only to Actinomycetales [7,8]. Bolstered by the observed lack of overlap between actinomycetal and myxobacterial drug-like metabolites, the potential to discover novel specialized metabolites from myxobacteria remains high [7,8]. Herein, we report a survey of all myxobacterial natural product biosynthetic gene clusters (BGCs) deposited in the antiSMASH database and provide an account of all BGCs with and without characterization and assigned metabolites, in an effort to assess the capacity for discovery from readily cultivable, sequenced myxobacteria [9,10]. Such analysis provides an assessment of the potential associated with continued discovery efforts as well as with the development and application of methodologies to activate situational or cryptic secondary metabolism that is not functional during axenic cultivation [11,12]. A homology network of 994 BGCs from 36 sequenced myxobacterial genomes was constructed using the combined BiG-SCAPE and CORe Analysis of Syntenic Orthologues to prioritize Natural products biosynthetic gene clusters (CORASON) platform [13]. BiG-SCAPE facilitates the exploration of calculated BGC sequence similarity networks and provides the opportunity to visualize biosynthetic diversity across datasets [13]. Gene cluster families (GCFs) rendered by BiG-SCAPE are connected by edges that indicate shared domain types, sequence similarity, and similarity of domain pair-types amongst input BGCs [13]. Comparative analysis against the Minimum Information about a Biosynthetic Gene Cluster (MIBiG) repository (v1.4) indicates an untapped reservoir of BGCs that encompasses a broad range of biosynthetic diversity [14]. The 36 Myxococcales within the antiSMASH database currently span all 3 suborders, with 26 Cystobacterineae, 7 Sorangineae, and 3 Nannocystineae included. Considering that the myxobacteria within the antiSMASH database minimally represent the breadth of the order Myxococcales, these observations support not only thorough investigation of identified myxobacteria and the presented biosynthetic space but also continued efforts toward the identification and subsequent exploration of new myxobacteria [1,3].
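As a rough illustration of how edges and singletons arise in such a network, the sketch below thresholds a pairwise distance between toy BGCs. It is not the BiG-SCAPE algorithm: only a Jaccard distance over shared domain types is used here, whereas BiG-SCAPE combines several indices, and the BGC names and domain labels are hypothetical.

```python
from itertools import combinations

# Hypothetical BGCs, each reduced to a set of illustrative domain labels.
bgcs = {
    "bgc_A": {"KS", "AT", "ACP", "KR", "DH"},
    "bgc_B": {"KS", "AT", "ACP", "KR", "DH", "ER"},
    "bgc_C": {"C", "A", "PCP"},
}

CUTOFF = 0.3  # distance cutoff; same value as the default quoted in the Methods


def jaccard_distance(a, b):
    """1 minus the fraction of domain types shared by two BGCs."""
    return 1.0 - len(a & b) / len(a | b)


# Keep an edge only when the pairwise distance falls below the cutoff.
edges = []
for x, y in combinations(sorted(bgcs), 2):
    d = jaccard_distance(bgcs[x], bgcs[y])
    if d < CUTOFF:
        edges.append((x, y, d))

# A singleton is a BGC that takes part in no edge.
connected = {name for x, y, _ in edges for name in (x, y)}
singletons = sorted(set(bgcs) - connected)

print(edges)
print(singletons)
```

In this toy case the two polyketide-like clusters share five of six domain types and are linked, while the third cluster shares nothing with either and remains a singleton.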

#### **2. Materials and Methods**

Dataset. All BGCs associated with the order Myxococcales, a total of 994 BGCs from 36 myxobacteria, were downloaded as .gbk files from the publicly available antiSMASH database (https://antismash-db.secondarymetabolites.org) [9]. The original genome sequence data for all included myxobacteria are also publicly available and can be accessed at the National Center for Biotechnology Information, U.S. National Library of Medicine (https://www.ncbi.nlm.nih.gov/genome/browse#!/prokaryotes/myxobacteria).

BiG-SCAPE-CORASON analysis. BiG-SCAPE version 20181005 (available at: https://git.wageningenur.nl/medema-group/BiG-SCAPE) was run locally to analyze the 994 BGCs as individual .gbk files downloaded from the antiSMASH database on 30 January 2019 [9,13]. BiG-SCAPE analysis was supplemented with Pfam database version 31 [15]. The singleton parameter in BiG-SCAPE was selected to ensure that singletons, i.e., BGCs with no distance to another BGC below the default cutoff of 0.3, were included in the corresponding output data. The MIBiG parameter in BiG-SCAPE was set to include the MIBiG repository version 1.4 of annotated BGCs [14]. The hybrids-off parameter was selected to prevent hybrid BGC redundancy. Generated network files separated by BiG-SCAPE class were combined for visualization using Cytoscape version 3.7.1; annotations associated with each BGC were added to the Cytoscape networks by importing curated tables generated by BiG-SCAPE [16]. Phylogenetic trees provided by CORASON were generated during BiG-SCAPE analysis. All BGCs with sequence similarities of ≥75% to deposited MIBiG clusters were indicated and annotated using Cytoscape. An annotated Cytoscape (.cys) file and all associated .network and .tsv files, including GCF associations, are provided as Supplementary Materials. All histograms were generated using GraphPad Prism version 7.0d for Mac OS X (GraphPad Software, San Diego, California, USA; www.graphpad.com).
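For readers wishing to reproduce a comparable run, the parameters described above might translate into a command along the following lines. This is a sketch under stated assumptions, not the exact invocation used in this study: the flag spellings follow the BiG-SCAPE 1.x command-line interface as we understand it and should be checked against the help output of the installed version, and all directory names are hypothetical.

```python
import shlex


def build_bigscape_command(input_dir, output_dir, pfam_dir, cutoff=0.3):
    """Assemble an argument list for a BiG-SCAPE run (flag names are assumptions)."""
    return [
        "python", "bigscape.py",
        "-i", input_dir,            # folder of antiSMASH .gbk files
        "-o", output_dir,           # folder for network and table output
        "--pfam_dir", pfam_dir,     # location of the Pfam HMM database
        "--cutoffs", str(cutoff),   # distance cutoff for drawing edges
        "--include_singletons",     # keep BGCs that have no edges
        "--mibig",                  # add MIBiG reference BGCs to the network
        "--hybrids-off",            # do not duplicate hybrid BGCs across classes
    ]


# Hypothetical directory names, for illustration only.
cmd = build_bigscape_command("antismash_db_gbk", "bigscape_out", "pfam_31")

# Print a copy-pastable shell command; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so the same list can later be passed to `subprocess.run` without shell quoting issues.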

#### **3. Results**

##### *3.1. BiG-SCAPE Analysis of BGCs from Sequenced Myxobacteria*

A sequence similarity network calculated using BiG-SCAPE consisted of 994 total BGCs as unique nodes from 36 myxobacteria and included 1035 edges (including self-loops) representing homology across 753 GCFs (Figure 1). Of these 994 BGCs from the antiSMASH database, a total of 124 were determined by antiSMASH to be located on contig edges. Clusters on contig edges could contribute to redundancy within our analysis. While no two BGCs from an individual myxobacterium were found within the same GCF, this does not preclude a single BGC split across multiple contigs from being included multiple times. A total of 613 singletons without homology at a similarity cutoff of 0.30 were also included in the network to appropriately depict all myxobacterial BGCs within the antiSMASH database [9,13]. Predicted BGC classes included 64 type I or modular polyketide synthases (t1PKS), 57 PKSs categorized by antiSMASH as "PKSother", which includes all non-modular categories of PKSs, 125 nonribosomal peptide synthetases (NRPS), 166 hybrid PKS-NRPS, 245 ribosomally synthesized and post-translationally modified peptides (RiPPs), 149 terpene clusters, 3 saccharide clusters, and 185 clusters not belonging to any of the aforementioned classes, which antiSMASH categorizes as "Others" clusters [9,10].
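The class totals reported above can be checked for internal consistency with a short tally. The sketch below uses only the counts stated in this section; the dictionary keys and variable names are ours and serve illustration only.

```python
# BGC counts per class, as reported in the text.
class_counts = {
    "t1PKS": 64,
    "PKSother": 57,
    "NRPS": 125,
    "PKS-NRPS hybrid": 166,
    "RiPP": 245,
    "Terpene": 149,
    "Saccharide": 3,
    "Others": 185,
}

TOTAL_BGCS = 994        # nodes in the network
CONTIG_EDGE_BGCS = 124  # BGCs flagged by antiSMASH as lying on a contig edge
SINGLETONS = 613        # BGCs without an edge at the 0.30 cutoff


def percent(part, whole):
    """Share of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The eight class counts should account for every BGC in the network.
total = sum(class_counts.values())
assert total == TOTAL_BGCS

for name, count in class_counts.items():
    print(f"{name:16s} {count:4d}  {percent(count, TOTAL_BGCS):5.1f}%")

contig_edge_pct = percent(CONTIG_EDGE_BGCS, TOTAL_BGCS)
singleton_pct = percent(SINGLETONS, TOTAL_BGCS)
print(f"on contig edges: {contig_edge_pct}%")
print(f"singletons:      {singleton_pct}%")
```

The eight classes sum to 994, roughly one in eight BGCs lies on a contig edge, and a little over 60% of the BGCs are singletons at this cutoff.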

**Figure 1.** Sequence similarity network of 994 myxobacterial BGCs deposited in the antiSMASH database, generated by BiG-SCAPE and rendered with Cytoscape [9,10,13,14,16]. All GCFs that include at least 1 BGC with sequence similarity of ≥75% to a characterized cluster deposited in the MIBiG repository are boxed in grey (excluding 25 geosmin BGCs) [9,14]. Totals for BGC class diversity and for BGCs with and without homology to MIBiG clusters (including 25 geosmin BGCs identified as 22 Terpene and 3 Other clusters), as well as a color reference, are provided (right).

While hybrid PKS-NRPS pathways that include both PKS and NRPS domains are organized into a specific separate grouping, all other hybrid pathways that include more than one BGC type are categorized in the Others class [9,13]. The 185 Others-associated BGCs included 133 clusters with predicted products as well as 52 hybrid BGCs (Figure 2). This breadth of biosynthetic diversity from just 36 myxobacteria includes 23 of the 52 BGC types currently designated by antiSMASH [9,10].
