*2.3. Numerous P450s Involved in Secondary Metabolite Production in Streptomyces Compared to Other Bacterial Species*

Analysis of the genomes of 144 *Streptomyces* species revealed 4457 BGCs (Table 2 and Supplementary Table S4). This number was higher than that found in mycobacterial, *Bacillus*, and cyanobacterial species (Table 2), consistent with the prominence of *Streptomyces* species as producers of secondary metabolites; two-thirds of the antibiotics currently used worldwide come from these species [28]. The average number of BGCs in *Streptomyces* species was double that in mycobacterial species and close to four times that in *Bacillus* and cyanobacterial species (Table 2). Analysis of BGCs revealed that a larger proportion of P450s are part of BGCs in *Streptomyces* species than in other bacterial species: 1231 P450s in *Streptomyces* species compared to 204 in mycobacterial species, 112 in *Bacillus* species, and 27 in cyanobacterial species (Table 2). These 1231 P450s belonged to 135 P450 families (Figure 4 and Supplementary Table S5). Among these families, CYP107 was the most prevalent in BGCs, followed by CYP105, CYP157, and CYP154 (Figure 4 and Supplementary Table S5). This suggests that the P450 families that have bloomed in *Streptomyces* species are involved in the production of secondary metabolites, and it supports the proposed hypothesis that P450s in *Streptomyces* species have evolved to generate secondary metabolites, thus helping these bacteria to thrive in their environment [20]. To assess these *in silico* predictions, we performed an extensive literature review to identify *Streptomyces* P450s with a reported role in secondary metabolite production. As shown in Table 4, a large number of P450s belonging to different P450 families, as predicted in this study, were found to be involved in the production of different secondary metabolites. This supports the notion that the P450s identified as part of different BGCs in this study are involved in secondary metabolite production.

Analysis of the BGCs containing P450s revealed 235 types of BGCs, among which terpene was dominant, followed by T1PKS, NRPS, and T3PKS (Figure 4 and Supplementary Table S5). A detailed analysis of the P450s that are part of BGCs and of the types of BGCs containing P450s is presented in Supplementary Table S5. Analysis of the linkage between P450 families and BGC types revealed that some P450s are linked to a particular BGC type (Supplementary Table S4), suggesting horizontal transfer of BGCs between *Streptomyces* species. For example, CYP283A is linked to bacteriocin and bottromycin, CYP113K3 to bacteriocin-NRPS, CYP124G to melanin, and CYP105A to NRPS and butyrolactone. Notably, horizontal transfer of BGCs among different organisms is well documented in the literature [37].
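The linkage analysis described above can be illustrated with a minimal sketch. It assumes that the P450-to-BGC assignments are available as simple (P450 name, BGC type) pairs; the function names, the `max_types` threshold, and the `CYP_hypothetical` entry are illustrative assumptions, and the toy records only restate the linkages named in the text rather than the data in the supplementary tables.

```python
from collections import Counter, defaultdict

# Toy records for illustration only (not the study's data set).
# Each pair is (P450 name, BGC type); "CYP_hypothetical" is a made-up
# placeholder for a P450 that occurs in many different BGC types.
records = [
    ("CYP283A", "bacteriocin"),
    ("CYP283A", "bottromycin"),
    ("CYP113K3", "bacteriocin-NRPS"),
    ("CYP124G", "melanin"),
    ("CYP105A", "NRPS"),
    ("CYP105A", "butyrolactone"),
    ("CYP_hypothetical", "terpene"),
    ("CYP_hypothetical", "T1PKS"),
    ("CYP_hypothetical", "NRPS"),
    ("CYP_hypothetical", "T3PKS"),
]


def bgc_types_by_p450(pairs):
    """Map each P450 name to a Counter of the BGC types it occurs in."""
    table = defaultdict(Counter)
    for p450, bgc_type in pairs:
        table[p450][bgc_type] += 1
    return dict(table)


def linked_p450s(table, max_types=2):
    """Return P450s found in at most `max_types` distinct BGC types.

    The threshold is an assumption chosen for this sketch; the source
    does not state a numerical criterion for calling a P450 "linked".
    """
    return {
        p450: sorted(counts)
        for p450, counts in table.items()
        if len(counts) <= max_types
    }


table = bgc_types_by_p450(records)
for p450, bgc_types in linked_p450s(table).items():
    print(p450, "->", ", ".join(bgc_types))
```

With real data, each record would also carry the species of origin, so that a P450 and BGC type recurring together across several species could be distinguished from a single observation.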



#### **Table 4.** *Cont.*


Note: For some P450s, protein notations are given in parentheses. These P450s were annotated in this study (indicated with a superscript asterisk) or previously (indicated with an exclamation mark) [20] by browsing the individual biosynthetic gene-cluster sequences reported in the literature. The protein notations are provided so that readers can match the P450s with the published literature. Where known, the secondary metabolite whose production involves the P450 is indicated in the table.

#### **3. Materials and Methods**

#### *3.1. Information on Streptomyces Species and Genome Database*

In total, 203 *Streptomyces* species genomes (permanent and finished draft genomes) available for public use at the Joint Genome Institute Integrated Microbial Genomes and Microbiomes (JGI IMG/M) [99] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [100] were used in this study. These included 48 *Streptomyces* species for which P450s and BGCs were annotated previously [20]. For these 48 species, P450 and BGC data were retrieved from published articles and used in the study [20]. Thus, the remaining 155 *Streptomyces* species were data-mined for P450s and BGCs in this study. Information on the species used in the study is provided in Supplementary Table S1.

#### *3.2. Genome Data Mining and Identification of P450s*

Identification and annotation of P450s in *Streptomyces* species were carried out following a method described elsewhere [20–22]. Briefly, each *Streptomyces* species genome available at JGI IMG/M [99] was searched for P450s using the InterPro code "IPR001128". The hit protein sequences were then searched for the presence of the P450 characteristic motifs EXXR and CXG [101]. Proteins having only one of these motifs were considered pseudo-P450s, and proteins that were short in amino acid length and lacked both motifs were considered P450 fragments. Neither the pseudo-P450s nor the P450 fragments were considered for further analysis.
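The motif-based filtering step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the annotation pipeline used in the study: the function name, the `MIN_LENGTH` cutoff (the text gives no numerical length threshold), and the "unclassified" category for long proteins lacking both motifs are all hypothetical. The regular expressions match the motifs anywhere in the sequence, whereas a careful analysis would probably also check that each motif lies in its expected region of the protein.

```python
import re

# EXXR motif: glutamate, any two residues, arginine.
EXXR = re.compile(r"E..R")
# CXG motif: cysteine, any one residue, glycine.
CXG = re.compile(r"C.G")

# Hypothetical length cutoff (amino acids); not taken from the source.
MIN_LENGTH = 300


def classify_hit(sequence, min_length=MIN_LENGTH):
    """Classify an IPR001128 hit by the presence of EXXR and CXG."""
    seq = sequence.upper()
    has_exxr = EXXR.search(seq) is not None
    has_cxg = CXG.search(seq) is not None

    if has_exxr and has_cxg:
        return "P450"            # both motifs present: retained
    if has_exxr or has_cxg:
        return "pseudo-P450"     # only one motif: excluded
    if len(seq) < min_length:
        return "P450 fragment"   # short and lacking both: excluded
    # Long proteins lacking both motifs are not covered by the text;
    # flagging them for manual inspection is an assumption.
    return "unclassified"


def retained_p450s(hits):
    """Keep only the hits classified as P450 from a {name: sequence} dict."""
    return {name: seq for name, seq in hits.items()
            if classify_hit(seq) == "P450"}
```

For example, `classify_hit("MKEAARLLCAGH")` returns `"P450"` because both motifs are found, whereas `classify_hit("MKEAARLL")` returns `"pseudo-P450"`.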
