#### *2.1. Streptomyces Species Have Large Number of P450s*

Genome-wide data mining and annotation of P450s in 203 *Streptomyces* species (Supplementary Table S1) revealed 5460 P450s in their genomes (Figure 1, Table 1, and Supplementary Dataset 1). The P450 count per species ranged from 10 to 69, with an average of 27. Apart from the complete P450 sequences, pseudo-P450s (6 hit proteins), P450 fragments (114 hit proteins), P450-derived glycosyltransferase activator proteins (22 hit proteins), and P450 false-positive hits (2 hit proteins) were also found in some *Streptomyces* species (Supplementary Table S2). Such hit proteins are commonly found in species and, because of their nature, they were excluded from further analysis. Among *Streptomyces* species, *Streptomyces albulus* ZPM had the highest number of P450s in its genome (69 P450s), followed by *S. clavuligerus* (65 P450s); the lowest number was found in *Streptomyces* sp. CNT372 and *S. somaliensis* DSM 40738 (10 P450s each) (Figure 1 and Table 1). The most prevalent P450 count among *Streptomyces* species was 19 (Table 1). The average number of P450s in *Streptomyces* species was higher than in *Bacillus* species [22] and cyanobacterial species [23], and almost the same as in mycobacterial species [21] (Table 2). Note that the number of species analyzed greatly influences the average number of P450s; thus, the more species included in the analysis, the more accurate the results, as mentioned elsewhere [20,23]. This is why *Streptomyces* species showed a slightly lower average number of P450s in their genomes than mycobacterial species, since only 60 mycobacterial species were employed in that study [21]. Thus, future annotation of P450s in more mycobacterial species will provide more accurate insights into this aspect.
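As a quick consistency check, the reported average can be recomputed from the totals given above, and the same summary statistics (range, mean, most prevalent count) can be derived from any per-species count table. The following is a minimal sketch, not the procedure used in the study: the `summarize` helper is hypothetical, and the `species_X` and `species_Y` entries in the toy input are placeholders rather than data from Table 1.

```python
from statistics import mean, multimode

# Totals reported in the text.
TOTAL_P450S = 5460
TOTAL_SPECIES = 203

# 5460 / 203 is about 26.9, which the text rounds to 27.
average = TOTAL_P450S / TOTAL_SPECIES
print(f"average P450s per species: {average:.1f}")


def summarize(counts):
    """Return min, max, mean and most common value(s) of per-species P450 counts."""
    values = list(counts.values())
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "modes": multimode(values),
    }


# Toy input for illustration only. The first three counts are quoted in the
# text; species_X and species_Y are hypothetical placeholders.
toy_counts = {
    "S. albulus ZPM": 69,
    "S. clavuligerus": 65,
    "S. somaliensis DSM 40738": 10,
    "species_X": 19,
    "species_Y": 19,
}
summary = summarize(toy_counts)
print(summary)
```

Applied to the full per-species counts in Table 1, the same helper would be expected to reproduce the range and most prevalent count reported above.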

**Figure 1.** Phylogenetic analysis of *Streptomyces* P450s. In total, 5460 P450s were used to construct the tree, and the dominant P450 families are highlighted in different colors and indicated in the figure. A high-resolution phylogenetic tree is provided in Supplementary Dataset 2.



**Table 1.** *Cont.*


Abbreviations: No. F: number of P450 families; No. SF: number of P450 subfamilies.


**Table 2.** Comparative analysis of key features of P450s in different bacterial species.

Abbreviations: BGC: biosynthetic gene cluster. Symbol: \* 103 cyanobacterial species [23] and 144 *Streptomyces* species were used for BGC analysis.

#### *2.2. CYP107 Family Was Found to Be Dominant and Conserved in 203 Streptomyces Species*

Analysis of P450 families and subfamilies in 203 *Streptomyces* species revealed that 5460 P450s could be grouped into 253 P450 families and 698 P450 subfamilies (Table 2 and Supplementary Table S3). Among *Streptomyces* species, *S. clavuligerus* had the highest number of P450 families (30) and P450 subfamilies (58) in its genome (Table 1). Although *S. rimosus rimosus* ATCC 10970 had the same number of P450 families as *S. clavuligerus*, the number of subfamilies was the third highest (52 subfamilies) (Table 1). One interesting observation is that the species with the highest number of P450s did not have the highest number of P450 families, suggesting that some of the P450 families were populated (bloomed). Blooming of P450 families is common across species, and this phenomenon has been observed in different species belonging to different biological kingdoms [24,26,34–36]. Phylogenetic analysis revealed that some of the P450 families were scattered across the evolutionary tree (Figure 1). This phenomenon was also observed previously for *Streptomyces* species P450s, and it has been hypothesized that the phylogenetic-based annotation of P450s could be detecting similarity cues beyond a simple percentage identity cutoff [20]. Analysis of P450 families in the 155 *Streptomyces* species used in this study revealed the presence of 38 new P450 families, i.e., CYP1200A1, CYP1216A1, CYP1223A1, CYP1228A1, CYP1236A1, CYP1238A1, CYP1265A1, CYP1279A1, CYP1369A1, CYP1432A1, CYP1518A1, CYP1529A1, CYP1543A1, CYP1568A1, CYP159A1, CYP1607A1, CYP1658A1, CYP1759A1, CYP1810A1, CYP1832A1, CYP1866A1, CYP1896A1, CYP1920A1, CYP1929A1, CYP1931A1, CYP1940A1, CYP1941A1, CYP1943A1, CYP1972A1, CYP1984A1, CYP1994A1, CYP2076A1, CYP2080A1, CYP2134A1, CYP2180A1, CYP2349A1, CYP2427A1, and CYP2723A1. A detailed analysis of the number of new P450 families found in different *Streptomyces* species is presented in Supplementary Table S2.
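The family and subfamily counts discussed above follow directly from the standard P450 naming scheme, in which a name such as CYP107A1 encodes the family (CYP107), the subfamily (CYP107A), and the individual member number. The sketch below shows one way such names could be tallied. It is an illustration under stated assumptions, not the annotation pipeline used in the study: the helper names `split_name` and `tally` are hypothetical, and the input list is a toy example rather than the study dataset.

```python
import re
from collections import defaultdict

# CYP<family number><subfamily letters><member number>, e.g. CYP107A1.
CYP_NAME = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)$")


def split_name(name):
    """Split a full P450 name into (family, subfamily), e.g. ('CYP107', 'CYP107A')."""
    match = CYP_NAME.match(name)
    if match is None:
        raise ValueError(f"not a full P450 name: {name!r}")
    family_no, subfamily_letters, _member = match.groups()
    family = f"CYP{family_no}"
    return family, family + subfamily_letters


def tally(names):
    """Count members per family and collect the distinct subfamilies per family."""
    members_per_family = defaultdict(int)
    subfamilies_per_family = defaultdict(set)
    for name in names:
        family, subfamily = split_name(name)
        members_per_family[family] += 1
        subfamilies_per_family[family].add(subfamily)
    return members_per_family, subfamilies_per_family


# Toy input for illustration only, not taken from the study dataset.
names = ["CYP107A1", "CYP107B1", "CYP107B2", "CYP105D1", "CYP157C1"]
members, subfamilies = tally(names)
for family in sorted(members):
    print(family, members[family], sorted(subfamilies[family]))
```

A genome with many members but few distinct families in such a tally is what the text describes as family blooming.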

Among the P450 families, the CYP107 family was found to be dominant, with 1235 P450s in *Streptomyces* species, followed by CYP105 with 684 P450s, CYP157 with 525 P450s, and CYP154 with 510 P450s (Figure 2 and Supplementary Table S3), indicating the possible blooming of these families in *Streptomyces* species, as observed in species belonging to different biological kingdoms [24,26,34–36]. It is interesting to note that the CYP107 family was also found to be dominant in *Bacillus* species [22], indicating its dominant role in the synthesis of secondary metabolites in both the *Streptomyces* and *Bacillus* genera. An interesting pattern was observed when comparing subfamily diversity in the dominant P450 families (Figure 2, Table 3, and Supplementary Table S3). P450 families such as CYP107, CYP105, CYP183, and CYP113 had the highest diversity at the subfamily level, as numerous subfamilies were found in these families (Supplementary Table S3). Such high diversity within P450 families is not uncommon in *Streptomyces* species, and this proved to be the key contributor to the production of diverse secondary metabolites in *Streptomyces* species compared to mycobacterial species [20]. Strong support for this argument is the fact that CYP105 family members in *Streptomyces* species have been shown to be involved in the oxidation of numerous endogenous and exogenous compounds and in the generation of different secondary metabolites [32]. However, in contrast to the diversity at the subfamily level in the CYP107, CYP105, CYP183, and CYP113 families, the rest of the dominant P450 families had only one, two, or three subfamilies, indicating subfamily-level blooming in these P450 families (Table 3).
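To put the dominance of these families in proportion, the member counts quoted above can be expressed as shares of the 5460 P450s identified. This is a small arithmetic sketch using only numbers stated in the text. The percentages are derived here and are not reported in the original analysis.

```python
# Counts reported in the text.
TOTAL_P450S = 5460
dominant = {"CYP107": 1235, "CYP105": 684, "CYP157": 525, "CYP154": 510}

# Share of all Streptomyces P450s contributed by each dominant family.
shares = {family: 100 * n / TOTAL_P450S for family, n in dominant.items()}
for family, pct in shares.items():
    print(f"{family}: {pct:.1f}%")

# Combined contribution of the four families.
combined = sum(dominant.values())
combined_pct = 100 * combined / TOTAL_P450S
print(f"combined: {combined} P450s ({combined_pct:.1f}%)")
```

Under this calculation, CYP107 alone accounts for roughly 22.6% of all P450s, and the four families together for a little over half.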

**Figure 2.** P450 family and subfamily analysis in 203 *Streptomyces* species. Only the dominant P450 families with more than 40 P450s are shown in the figure. Detailed data on P450 families and subfamilies are presented in Supplementary Table S3.

**Table 3.** P450 subfamily analysis in the dominant families in 203 *Streptomyces* species. The number of members in the dominant P450 subfamily is presented. Detailed data on different subfamilies are presented in Supplementary Table S3.


P450 family conservation analysis revealed that the CYP107 family is conserved in all 203 *Streptomyces* species (Figure 3 and Supplementary Dataset 3). P450 families such as CYP156, CYP105, CYP154, and CYP157 are also present in the majority of the *Streptomyces* species (Figure 3 and Supplementary Dataset 3).
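A conservation analysis of this kind can be thought of as a presence/absence matrix of families by species, where a family is conserved if it occurs in every species. The sketch below illustrates the idea on toy data. The `conservation` helper, the species labels, and the family assignments are hypothetical, and the sketch is not the method used to generate Figure 3.

```python
def conservation(presence):
    """Map each P450 family to the fraction of species in which it is present.

    presence: dict mapping a species name to the set of families in its genome.
    """
    n_species = len(presence)
    all_families = set().union(*presence.values())
    return {
        family: sum(family in families for families in presence.values()) / n_species
        for family in sorted(all_families)
    }


# Toy presence/absence data for illustration only, not the study dataset.
presence = {
    "species_A": {"CYP107", "CYP105", "CYP154"},
    "species_B": {"CYP107", "CYP157"},
    "species_C": {"CYP107", "CYP105", "CYP157"},
}

fractions = conservation(presence)

# Families found in every species are the conserved ones.
conserved = [family for family, frac in fractions.items() if frac == 1.0]
print(fractions)
print("conserved in all species:", conserved)
```

In this toy example only CYP107 is present in all three species, mirroring the pattern the text reports for the 203 genomes.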

**Figure 3.** Heat-map of P450 family conservation analysis in *Streptomyces* species. In the heat-map, the presence and absence of P450 families are indicated in red and green, respectively. The horizontal axis represents P450 families and the vertical axis represents *Streptomyces* species.
