#### *3.4. P450 Diversity Percentage Analysis*

P450 diversity percentage analysis was carried out as described elsewhere [25,34]. Briefly, the P450 diversity percentage in *Bacillus* species was calculated as the number of P450 families expressed as a percentage of the total number of P450s.
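The calculation can be illustrated with a minimal Python sketch. The function name and the example counts are hypothetical and are used only to show the arithmetic; they are not values from this study.

```python
def p450_diversity_percentage(num_families: int, num_p450s: int) -> float:
    """Return the number of P450 families as a percentage of the total P450 count."""
    if num_p450s <= 0:
        raise ValueError("num_p450s must be a positive integer")
    # Every family contains at least one P450, so families cannot outnumber P450s.
    if num_families < 0 or num_families > num_p450s:
        raise ValueError("num_families must lie between 0 and num_p450s")
    return 100.0 * num_families / num_p450s


# Hypothetical example: 10 P450 families found among 40 P450s.
example = p450_diversity_percentage(10, 40)
print(example)
```

Under this definition, a value of 100% would mean that every P450 belongs to a different family, whereas a low value indicates that many P450s fall into a few families.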

#### *3.5. Generation of P450 Profile Heat-Maps*

The presence or absence of P450 families in *Bacillus* species was shown with heat-maps generated from P450 family data. The data were encoded as −3 for family presence (green) and 3 for family absence (red). The resulting tab-delimited file was imported into MeV (MultiExperiment Viewer) [52], and the data were clustered by hierarchical clustering with a Euclidean distance metric. One hundred and twenty-eight *Bacillus* species formed the horizontal axis (see Supplementary dataset 3 for codes) and CYP family numbers formed the vertical axis.
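The encoding step can be sketched in Python as follows. This is an illustrative sketch rather than the procedure used to prepare the study's input file: the species codes, the family assignments, the `Family` column header, and all function names are assumptions. The sketch builds the −3/3 matrix with families as rows and species as columns, serialises it as tab-delimited text, and computes the Euclidean distance between two species profiles, which is the quantity on which the hierarchical clustering operates.

```python
import math

# Values stated in the text: -3 = family present (green), 3 = family absent (red).
PRESENT, ABSENT = -3, 3


def build_matrix(species_to_families):
    """Return (families, species, matrix) with families as rows, species as columns."""
    species = sorted(species_to_families)
    families = sorted({fam for fams in species_to_families.values() for fam in fams})
    matrix = [
        [PRESENT if fam in species_to_families[sp] else ABSENT for sp in species]
        for fam in families
    ]
    return families, species, matrix


def to_tab_delimited(families, species, matrix):
    """Serialise the matrix as tab-delimited text with a header row."""
    lines = ["\t".join(["Family"] + species)]
    for fam, row in zip(families, matrix):
        lines.append("\t".join([fam] + [str(value) for value in row]))
    return "\n".join(lines) + "\n"


def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical species codes and family assignments, for illustration only.
profiles = {
    "Sp1": {"CYP102", "CYP107"},
    "Sp2": {"CYP102"},
    "Sp3": {"CYP102", "CYP107", "CYP109"},
}
families, species, matrix = build_matrix(profiles)
table = to_tab_delimited(families, species, matrix)
print(table)

# Distance between the Sp1 and Sp2 columns (they differ in one family).
sp1 = [row[0] for row in matrix]
sp2 = [row[1] for row in matrix]
print(euclidean(sp1, sp2))
```

Because every cell is either −3 or 3, each family on which two species differ contributes 6² = 36 to the squared distance, so the Euclidean distance grows with the square root of the number of differing families.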

#### *3.6. Secondary Metabolite BGCs Analysis*

Genome IDs and plasmid IDs of individual *Bacillus* species from the various species databases (Table S2) were submitted to antiSMASH [30] to identify secondary metabolite BGCs. Results were downloaded both as gene cluster sequences and as Excel spreadsheets listing cluster information for each species. P450s that form part of a specific gene cluster were then identified. The standard gene cluster abbreviations used in the antiSMASH database [30] were retained in this study.
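One way to express the final step, assigning P450s to gene clusters, is a coordinate containment check. The sketch below is an assumption about how such a check could be written, not a description of the antiSMASH output format or of the procedure used here: the `Region` type, the replicon label, and all names and coordinates are hypothetical, and the cluster and gene positions are assumed to have been extracted beforehand from the downloaded results.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A named interval on a replicon (chromosome or plasmid), 1-based inclusive."""
    name: str
    replicon: str
    start: int
    end: int


def p450s_in_clusters(p450s, clusters):
    """Map each cluster name to the P450 genes lying entirely within it."""
    hits = {}
    for cluster in clusters:
        inside = [
            gene.name
            for gene in p450s
            if gene.replicon == cluster.replicon
            and cluster.start <= gene.start
            and gene.end <= cluster.end
        ]
        if inside:
            hits[cluster.name] = inside
    return hits


# Hypothetical coordinates, for illustration only.
clusters = [
    Region("cluster_1", "chromosome", 1_000, 20_000),
    Region("cluster_2", "chromosome", 50_000, 70_000),
]
p450s = [
    Region("P450_a", "chromosome", 5_000, 6_200),    # inside cluster_1
    Region("P450_b", "chromosome", 30_000, 31_200),  # outside both clusters
]
result = p450s_in_clusters(p450s, clusters)
print(result)
```

Requiring full containment is the stricter of two possible choices; a partial-overlap criterion would additionally capture genes that span a cluster boundary.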
