#### *3.4. Generation of P450 Profile Heat-Maps*

P450 profile heat-maps were generated following the method reported in the literature [56,78,79]. The presence or absence of each P450 family in the cyanobacterial species was encoded as 3 for family presence (red) and −3 for family absence (green). The resulting tab-delimited file was imported into MeV 4.9 (Multi-Experiment Viewer) [80], and the data were clustered by hierarchical clustering with a Euclidean distance metric. Eighty-nine cyanobacterial species formed the horizontal axis and P450 families formed the vertical axis.
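
The encoding step described above can be illustrated with a minimal Python sketch. It is not the workflow used in the study: the species labels (`sp_A`, ...) and family labels (`FAM1`, ...) are placeholders, the function names are hypothetical, and the exact file layout expected by MeV is not specified here. The sketch only shows how a presence/absence profile maps to the 3/−3 values, how it can be written as a tab-delimited table with families as rows and species as columns, and how the Euclidean distance between two family profiles is obtained.

```python
import csv
import io
import math

PRESENT, ABSENT = 3, -3  # values used in the text for presence and absence


def encode_profile(species_families, all_families):
    """Encode presence/absence as 3/-3, with one row per P450 family."""
    species = sorted(species_families)
    rows = {
        fam: [PRESENT if fam in species_families[sp] else ABSENT for sp in species]
        for fam in all_families
    }
    return species, rows


def to_tab_delimited(species, rows):
    """Serialize the matrix as tab-delimited text (families x species)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["Family"] + species)
    for fam, values in rows.items():
        writer.writerow([fam] + values)
    return buf.getvalue()


def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Placeholder data: which (hypothetical) families occur in which species.
example_species_families = {
    "sp_A": {"FAM1", "FAM2"},
    "sp_B": {"FAM1"},
    "sp_C": {"FAM3"},
}
example_families = ["FAM1", "FAM2", "FAM3"]

species, rows = encode_profile(example_species_families, example_families)
table_text = to_tab_delimited(species, rows)
dist_fam1_fam2 = euclidean(rows["FAM1"], rows["FAM2"])
print(table_text)
print(dist_fam1_fam2)
```

Because every cell is either 3 or −3, each species in which two families differ contributes 36 to the squared distance, so the Euclidean distance between two family profiles grows with the square root of the number of mismatching species.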

#### *3.5. Secondary Metabolite Biosynthetic Gene Clusters Analysis*

Secondary metabolite BGC analysis in cyanobacterial species was carried out as described previously [54,55]. Briefly, the genome ID of each cyanobacterial species (Table S1) was submitted to anti-SMASH (antibiotics & Secondary Metabolite Analysis Shell) [65] to identify secondary metabolite BGCs. The gene cluster information generated by anti-SMASH was analyzed for the presence of P450s by manually mining the cluster sequences. The cluster type, the most similar known cluster and the percentage similarity to that known cluster were also recorded and presented in table format. Of the 114 cyanobacterial species, the genome IDs of 11 species could not be processed by anti-SMASH; secondary metabolite BGC data are therefore presented for 103 species. The species excluded from the secondary metabolite cluster analysis are listed in Table S5.

#### *3.6. Data Analysis*

P450 diversity percentage analysis in cyanobacterial species was carried out following the method described elsewhere [55]. The P450 diversity percentage was calculated using the formula: P450 diversity percentage = 100 × Total number of P450 families/(Total number of P450s × Number of species). The average number of P450s was calculated using the formula: Average number of P450s = Number of P450s/Number of species. The average number of BGCs was calculated using the formula: Average number of BGCs = Total number of BGCs/Number of species. A new formula was developed to calculate the gene cluster diversity percentage and is described in Section 2.4.
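
As a worked illustration of the three formulas above, the following Python sketch computes each quantity. It assumes that the denominator of the diversity formula is the product (Total number of P450s × Number of species), which is an interpretation of the grouping rather than something stated explicitly in the text. The function names are hypothetical and the input counts are made-up placeholders, not values from this study.

```python
def p450_diversity_percentage(n_families, n_p450s, n_species):
    """100 x families / (P450s x species); the grouping is an assumption."""
    if n_p450s <= 0 or n_species <= 0:
        raise ValueError("P450 and species counts must be positive")
    return 100 * n_families / (n_p450s * n_species)


def average_per_species(total, n_species):
    """Average count per species (used for both P450s and BGCs)."""
    if n_species <= 0:
        raise ValueError("number of species must be positive")
    return total / n_species


# Placeholder counts chosen only to keep the arithmetic easy to follow.
example_families, example_p450s, example_bgcs, example_species = 10, 50, 80, 20

diversity = p450_diversity_percentage(example_families, example_p450s, example_species)
avg_p450s = average_per_species(example_p450s, example_species)
avg_bgcs = average_per_species(example_bgcs, example_species)

print(diversity)   # 100 * 10 / (50 * 20) = 1.0
print(avg_p450s)   # 50 / 20 = 2.5
print(avg_bgcs)    # 80 / 20 = 4.0
```

Under this reading, dividing by the number of species normalizes the family-to-P450 ratio for data set size, so groups of organisms sampled with different numbers of species can be compared on the same scale.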

#### **4. Conclusions**

Research on harnessing the biotechnological potential of cyanobacterial species is gaining momentum. In this direction, this study attempts to provide a complete picture of P450 enzymes in different cyanobacterial species, as these enzymes are key players in the primary and secondary metabolism of organisms, including the production of different secondary metabolites. Furthermore, naming the P450s according to International P450 Nomenclature Committee rules enables researchers to use the cyanobacterial P450 names presented in this study. The limited functional analysis of cyanobacterial P450s carried out to date has revealed that these P450s are unique in terms of their catalytic activity and show high resemblance to eukaryotic P450s. Unravelling the role of P450s in carotenoid synthesis is necessary to understand their biological relevance in cyanobacterial species and to address the evolutionary link between these species and plants, since cyanobacterial species are considered the precursors of chloroplasts in plants. The mathematical formula presented in this study will enable researchers to compare secondary metabolite biosynthetic gene cluster diversity accurately among different organisms. The highest gene cluster diversity was observed for cyanobacterial species compared to species belonging to the genera *Bacillus* and *Mycobacterium*; this, together with the fact that a large number of biosynthetic gene clusters have no similar known clusters, indicates that these gene clusters might encode novel secondary metabolites with new biological properties, whose potential needs to be explored for the food, cosmetic and pharmaceutical industries.
