#### *2.1. CYP139 P450s Are Present Only in Certain Mycobacterial Category Species*

Comprehensive comparative analysis of CYP139 P450s in 1111 mycobacterial species belonging to six different categories (Table S1) revealed that CYP139 P450s are present in 894 mycobacterial species belonging to three categories, namely the *Mycobacterium tuberculosis* complex (MTBC), the *M. avium* complex (MAC) and non-tuberculosis mycobacteria (NTM) (Figure 1 and Table S2). The restriction of CYP139 P450s to these three categories was also observed previously, when 60 mycobacterial species were analysed [23]. The much larger data set used in this study confirms that mycobacterial species belonging to the other three categories, namely *Mycobacterium* causing leprosy (MCL), Saprophytes (SAP) and the *Mycobacterium chelonae-abscessus* complex (MCAC), do not have CYP139 P450s in their genomes (Figure 1). Interestingly, not all mycobacterial species of the MTBC, NTM and MAC categories have a CYP139 P450: 850 of 956 MTBC species, 10 of 14 NTM species and 34 of 57 MAC species have this P450 (Figure 1 and Table S2). A detailed list of CYP139 P450s, with species names and protein IDs, is presented in Table S2, and the CYP139 P450 sequences are presented in Supplementary Dataset 1.
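The per-category tallies quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the counts are taken from the text, whereas the names `CYP139_COUNTS` and `summarise` are illustrative assumptions.

```python
# (species with a CYP139 P450, species analysed) per category, as quoted in the text.
# The variable and function names are illustrative, not from the original study.
CYP139_COUNTS = {
    "MTBC": (850, 956),
    "NTM": (10, 14),
    "MAC": (34, 57),
}


def summarise(counts):
    """Return the total number of carriers and the percentage of carriers per category."""
    total_with = sum(with_p450 for with_p450, _ in counts.values())
    percentages = {
        category: round(100 * with_p450 / analysed, 1)
        for category, (with_p450, analysed) in counts.items()
    }
    return total_with, percentages


total, pct = summarise(CYP139_COUNTS)
print(total)  # 894, the number of CYP139-containing species reported above
print(pct)
```

The three category totals add up to the 894 CYP139-containing species reported for the full data set.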

**Figure 1.** Comparative analysis of CYP139A P450s in species belonging to six different mycobacterial categories. Abbreviations: MTBC, *Mycobacterium tuberculosis* complex; MAV, *M. avium* complex; NTM, non-tuberculosis mycobacteria; MCL, *Mycobacterium* causing leprosy; SAP, Saprophytes and MCAC, *Mycobacterium chelonae-abscessus* complex. Information on mycobacterial species and CYP139A P450s is presented in Supplementary Tables S1 and S2, respectively.

Analysis of *CYP139* P450s in the genomes of mycobacterial species revealed that only a single copy of the *CYP139* P450 gene is present in each species that carries it (Table S2). Furthermore, P450 subfamily analysis revealed that all CYP139 P450s found in the 894 mycobacterial species belong to subfamily "A" (Figure 2). Phylogenetic analysis revealed that CYP139A P450s grouped according to their mycobacterial category, indicating that after speciation CYP139A P450s were subjected to category-specific amino acid changes (Figure 2), similar to what was observed for other P450s described elsewhere [23,24]. However, four CYP139A P450s, from *M. genavense* ATCC 51234 and *Mycobacterium* sp. JDM601 of NTM and from *Mycobacterium* sp. UM CSW and *M. avium avium* Env 77 of MAC, did not group with their counterparts, suggesting that these CYP139A P450s have diverged from them (Figure 2). Percentage identity analysis further confirmed that CYP139A P450s from these species have a low percentage identity with their counterparts (Supplementary Dataset 2). CYP139A P450s of *Mycobacterium* sp. UM CSW and *M. avium avium* Env 77 have an average of ~77% and ~63% identity, respectively, with their counterparts, whereas CYP139A P450s of *M. genavense* ATCC 51234 and *Mycobacterium* sp. JDM601 have an average of 75% and 60% identity, respectively (Supplementary Dataset 2), suggesting that these P450s have been subjected to significant amino acid changes. P450s that do not group with those of their counterpart species were also observed in fungi, where CYP53D1 has been subjected to extensive amino acid changes [24], as observed for the four CYP139A P450s identified in this study. Determining the effect of these amino acid changes, if any, on the functional specificity of the four CYP139A P450s will be interesting future work.
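To make the notion of average percentage identity concrete, the sketch below shows one common way of computing it from pre-aligned sequences. It is an illustration under stated assumptions: this section does not specify how identity was calculated, so the convention used here (identical residues divided by aligned columns in which neither sequence has a gap) may differ from the one behind Supplementary Dataset 2, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100 * matches / compared


def mean_identity(query, counterparts):
    """Average identity of one aligned sequence against a list of aligned counterparts."""
    return sum(percent_identity(query, other) for other in counterparts) / len(counterparts)


# Toy aligned fragments (hypothetical, not CYP139A sequences).
pair_identity = percent_identity("MKT-AL", "MKTQAV")
average_identity = mean_identity("MKTAL", ["MKTAL", "MKTAV"])
```

Alignment tools differ in the denominator they use (alignment length, shorter sequence length or ungapped columns), so identity values should be compared only when they were computed under the same convention.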

#### *2.2. CYP139 P450 Family Ranked among Top 10 P450 Families*

A previous ranking of P450 families from different biological kingdoms, based on the number of conserved amino acids in their protein sequences, placed the CYP139 P450 family twelfth [23,25]. That ranking used only 54 CYP139A P450s [23,25]. The much larger number of CYP139A P450s identified in this study necessitated re-analysis of the ranking of this P450 family. To determine the conservation rank, CYP139A P450s were subjected to PROfile Multiple Alignment with Local Structures and 3D constraints (PROMALS3D) [26] analysis (Supplementary Dataset 3). PROMALS3D analysis revealed 165 invariantly conserved amino acids in CYP139 P450s (Table 1). Comparative analysis with other P450 families from different biological kingdoms revealed that the CYP139 P450 family now ranks eighth, compared with twelfth previously (Table 1).
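The ranking logic described above (count the invariantly conserved positions in a family alignment, then order families by that count) can be sketched as follows. This is a simplification under stated assumptions: it counts strictly identical, gap-free alignment columns and does not reproduce the PROMALS3D conservation index; the function names, the toy alignment and the family labels `FamA` to `FamC` are hypothetical.

```python
def count_invariant_columns(alignment, gap="-"):
    """Count alignment columns in which every sequence carries the same residue."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    invariant = 0
    for column in zip(*alignment):
        residues = set(column)
        if len(residues) == 1 and gap not in residues:
            invariant += 1
    return invariant


def rank_of(family, conserved_counts):
    """1-based rank of a family when families are sorted by conserved count, highest first."""
    ordered = sorted(conserved_counts, key=conserved_counts.get, reverse=True)
    return ordered.index(family) + 1


# Toy alignment and placeholder families (hypothetical values, not taken from Table 1).
toy_invariant = count_invariant_columns(["MKTAL", "MKSAL", "MRTAL"])
toy_rank = rank_of("FamB", {"FamA": 200, "FamB": 165, "FamC": 120})
```

Because the number of invariant positions can only stay the same or fall as more sequences are added to an alignment, rankings of this kind are sensitive to how many members of each family are included.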

**Figure 2.** Phylogenetic analysis of CYP139A P450s. Different mycobacterial categories are indicated in different colours. CYP51B1 from *Mycobacterium tuberculosis* H37Rv is used as an outgroup. Abbreviations: MTBC, *Mycobacterium tuberculosis* complex; MAV, *M. avium* complex; NTM, non-tuberculosis mycobacteria. A high-resolution phylogenetic tree is provided in Supplementary Figure S1.

**Table 1.** Comparative amino acid conservation analysis of the CYP139 P450 family with the top 10 ranked P450 families [23,25]. The conservation index score was obtained as described in the Materials and Methods section, following the procedure described elsewhere [27]. The conservation score (5–9) obtained via PROMALS3D is presented in the table, where the number 9 indicates invariantly conserved amino acids in P450 members. P450 families are arranged from the highest to the lowest number of amino acids conserved. The CYP139 P450 family is indicated in bold.

