### 2.4.1. Energy Metabolism

Energy is the potential needed to perform work and maintain life. It is usually acquired by breaking one chemical bond and stored by forming another, very often in the form of ATP. Methane metabolism is one of the UMPs by which bacteria can obtain energy by oxidizing one-carbon compounds (e.g., methanol, methane). Methanotrophic bacteria are generally considered environmentally friendly organisms, as they oxidize environmental methane and thereby mitigate the effects of global warming [32]. Methane monooxygenases are the main enzymes that catalyze methane oxidation [33]. Several other UMPs in bacteria are related to photosynthesis and carbon fixation and can be exploited for drug target identification.

### 2.4.2. Biosynthesis of Secondary Metabolites

Secondary metabolites are molecules that are not strictly required for the survival of an organism. A large portion of bacterial metabolism deals with the biosynthesis of secondary metabolites. However, these pathways have a minimal role in bacterial growth and viability and are not considered suitable targets for antibiotics. Even though they are not ideal drug targets, many of these pathways are manipulated by researchers for valuable purposes, such as penicillin and cephalosporin biosynthesis, carbapenem biosynthesis and streptomycin biosynthesis.

### 2.4.3. Amino Acid Metabolism

Amino acid metabolism in bacteria is diverse in nature and performs a pivotal role in maintaining bacterial growth. Amino acid metabolism has emerged as a potential target for new antibiotics, and a number of new drug targets have been proposed in recent years [34–37]. Some of these drug targets have shown promising results. Lysine biosynthesis, an essential pathway in bacteria for survival and growth, is reported to be a potential target for antibiotics [38,39]. Similarly, *D*-alanine metabolism is a significant target; an antibiotic *D*-cycloserine targeting *D*-alanine metabolism is already in clinical use against *Mycobacterium tuberculosis* [40,41]. The heterogeneity of amino acid metabolism implies an enormous scope for discovering new antibiotic targets using modern computational tools.

Other types of metabolic activities in bacteria, such as terpenoid and polyketide metabolism, glycan biosynthesis and drug resistance, also perform supportive functions for bacterial growth and survival; however, these metabolic routes are not prioritized targets for anti-bacterial drugs. Rather, they are often manipulated for advantageous purposes [42].

### *2.5. Shortlisting of Protein Sequences as Druggable*

The potential drug targets were shortlisted based on information obtained from earlier successful literature reports. The previously shortlisted non-host, uncharacterized proteins that are essential in metabolic pathways were assessed for druggability by running BLASTp against the druggable protein sequences present in the DrugBank Database. In this search, only one protein was prioritized in TH-135, whereas four and seven potential drug targets emerged for the OCU-466 and A5 strains, respectively (Table 1). All these potential drug targets were similar to FDA-approved drug target sequences in the DrugBank Database, including the DNA polymerase III subunit ε of the TH-135 strain; and inter-α-trypsin inhibitor heavy chain H4, exopolyphosphatase, DNA polymerase III subunit ε, mannoside ABC transport system and sugar-binding protein of the OCU-466 strain. In addition to all the proteins from the OCU-466 strain, diacylglycerol acyltransferase/mycolyltransferase, Ag85C and nickel-binding periplasmic protein were found for the A5 strain.


**Table 1.** Protein drug targets of *M. avium* subsp. *hominissuis.*


It is noteworthy that all the proposed drug targets could be analyzed for 3D structural information to prioritize novel drug targets against pathogens. Therefore, BLASTp was performed for the target proteins against the Protein Data Bank (PDB) database, which revealed that 12 protein sequences had no 3D structure available yet in the PDB. This study therefore offers these 12 protein sequences not only as a potential druggable genome, but also as candidates for future 3D structure determination, either by homology modeling (template-based) or by ab initio (template-free) methods [43].

### **3. Materials and Methods**

An overview of the subtractive genomics approach is illustrated in Figure 4.

**Figure 4.** Workflow of the subtractive genomics approach.

### *3.1. Extraction of the Host–Pathogen Proteome*

The whole proteomes of the host, i.e., *Homo sapiens*, and the pathogen, i.e., *Mycobacterium avium* subsp. *hominissuis*, were downloaded from the UniProtKB database [44] to retrieve the protein sequences. The drug target identification approach was carried out on the pathogenic MAH-TH135, MAH-OCU466 and A5 strains.

### *3.2. Grouping of Common Proteins in All Strains*

The CD-HIT tool [45] clusters the protein or nucleotide sequences and reduces redundancy and manual efforts in sequence analysis. This tool was used as a standalone command line tool to remove paralogous or duplicated sequences of all strains with a threshold value of 80%. The remaining set of proteins was grouped as orthologous sequences.
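As a minimal sketch of how such a clustering run might be launched, the following Python helper assembles a CD-HIT command line. The file names are hypothetical, and mapping the 80% threshold to CD-HIT's sequence identity option (`-c 0.80`, with the word size `-n 5` that CD-HIT recommends for identities between 0.7 and 1.0) is an assumption rather than a reported setting.

```python
import subprocess


def cd_hit_command(input_fasta, output_fasta, identity=0.80, word_size=5):
    """Build a CD-HIT command for clustering protein sequences.

    identity  -> CD-HIT option -c (sequence identity threshold; assumed 0.80)
    word_size -> CD-HIT option -n (5 is suggested for thresholds 0.7-1.0)
    """
    return [
        "cd-hit",
        "-i", input_fasta,        # input FASTA with the proteins of all strains
        "-o", output_fasta,       # representative (non-redundant) sequences
        "-c", f"{identity:.2f}",
        "-n", str(word_size),
    ]


def run_cd_hit(input_fasta, output_fasta):
    """Run CD-HIT; requires the cd-hit executable to be on the PATH."""
    subprocess.run(cd_hit_command(input_fasta, output_fasta), check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown for illustration only.
    print(" ".join(cd_hit_command("all_strains.fasta", "nonredundant.fasta")))
```

CD-HIT writes the representative sequences to the output file and the cluster membership to a companion `.clstr` file.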

### *3.3. Identification of Non-Homologous Proteins*

Standalone BLAST version 2.8.1 was downloaded from the NCBI FTP server [46]. The orthologous sequences were subjected to BLASTp against the *H. sapiens* database with an expectation value (e-value) of 10<sup>−3</sup> [47]. In the output, the keyword "no hits found" marked unique proteins, whereas "significant alignments" marked sequences with similarity to the human (host) proteome. The results were analyzed, and only protein sequences with no homology to the human host were retained, while the rest were removed. These proteins were labelled as non-homologous proteins and extracted using our in-house scripts.
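The in-house scripts are not reproduced here. As a hypothetical illustration of the keyword-based separation, the sketch below scans a BLAST+ report in the default pairwise text format, where each query starts with a `Query=` line and is followed either by a `No hits found` line or by a `Sequences producing significant alignments` header. The function name and the sample report are assumptions made for this example.

```python
def split_queries_by_hit_status(report_text):
    """Return (no_hit_ids, hit_ids) from a BLAST default text report."""
    no_hit, with_hit = [], []
    current = None  # identifier of the query whose status is still undecided
    for line in report_text.splitlines():
        if line.startswith("Query="):
            fields = line[len("Query="):].split()
            current = fields[0] if fields else None
        elif current is not None:
            if "No hits found" in line:
                no_hit.append(current)       # candidate non-homologous protein
                current = None
            elif line.startswith("Sequences producing significant alignments"):
                with_hit.append(current)     # similar to a host protein
                current = None
    return no_hit, with_hit


# Tiny made-up report fragment for illustration.
SAMPLE_REPORT = """\
Query= protA hypothetical protein

Length=120

***** No hits found *****

Query= protB DNA polymerase III subunit epsilon

Length=300

Sequences producing significant alignments:          (Bits)  Value
"""

if __name__ == "__main__":
    unique, homologous = split_queries_by_hit_status(SAMPLE_REPORT)
    print("non-homologous:", unique)
    print("host-like:", homologous)
```

Only the identifiers in the first list would be carried forward as non-homologous proteins.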

### *3.4. Finding of Essential Genes*

The genes required to sustain the life cycle of bacteria are called essential genes. The Database of Essential Genes (DEG) contains lists of genes, with their corresponding sequences, that are essential for bacterial survival [48]. Therefore, the DEG was used to find the sequences that are essential to the bacterial pathogen studied here (i.e., *M. avium* subsp. *hominissuis*). The non-homologous proteins were aligned with the DEG database using BLASTp, and the expectation value was set to 10<sup>−5</sup>. As a result, the non-homologous essential genes, which may encode hypothetical or uncharacterized proteins, were obtained.
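One way to apply such an e-value cutoff after the search is to filter tabular BLAST output. The sketch below assumes the standard 12-column `-outfmt 6` layout, in which the query identifier is the first column and the e-value is the eleventh. The use of tabular output, the function name and the sample rows are assumptions for illustration and are not taken from the study.

```python
def queries_with_hits(tabular_text, max_evalue=1e-5):
    """Return the set of query IDs with at least one hit at or below max_evalue.

    Expects BLAST tabular output (-outfmt 6): qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    hits = set()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


# Made-up rows: protA has a strong hit in the essential-gene set, protC does not.
SAMPLE_ROWS = (
    "protA\tDEG0001\t62.5\t200\t75\t0\t1\t200\t5\t204\t1e-80\t250\n"
    "protC\tDEG0042\t28.0\t50\t36\t0\t10\t59\t3\t52\t0.5\t22.3\n"
)

if __name__ == "__main__":
    print(sorted(queries_with_hits(SAMPLE_ROWS)))
```

Proteins returned by this filter would correspond to the non-homologous essential set; those without a qualifying hit would be dropped.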

### *3.5. Information about Metabolic Pathways*

The metabolic pathways of the identified non-homologous essential proteins were searched in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [49] through the KAAS server. KAAS [50] uses BLASTp for the comparison of query proteins against the KEGG database and annotates functions. KAAS provides the KEGG Orthology (KO) identifiers and information on the metabolic pathways of the proteins.

### *3.6. Annotation of the Curated Proteins*

Annotation of proteins includes information about the location of the proteins in various regions of the cell and the families to which they belong. PSORTb version 3.0 [51] is widely used to predict the subcellular localization (SCL) of proteins. The SCL categories include different compartments, such as the cytoplasmic membrane, cytoplasm, cell wall and extracellular space, as well as unknown regions of the cell where the proteins reside. All the non-homologous essential proteins, including the hypothetical ones, were compared against protein databases with known functions using SCL-BLAST through the web-based server. SVM-Prot [52] is an online tool for the classification of protein functional families. It applies a machine-learning method and predicts a diverse set of molecular and biological functions, covering 192 functional families of proteins across all major classes of enzymes, channels, transporters, receptors, DNA/RNA-binding proteins, etc. Those proteins whose functions are still unknown were labeled as non-homologous, hypothetical/uncharacterized proteins and submitted to the SVM-Prot server to classify them into functional families.

### *3.7. Druggability of the Shortlisted Sequences*

In order to determine the novel drug targets, standalone BLASTp was run between the hypothetical non-homologous essential proteins and the drug target sequences taken from the DrugBank Database [53], with an e-value cutoff of 10<sup>−3</sup>. The DrugBank Database provides detailed information on drugs and drug targets. It lists up to 8261 drugs, including FDA-approved, experimental and nutraceutical drugs.
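A standalone search of this kind could be set up roughly as follows. The sketch only assembles the BLAST+ command lines (`makeblastdb` to index the target sequences, then `blastp` with the stated e-value cutoff). All file and database names are hypothetical, and the tabular output format is an assumption.

```python
def makeblastdb_command(target_fasta, db_name):
    """Command to build a protein BLAST database from drug target sequences."""
    return ["makeblastdb", "-in", target_fasta, "-dbtype", "prot", "-out", db_name]


def blastp_command(query_fasta, db_name, out_file, evalue="1e-3"):
    """Command to search query proteins against the target database."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db_name,
        "-evalue", evalue,   # e-value cutoff stated in the text
        "-outfmt", "6",      # tabular output (assumed, for easy parsing)
        "-out", out_file,
    ]


if __name__ == "__main__":
    # Hypothetical file names; pass these lists to subprocess.run(..., check=True)
    # on a machine where BLAST+ is installed.
    print(" ".join(makeblastdb_command("drugbank_targets.fasta", "drugbank_db")))
    print(" ".join(blastp_command("essential_nonhomologous.fasta",
                                  "drugbank_db", "druggable_hits.tsv")))
```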
