*Article* **Identification of Therapeutic Targets in an Emerging Gastrointestinal Pathogen** *Campylobacter ureolyticus* **and Possible Intervention through Natural Products**

**Kanwal Khan 1,†, Zarrin Basharat 2,† , Khurshid Jalal 3,\* ,†, Mutaib M. Mashraqi <sup>4</sup> , Ahmad Alzamami <sup>5</sup> , Saleh Alshamrani <sup>4</sup> and Reaz Uddin <sup>1</sup>**


**Abstract:** *Campylobacter ureolyticus* is a Gram-negative, anaerobic, non-spore-forming bacteria that causes gastrointestinal infections. Being the most prevalent cause of bacterial enteritis globally, infection by this bacterium is linked with significant morbidity and mortality in children and immunocompromised patients. No information on pan-therapeutic drug targets for this species is available yet. In the current study, a pan-genome analysis was performed on 13 strains of *C. ureolyticus* to prioritize potent drug targets from the identified core genome. In total, 26 druggable proteins were identified using subtractive genomics. To the best of the authors' knowledge, this is the first report on the mining of drug targets in *C. ureolyticus*. UDP-3-O-acyl-N-acetylglucosamine deacetylase (LpxC) was selected as a promiscuous pharmacological target for virtual screening of two bacterialderived natural product libraries, i.e., postbiotics (*n* = 78) and streptomycin (*n* = 737) compounds. LpxC inhibitors from the ZINC database (*n* = 142 compounds) were also studied with reference to LpxC of *C. ureolyticus*. The top three docked compounds from each library (including ZINC26844580, ZINC13474902, ZINC13474878, Notoginsenoside St-4, Asiaticoside F, Paraherquamide E, Phytoene, Lycopene, and Sparsomycin) were selected based on their binding energies and validated using molecular dynamics simulations. To help identify potential risks associated with the selected compounds, ADMET profiling was also performed and most of the compounds were considered safe. Our findings may serve as baseline information for laboratory studies leading to the discovery of drugs for use against *C. ureolyticus* infections.

**Keywords:** pan-genome; *Campylobacter ureolyticus*; UDP-3-O-acyl-N-acetylglucosamine deacetylase; LpxC; campylobacteriosis

#### **1. Introduction**

*Campylobacter ureolyticus* is a Gram-negative bacterium previously classified as *Bacteroides ureolyticus* and belongs to a class of pathogens that cause gastroenteritis [1]. *Campylobacter* outnumbers *Shigella*, *E. coli O157*, and *Salmonella* as the most common cause of bacterial enteritis [2]. Previously, the most common *Campylobacter* species linked with human illness were *C. jejuni* and *C. coli*, but breakthroughs in molecular diagnostics paired

**Citation:** Khan, K.; Basharat, Z.; Jalal, K.; Mashraqi, M.M.; Alzamami, A.; Alshamrani, S.; Uddin, R. Identification of Therapeutic Targets in an Emerging Gastrointestinal Pathogen *Campylobacter ureolyticus* and Possible Intervention through Natural Products. *Antibiotics* **2022**, *11*, 680. https://doi.org/10.3390/ antibiotics11050680

Academic Editors: Dóra Kovács and Simone Carradori

Received: 8 April 2022 Accepted: 16 May 2022 Published: 18 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

with the development of innovative culture methods have helped identify and isolate a range of under-reported and fastidious *Campylobacter* species, including *C. ureolyticus*. Farm animals are the primary reservoir for *Campylobacter* sp. infections and the primary cause of campylobacteriosis. Farm animals are also the leading source of bacterial food poisoning and *Campylobacter* gastrointestinal illnesses worldwide. Campylobacter foodborne illness is a concern and an expensive burden for the human population, accounting for 8.4% of all diarrhea cases worldwide. In most cases, *C. ureolyticus* has been recovered from human samples, with just one report of *C. ureolyticus* isolation from healthy horse endometria. Following that, a retrospective investigation of over 7000 patients with diarrhea found *C. ureolyticus* in 23.8% of *Campylobacter*-positive samples, marking the first discovery of *C. ureolyticus* in the faeces of gastroenteritis patients and highlighting the species' status as an emerging enteric pathogen [3–5].

*C. ureolyticus* has been accompanied by a variety of illnesses in the past, including bacterial vaginosis [6], gangrenous lesions [7], superficial ulcers [1,8], nongonococcal urethritis [1,9], male infertility [10] and more recently, ulcerative colitis [11]. Furthermore, *C. ureolyticus* has been associated with periodontal diseases, such as gingivitis and periodontitis, similar to numerous other new and atypical *Campylobacter* species [9]. Recent research has led to the discovery and isolation of *C. ureolyticus* as the sole pathogen in the feces of many diarrheal patients. It is now thought to be the second most common *Campylobacter* species found in patients suffering from diarrhea, surpassing the established pathogens *C. coli* and *C. jejuni*. Furthermore, an examination of infectivity data indicates the predominance of this pathogen in diarrheal patients between the ages of 5 and 70, suggesting that it impacts the pediatric population more than adults. Patient data for gastrointestinal illnesses suggest that *C. ureolyticus* is an emerging gastrointestinal pathogen [11].

Its pathogenesis is based on invasion, colonization, adhesion, and toxin release. Previously, the absence of genetic data hindered the understanding of in-depth pathogenic mechanisms and virulence, but the democratization of next-generation sequencing (NGS) has made it easier to explore the species at genome scale [12]. Pan-genomic studies help exploit the entire repertoire of genes from all strains within a species, which enables investigation of genomic diversity and similarity [13]. This type of information aids the understanding of species evolution and pathology and helps identify drug targets for pan-treatment of *C. ureolyticus* [7,14]. In light of this, the current study aimed to investigate multiple isolates of *C. ureolyticus* to better understand its pathogenesis and characterize the core and pan-genome subsets to prioritize novel therapeutic targets from shared core genes. To the best of our knowledge, this is the first pan-genome study on *C. ureolyticus*, as well as the first study to make use of core gene data to exploit therapeutic targets. Virtual screening was also carried out against a selected target, using natural product libraries to identify potent inhibitors that could prove useful for campylobacteriosis treatment.

#### **2. Results**

#### *2.1. Pan-Genome and Resistome Analysis*

The pan-genome of 13 *C. ureolyticus* strains was composed of ~5500 accessory, 1182 core, and 400 unique genes. The pan-genome curve showed a Bpan value of 0.17 (i.e., <1), representing the open nature of the pan-genome of this pathogen (Figure 1A). The comparative study showed that each strain shared ~300–500 genes in the accessory genome fraction, with the highest number retrieved for the strain NCTC10941 (*n* = 535 genes), while strain ACS-301-V-Sch3b harbored the maximum number of unique genes (*n* = 114 genes). Additionally, five strains (ACS-301-V-Sch3b, LFYP111, LMG 6451, RIGS 9880, and UMB0112) lacked certain genes that were present in other strains (Supplementary File S2). The COG functional analysis provided insights into the conserved proteins and their specific metabolic pathways. The core genome showed the highest number of genes pertaining to metabolism, while the accessory genome contained the largest fraction of genes related to cellular processes and signaling. Further analysis revealed that the core genome was enriched in amino acid transport/metabolism, translational, ribosomal structure/biogenesis, and energy production/conversion apparatus, while the accessory genome was enriched in the cell wall, membrane, envelope biogenesis and inorganic ion transport metabolism. The unique genome comprised the main fraction of genes related to information processing and storage and those of unknown function (Figure 1B). was enriched in amino acid transport/metabolism, translational, ribosomal structure/bio‐ genesis, and energy production/conversion apparatus, while the accessory genome was enriched in the cell wall, membrane, envelope biogenesis and inorganic ion transport me‐ tabolism. The unique genome comprised the main fraction of genes related to information processing and storage and those of unknown function (Figure 1B).

while strain ACS‐301‐V‐Sch3b harbored the maximum number of unique genes (*n* = 114 genes). Additionally, five strains (ACS‐301‐V‐Sch3b, LFYP111, LMG 6451, RIGS 9880, and UMB0112) lacked certain genes that were present in other strains (Supplementary File S2). The COG functional analysis provided insights into the conserved proteins and their spe‐ cific metabolic pathways. The core genome showed the highest number of genes pertain‐ ing to metabolism, while the accessory genome contained the largest fraction of genes related to cellular processes and signaling. Further analysis revealed that the core genome

*Antibiotics* **2022**, *11*, x FOR PEER REVIEW 3 of 19

**Figure 1.** (**A).** A dot plot of the pan‐genome vs. the core genome of 13 *C. ureolyticus* strains. (**B**) COG distribution of genome fractions for *C. ureolyticus* strains. **Figure 1.** (**A).** A dot plot of the pan-genome vs. the core genome of 13 *C. ureolyticus* strains. (**B**) COG distribution of genome fractions for *C. ureolyticus* strains.

Additionally, the phylogenetic tree generated for the pan‐ and core genomes high‐ lighted monophyly in all 13 strains (Figure 2). Strains with greater proportions of shared genes have been found in close lineages, hence the proportion of pan‐genes between them Additionally, the phylogenetic tree generated for the pan- and core genomes highlighted monophyly in all 13 strains (Figure 2). Strains with greater proportions of shared genes have been found in close lineages, hence the proportion of pan-genes between them reveals their evolutionary closeness. In *C. ureolyticus*, the average proportion of common genes was ~61.86 percent. Phylogenetically, all strains represented in the pan- and core genomes were found to be a group in almost the same clade, indicating similarities.

**Figure 2.** Curved phylogenomic trees based on: (**A**) the pan‐genome of 13 *C. ureolyticus* strains (SBL is 0.71, while the distance in millions of years is 0.02); and (**B**) the core genome of the studied strains (SBL value is 0.08, while the distance in millions of years is 0.005). **Figure 2.** Curved phylogenomic trees based on: (**A**) the pan-genome of 13 *C. ureolyticus* strains (SBL is 0.71, while the distance in millions of years is 0.02); and (**B**) the core genome of the studied strains (SBL value is 0.08, while the distance in millions of years is 0.005).

reveals their evolutionary closeness. In *C. ureolyticus,* the average proportion of common genes was ~61.86 percent. Phylogenetically, all strains represented in the pan‐ and core genomes were found to be a group in almost the same clade, indicating similarities.

The resistome analysis of these 13 strains resulted in the identification of certain an‐ tibiotic resistance genes (ARGs), found in the core and accessory genomes but absent in the unique genome fraction. The core genome had only the *gyrA* resistance gene, which encodes for fluoroquinolone via antibiotics that target the alteration mechanism [15]. Ad‐ ditionally, it was observed that the *gyrA* gene contained a single‐nucleotide polymor‐ phism at position T86I. This mutation has previously been well characterized in *Campylo‐ bacter* species [16–19] and has been tested as a biomarker to detect fluoroquinolone re‐ sistance in *C. jejuni* [20]. However, no such report for *C. ureolyticus* exists yet. In the acces‐ sory genome, two resistance genes, OXA‐85 and TetM, were identified, which encode for OXA‐beta‐lactamases and tetracycline‐resistant ribosomal protection protein. These pro‐ teins showed resistance to carbapenem, cephalosporin, penam class, and tetracycline an‐ tibiotics through antibiotic inactivation. OXA‐85, encoded by FUS‐1, is a narrow‐spectrum beta‐lactamase. It has been isolated previously from *Fusobacterium nucleatum* subsp. *poly‐ morphum* [21] but not yet reported in any *Campylobacter* spp. This means that this gene could have been introduced in *C. ureolyticus* via horizontal gene transfer. TetM has been widely reported in many species before [22–27] and may be transposon‐ or chromosomal‐ The resistome analysis of these 13 strains resulted in the identification of certain antibiotic resistance genes (ARGs), found in the core and accessory genomes but absent in the unique genome fraction. The core genome had only the *gyrA* resistance gene, which encodes for fluoroquinolone via antibiotics that target the alteration mechanism [15]. Additionally, it was observed that the *gyrA* gene contained a single-nucleotide polymorphism at position T86I. This mutation has previously been well characterized in *Campylobacter* species [16–19] and has been tested as a biomarker to detect fluoroquinolone resistance in *C. jejuni* [20]. However, no such report for*C. ureolyticus* exists yet. In the accessory genome, two resistance genes, OXA-85 and TetM, were identified, which encode for OXA-betalactamases and tetracycline-resistant ribosomal protection protein. These proteins showed resistance to carbapenem, cephalosporin, penam class, and tetracycline antibiotics through antibiotic inactivation. OXA-85, encoded by FUS-1, is a narrow-spectrum beta-lactamase. It has been isolated previously from *Fusobacterium nucleatum* subsp. *polymorphum* [21] but not yet reported in any *Campylobacter* spp. This means that this gene could have been introduced in *C. ureolyticus* via horizontal gene transfer. TetM has been widely reported in many species before [22–27] and may be transposon- or chromosomal-mediated [25,27].

#### mediated [25,27]. *2.2. Differential Sequence Mining and Therapeutic Target Identification*

*2.2. Differential Sequence Mining and Therapeutic Target Identification* The identified 1882 core genes of *C. ureolyticus* were further used for downstream drug target identification. To prioritize potent drug targets, paralogous, essential, and non‐homologous sequence identifications are the crucial steps. CD‐Hit redundancy anal‐ ysis was used to eliminate paralogous sequences from the identified core fraction data, The identified 1882 core genes of *C. ureolyticus* were further used for downstream drug target identification. To prioritize potent drug targets, paralogous, essential, and nonhomologous sequence identifications are the crucial steps. CD-Hit redundancy analysis was used to eliminate paralogous sequences from the identified core fraction data, resulting in the discovery of 1178 paralogous sequences. The role of essential proteins helps in the survival of the pathogen. We used BLASTp with an E-value of 10−<sup>5</sup> to locate the translated gene product against the DEG and CEG databases. This resulted in the shortlisting of 757 and 659 essential proteins, respectively. The intersection of this dataset revealed 647 common proteins. Furthermore, there were 109 *C. ureolyticus* proteins non-homologous

to the gut bacterial proteome. These proteins were involved in critical cellular and metabolic processes. Selective targeting of these proteins could prevent cytotoxic reactions and harmful effects in the human host.

The ability of a small-molecule medication to regulate the activity of a therapeutic target is known as druggability. Proteins with a high frequency of sequence similarities (80% or more) were considered as druggable targets. Among 109 proteins, 35 were identified as druggable. These targets were then examined for their virulence as well. Among these, 26 proteins were found to be virulent (Table 1) and belonged to either metabolism, information or signaling, and cellular process families of proteins.


**Table 1.** Drug targets of *C. urolyticus* that are virulent in nature. The selected target is shown in bold.

UDP-3-O-Acyl-N-acetylglucosamine deacetylase (LpxC) was selected as a therapeutic target. It is a zinc-dependent metalloamidase and catalyzes the second and final step of lipid A production [28]. It eradicates the acetyl group from UDP-(3-O-(R-3 hydroxymyristoyl))-N-acetylglucosamine and results in the production of UDP-(3-O-(R-3 hydroxymyristoyl))-glucosamine and acetate. The ensuing enzymatic reaction converts UDP-(3-O-(R-3-hydroxymyristoyl))-glucosamine to lipid A, which is later integrated into lipopolysaccharides [29]. It is an appealing and validated target for the development of novel antibacterial medicines to treat Gram-negative infections because of its crucial role in lipid A production. Due to the absolute dependency of the microbes on this biosynthetic pathway, along with its nonexistence in humans [30], this potential drug target was chosen for further processing in *C. ureolyticus*. UDP‐3‐O‐Acyl‐N‐acetylglucosamine deacetylase (LpxC) was selected as a therapeu‐ tic target. It is a zinc‐dependent metalloamidase and catalyzes the second and final step of lipid A production [28]. It eradicates the acetyl group from UDP‐(3‐O‐(R‐3‐hy‐ droxymyristoyl))‐N‐acetylglucosamine and results in the production of UDP‐(3‐O‐(R‐3‐ hydroxymyristoyl))‐glucosamine and acetate. The ensuing enzymatic reaction converts UDP‐(3‐O‐(R‐3‐hydroxymyristoyl))‐glucosamine to lipid A, which is later integrated into lipopolysaccharides [29]. It is an appealing and validated target for the development of novel antibacterial medicines to treat Gram‐negative infections because of its crucial role in lipid A production. Due to the absolute dependency of the microbes on this biosynthetic pathway, along with its nonexistence in humans [30], this potential drug target was cho‐ sen for further processing in *C. ureolyticus*.

WP\_018713283.1 130

#### *2.3. Structure Prediction and Inhibitor Screening 2.3. Structure Prediction and Inhibitor Screening*

showing 93% residues in the favored region.

*Antibiotics* **2022**, *11*, x FOR PEER REVIEW 6 of 19

24 Aspartate 1‐decarboxylase Metabolism WP\_018713496.1 115

processes

<sup>23</sup> Biopolymer transporter ExbD Signaling and cellular

The structure of *E. coli* LpxC (PDB ID: 4MDT) was utilized as a template due to its high sequence identity of 44 percent with the LpxC of *C. ureolyticus*. A 3D stereochemical model (Figure 3A) revealed 95% residues in the core favorable area (Figure 3B) and 2.1% outliers, with a Z-score of 0.78. Around 26% of the proteins consisted of alpha helices and 38% of beta-sheets, while 8% were disordered. Since the metal cofactor is important for LpxC catalytic activity, the zinc ion was also kept bound with the structure for docking. The structure of *E. coli* LpxC (PDB ID: 4MDT) was utilized as a template due to its high sequence identity of 44 percent with the LpxC of *C. ureolyticus*. A 3D stereochemical model (Figure 3A) revealed 95% residues in the core favorable area (Figure 3B) and 2.1% outliers, with a Z‐score of 0.78. Around 26% of the proteins consisted of alpha helices and 38% of beta‐sheets, while 8% were disordered. Since the metal cofactor is important for LpxC catalytic activity, the zinc ion was also kept bound with the structure for docking.

**Figure 3.** (**A**) Modeled structure of the LpxC protein. (**B**) Ramachandran plot using PROCHECK, **Figure 3.** (**A**) Modeled structure of the LpxC protein. (**B**) Ramachandran plot using PROCHECK, showing 93% residues in the favored region.

Many effective LpxC inhibitors have been identified [31,32], with a range of chemical scaffolds and antibiotic profiles [31], but none of them has attained approval as an anti‐ bacterial moiety [31]. LpxC inhibitors have not been reported for *C. ureolyticus.* For this purpose, therefore, structure‐based inhibitor screening was attempted using a molecular docking strategy. This is an excellent method for determining how drugs/compounds in‐ teract with a biological target. To comprehend the LpxC‐drug binding mechanism and energy, analyses were conducted in MOE software. Compound libraries included LpxC inhibitors from the ZINC database, metabolites from *Streptomyces* spp., and postbiotics. Many effective LpxC inhibitors have been identified [31,32], with a range of chemical scaffolds and antibiotic profiles [31], but none of them has attained approval as an antibacterial moiety [31]. LpxC inhibitors have not been reported for *C. ureolyticus*. For this purpose, therefore, structure-based inhibitor screening was attempted using a molecular docking strategy. This is an excellent method for determining how drugs/compounds interact with a biological target. To comprehend the LpxC-drug binding mechanism and energy, analyses were conducted in MOE software. Compound libraries included LpxC inhibitors from the ZINC database, metabolites from *Streptomyces* spp., and postbiotics. *Streptomyces* spp. produce the largest number of antibiotics among all bacteria [33]. They decrease the fitness costs of secreted secondary metabolites, leading to enhanced yield and product diversity [34]. Postbiotics are the healthful metabolites of probiotics [35]. The use of such metabolites has been reported against *Vibrio campbelli* infection [36] and SARS-CoV-2 [37]. Docking revealed the binding interactions for the top selected compounds from each library (Table 2).

ZINC26844580 (S-value: −7.42), ZINC13474902 (S-value: −7.05), and ZINC13474878 (S-value: −6.90) were prioritized from the ZINC database LpxC inhibitors. Notoginsenoside St-4 (from *Lactobacillus caseii*, S-value: −8.59), Asiaticoside F (from *Lactobacillus caseii*, S-value: −8.43), and Paraherquamide E (from *Lactobacillus plantarum*, S-value: −8.02) were shortlisted from the postbiotics library, whereas ZINC08219868 (Phytoene, S-value: −7.20), ZINC08214943 (Lycopene, S-value: −7.03), and ZINC04742519 (Sparsomycin, S-value: −7.01) were selected from the streptomycin library. Figure 4 shows the 2D bonding interactions between the shortlisted compounds and the LpxC protein. Among the prioritized compounds, Notoginsenoside St-4, a dammarane-type saponin from the root of *Panax notoginseng* has been previously reported to hamper herpes simplex virus entry [38].


**Table 2.** Molecular docking analysis of shortlisted compounds from the studied libraries.

Asiaticoside F has also been reported from the leaves of *Centella asiatica* and is known to inhibit tumor necrosis factor-α [39]. Paraherquamide E has been isolated from the fungus *Penicillium charlesh* and is a known anti-nematicidal agent [40]. These postbiotics have previously been isolated from plants or fungi, but bacterial production of this compound heralds an exciting use of biogenic metabolites of probiotic strains.

Phytoene is produced by *Streptomyces scabrisporus* NF3 [41], *Streptomyces griseus* [42], archaeon *Thermococcus kodakarensis* [43], algae *Dunaliella* sp., [44], yeast *Xanthophyllomyces dendrorhous* [45] and halophilic bacteria [46], among other microorganisms. Lycopene has also been produced by microorganisms [47]. Phytoene and lycopene have radical scavenging activity [48]. Sparsomycin has also been previously isolated from *Streptomyces* spp. and possesses anti-tumor activity [49]. Inhibition of LpxC by these compounds is of interest, making for welcome additions to the class of existing inhibitors of this enzyme.

**Figure 4.** Two‐dimensional depictions of shortlisted compounds from: (**A**) the postbiotics library, i.e., (**i**) Notoginsenoside St‐4, (**ii**) Asiaticoside F, and (**iii**) Paraherquamide E; (**B**) the streptomycin library, i.e., (**i**) ZINC08219868 (Phytoene), (**ii**) ZINC08214943 (Lycopene), and (**iii**) ZINC04742519 (Sparsomycin); and (**C**) the ZINC LpxC library, i.e., (**i**) ZINC26844580, (**ii**) ZINC13474902, and (**iii**) ZINC13474878. **Figure 4.** Two-dimensional depictions of shortlisted compounds from: (**A**) the postbiotics library, i.e., (**i**) Notoginsenoside St-4, (**ii**) Asiaticoside F, and (**iii**) Paraherquamide E; (**B**) the streptomycin library, i.e., (**i**) ZINC08219868 (Phytoene), (**ii**) ZINC08214943 (Lycopene), and (**iii**) ZINC04742519 (Sparsomycin); and (**C**) the ZINC LpxC library, i.e., (**i**) ZINC26844580, (**ii**) ZINC13474902, and (**iii**) ZINC13474878.

#### Asiaticoside F has also been reported from the leaves of *Centella asiatica* and is known *2.4. ADMET Profiling of Shortlisted Compounds*

to inhibit tumor necrosis factor‐α [39]. Paraherquamide E has been isolated from the fun‐ gus *Penicillium charlesh* and is a known anti‐nematicidal agent [40]. These postbiotics have previously been isolated from plants or fungi, but bacterial production of this compound heralds an exciting use of biogenic metabolites of probiotic strains. Phytoene is produced by *Streptomyces scabrisporus* NF3 [41], *Streptomyces griseus* [42], archaeon *Thermococcus kodakarensis* [43], algae *Dunaliella* sp., [44], yeast *Xanthophyllomyces dendrorhous* [45] and halophilic bacteria [46], among other microorganisms. Lycopene has also been produced by microorganisms [47]. Phytoene and lycopene have radical scav‐ enging activity [48]. Sparsomycin has also been previously isolated from *Streptomyces* spp. and possesses anti‐tumor activity [49]. Inhibition of LpxC by these compounds is of inter‐ est, making for welcome additions to the class of existing inhibitors of this enzyme. *2.4. ADMET Profiling of Shortlisted Compounds* The majority of the shortlisted compounds were not found to inhibit the P450 class of enzymes, CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4. ZINC13474902 and ZINC13474878 inhibited several cytochrome P450s. Since these enzymes are involved in The majority of the shortlisted compounds were not found to inhibit the P450 class of enzymes, CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4. ZINC13474902 and ZINC13474878 inhibited several cytochrome P450s. Since these enzymes are involved in drug metabolism [50], deactivating or excreting them, their non-binding behavior indicates that the shortlisted drugs will not be rendered ineffective. Six compounds (ZINC26844580, ZINC13474902, ZINC13474878, Notoginsenoside St-4, Asiaticoside F, and Sparsomycin) were identified as non-permeable to the blood–brain barrier, while three (paraherquamide E, phytoene, and lycopene) were permeable. GI absorption was low for Notoginsenoside St-4, Asiaticoside F, and Sparsomycin. Skin sensitization was not observed, while significant permeability to Caco2 was seen. Caco2 being used as a model to study the intestinal displacement barrier, this means that the metabolites can cross the intestinal barrier. Two compounds (Notoginsenoside St-4 and Asiaticoside) were identified as P-glycoprotein inhibitors. This could have an impact on the tissue distribution of these compounds and may change the pharmacokinetics due to the drug–drug interaction properties of P-glycoprotein inhibitors [51]. Except for Notoginsenoside St-4 and Asiaticoside, all compounds followed Lipinski's rule of five. Compounds were non-toxic and AMES carcinogenicity was null (Table 3), except for ZINC13474878. These features make several of the screened inhibitors

ideal candidates for future therapeutic development. For the compounds that do not fall in the ideal category of drug-likeliness or safety profile, their scaffolds could be used to make better compounds. Additionally, a conjugate could be made for the compounds that have low gastrointestinal tract absorption.


**Table 3.** ADMET properties analysis of the nine compounds shortlisted.

#### *2.5. MD Simulation Analysis*

A classic MD analysis was performed to determine the free energies of complexes, i.e., MM/PBSAs were almost similar, with the lowest value for the LpxC–lycopene complex (Table 4). The atom-scale MD and the structural stabilities of the selected inhibitors and protein complex were determined in a time-dependent manner. RMSD, RMSF, hydrogen bonding, and Rg of the ligand-bound LpxC protein were studied to analyze the structural integrity during bonding. This was achieved through trajectory mapping graphs of the backbone carbon atoms.

**Table 4.** MM/PBSA values of the selected compounds.


The RMSD analysis for ZINC LpxC database compounds (ZINC26844580, ZINC13474902, and ZINC13474878) showed the stability of compounds during the whole 50 ns simulation study, with a mild fluctuation found in ZINC26844580 at ~30 ns. The complete system was found to be at equilibrium within the range of 0.15–0.20 nm, as is evident from Figure 5A. The RMSD analysis for the postbiotic shortlisted compounds (Asiaticoside F and Paraherquamide E) showed considerable fluctuation during the initial 10 ns of simulation. However, after 15 ns, the system progressively stabilized and remained stable until the simulation was completed, with an average deviation of 0.15–0.20 nm, indicating the simulated system's convergence. Notoginsenoside St-4 showed stability throughout the simulation study, as is apparent from Figure 5B. Compounds from the streptomycin library (ZINC08219868 (Phytoene), ZINC08214943 (Lycopene), and ZINC04742519 (Sparsomycin)) exhibited mild to moderate fluctuations during the 50 ns simulation, showing variations in RMSDs values in an acceptable range of 0.15–0.25 nm (Figure 5C).

(Figure 5C).

**Table 4.** MM/PBSA values of the selected compounds.

ZINC1347878 0.04 −19.31 −18.95 ZINC13474902 0.03 −19.31 −18.95 ZINC26844580 0.14 −19.31 −19.03 Notoginsenoside St‐4 −0.85 −19.31 −19.19 Asiaticoside F −0.82 −19.31 −19.20 Paraherquamide E −0.07 −19.31 −18.89 Phytoene −0.96 −19.31 −19.46 Lycopene −0.92 −19.31 −19.54 Sparsomycin 0.05 −19.31 −18.97

**Name Ligand Protein Protein−Ligand Complex**

The RMSD analysis for ZINC LpxC database compounds (ZINC26844580, ZINC13474902, and ZINC13474878) showed the stability of compounds during the whole 50 ns simulation study, with a mild fluctuation found in ZINC26844580 at ~30 ns. The complete system was found to be at equilibrium within the range of 0.15–0.20 nm, as is evident from Figure 5A. The RMSD analysis for the postbiotic shortlisted compounds (Asiaticoside F and Paraherquamide E) showed considerable fluctuation during the initial 10 ns of simulation. However, after 15 ns, the system progressively stabilized and re‐ mained stable until the simulation was completed, with an average deviation of 0.15–0.20 nm, indicating the simulated system's convergence. Notoginsenoside St‐4 showed stabil‐ ity throughout the simulation study, as is apparent from Figure 5B. Compounds from the streptomycin library (ZINC08219868 (Phytoene), ZINC08214943 (Lycopene), and ZINC04742519 (Sparsomycin)) exhibited mild to moderate fluctuations during the 50 ns simulation, showing variations in RMSDs values in an acceptable range of 0.15–0.25 nm

**Figure 5.** RMSD analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St‐4 (black), Asiaticoside F (red), and Para‐ herquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Lycopene) (red), and ZINC04742519 (Sparsomycin) (green). **Figure 5.** RMSD analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St-4 (black), Asiaticoside F (red), and Paraherquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Lycopene) (red), and ZINC04742519 (Sparsomycin) (green). *Antibiotics* **2022**, *11*, x FOR PEER REVIEW 11 of 19

However, to evaluate the fluctuations found in amino acid residues of the ligandbound protein during the simulation studies, RMSF data for these compounds were also plotted. Figure 6 depicts the RMSF plot for all nine shortlisted compounds, which displayed variations throughout the simulation. The average RMSF for the simulated system was determined to be 0.6 nm. RMSF was considerably stable for all of the amino acid residues of the ligand-bound protein in the simulated system. However, to evaluate the fluctuations found in amino acid residues of the ligand‐ bound protein during the simulation studies, RMSF data for these compounds were also plotted. Figure 6 depicts the RMSF plot for all nine shortlisted compounds, which dis‐ played variations throughout the simulation. The average RMSF for the simulated system was determined to be 0.6 nm. RMSF was considerably stable for all of the amino acid residues of the ligand‐bound protein in the simulated system.

**Figure 6.** RMSF analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St‐4 (black), Asiaticoside F (red), and Para‐ herquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Lycopene) (red), and ZINC04742519 (Sparsomycin) (green). To evaluate the compactness of the structure complexes, Rg was plotted (Figure 7). **Figure 6.** RMSF analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St-4 (black), Asiaticoside F (red), and Paraherquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Lycopene) (red), and ZINC04742519 (Sparsomycin) (green).

The Rg plot for the catalytic site during the initial 10 ns of simulation showed large fluc‐ tuations. However, after 20 ns, the system gradually moved towards compaction, contrib‐

To evaluate the compactness of the structure complexes, Rg was plotted (Figure 7). The Rg plot for the catalytic site during the initial 10 ns of simulation showed large fluctuations. However, after 20 ns, the system gradually moved towards compaction, contributing to the overall stability of the simulated protein structure in the presence of all shortlisted compounds. *Antibiotics* **2022**, *11*, x FOR PEER REVIEW 12 of 19

**Figure 7.** Rg analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St‐4 (black), Asiaticoside F (red), and Para‐ herquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Lycopene) (red), **Figure 7.** Rg analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St-4 (black), Asiaticoside F (red), and Paraherquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Lycopene) (red), and ZINC04742519 (Sparsomycin) (green).

and ZINC04742519 (Sparsomycin) (green). The interactions between the protein and compounds were further determined through hydrogen bonding analysis. A robust interaction was observed between ZINC26844580, ZINC13474902, ZINC13474878 (ZINC LpxC inhibitor compounds), Asi‐ aticoside F, Paraherquamide E, Notoginsenoside St‐4 (Postbiotics), and LpxC protein in terms of hydrogen interaction, ranging from 8‐10 bonds throughout the 50 ns simulation. The compound ZINC04742519 (Sparsomycin) from the streptomycin library mediated three hydrogen interactions, initially at 10 ns. However, after 15 ns of the simulation, five hydrogen bonds were observed constantly. ZINC08219868 (Phytoene) and ZINC08214943 The interactions between the protein and compounds were further determined through hydrogen bonding analysis. A robust interaction was observed between ZINC26844580, ZINC13474902, ZINC13474878 (ZINC LpxC inhibitor compounds), Asiaticoside F, Paraherquamide E, Notoginsenoside St-4 (Postbiotics), and LpxC protein in terms of hydrogen interaction, ranging from 8-10 bonds throughout the 50 ns simulation. The compound ZINC04742519 (Sparsomycin) from the streptomycin library mediated three hydrogen interactions, initially at 10 ns. However, after 15 ns of the simulation, five hydrogen bonds were observed constantly. ZINC08219868 (Phytoene) and ZINC08214943 (Lycopene) formed no hydrogen bonds, indicating hydrophobic interaction with the LpxC protein (as shown in Figure 8).

(Lycopene) formed no hydrogen bonds, indicating hydrophobic interaction with the LpxC protein (as shown in Figure 8). Eventually, the LpxC protein in combination with the nine compounds was found to be stable during the 50 ns simulation. This confirms that the docking interactions that the complexes formed were stable.

**Figure 8.** Hydrogen bond analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St‐4 (black), Asiaticoside F (red), and Paraherquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Ly‐ **Figure 8.** Hydrogen bond analysis for the shortlisted compounds: (**A**) ZINC26844580 (green), ZINC13474902 (red), and ZINC13474878 (black); (**B**) Notoginsenoside St-4 (black), Asiaticoside F (red), and Paraherquamide E (green); and (**C**) ZINC08219868 (Phytoene) (black), ZINC08214943 (Lycopene) (red), and ZINC04742519 (Sparsomycin) (green).

#### copene) (red), and ZINC04742519 (Sparsomycin) (green). **3. Discussion**

Eventually, the LpxC protein in combination with the nine compounds was found to be stable during the 50 ns simulation. This confirms that the docking interactions that the complexes formed were stable. **3. Discussion** *Campylobacter* infections are usually caused by the consumption of contaminated poultry products [9]. Although the frequency of human sickness caused by oral Campylobacter species is lower than that caused by zoonotic *Campylobacter* species, it is considered that non-*jejuni/coli Campylobacter* illness is underreported due to a lack of good culture-based detection methods [5]. Recently, *C. ureolyticus* has been reported to cause gastroenteritis in both developed and developing countries [3].

*Campylobacter* infections are usually caused by the consumption of contaminated poultry products [9]. Although the frequency of human sickness caused by oral Campyl‐ obacter species is lower than that caused by zoonotic *Campylobacter* species, it is consid‐ ered that non‐*jejuni/coli Campylobacter* illness is underreported due to a lack of good cul‐ ture‐based detection methods [5]. Recently, *C. ureolyticus* has been reported to cause gas‐ troenteritis in both developed and developing countries [3]. The current study intends to prioritize potential therapeutic targets, as well as lead drug candidates, against *C. ureolyticus* based on the results of a comprehensive pan‐ge‐ nome investigation. In this study, we used the pan‐genome to detect and characterize the *C. ureolyticus* antimicrobial resistome as well. The pan‐genome concept has previously been utilized to distinguish between commensal and pathogenic strains [52,53] and we were able to investigate the pan‐resistome of the 13 strains as well. The pan‐resistome The current study intends to prioritize potential therapeutic targets, as well as lead drug candidates, against *C. ureolyticus* based on the results of a comprehensive pan-genome investigation. In this study, we used the pan-genome to detect and characterize the *C. ureolyticus* antimicrobial resistome as well. The pan-genome concept has previously been utilized to distinguish between commensal and pathogenic strains [52,53] and we were able to investigate the pan-resistome of the 13 strains as well. The pan-resistome analysis is useful for determining the ARG diversity of the species and revealing the occurrence of species-specific ARGs. Previously, Costa et al. [12] identified that some ARGs were conserved between *C. jejuni* and *C. lari*. In our analysis, we also found conservation of fluoroquinolone-resistant *gyrA* gene between *C. jejuni* and *C. ureolyticus* and its specific mutation T86I. TetM also exists in *C. jejuni* [54], but OXA-85 was unique to *C. ureolyticus* and has only been previously reported in *Fusobacterium nucleatum* subsp. *polymorphum* [21].

analysis is useful for determining the ARG diversity of the species and revealing the oc‐ currence of species‐specific ARGs. Previously, Costa et al. [12] identified that some ARGs For drug target mining, the core genome of *C. ureolyticus* was subjected to subtractive genome analysis and 35 targets were obtained. Previously, 34 targets have been reported in *C. jejuni* [55]. Proteins with a role in metabolic pathways (especially lipopolysaccharide synthesis), secondary metabolite synthesis, two-component systems, and multi-drug resistance were identified as drug targets and some of them were similar to previously reported targets in *C. jejuni*. Eventually, UDP-3-O-acyl-N-acetylglucosamine deacetylase (LpxC) was selected as a potent drug target against *C. ureolyticus*. This protein previously has been identified as a potential drug target against *C. jejuni* [55], *Pseudomonas aeruginosa* [32], and several Gram-negative pathogens [56]. The 3D structure of this protein was modeled and a library of LpxC zinc inhibitors, probiotics, and streptomycins (957 total compounds) was screened against it. ADMET profiling was applied to identify associated adversities. This helps to identify decisive factors in relation to a molecule's potential to be further processed for use as a drug. Harmonizing toxicity and ADME helps to summarize the criteria for the ideal compound, and thus compounds can be narrowed down at the initial screening stage to decide whether to proceed further. Our studied compounds showed several good properties, and their scaffolds could be utilized further for designing and optimizing drugs against *C. ureolyticus* and possibly other bacteria harboring LpxC.

Despite the powerful potential of the screening and validation of binding properties with rigorous simulations, our study has its limitations. Computational predictions do not give 100% accurate results and failure is possible to some extent. Therefore, the results need to be interpreted with caution for use in clinical settings and pharmaceutical endeavors. Lab validations must be carried out before moving on to clinical trials of high-throughput screened molecules. Better algorithms and more computational power may be available in the future to resolve unstudied and necessary features of drug-like molecules but current investigations are rife with errors. This should not, however, discourage studies at the computational level; rather, it should stimulate further endeavors.

#### **4. Materials and Methods**

#### *4.1. Data Retrieval*

The complete genomes of *C. ureolyticus* strains (*n* = 22) were retrieved from the NCBI database [57]. Redundant species data were removed and 13 genomes were left (details in Supplementary File S1).

#### *4.2. Pan-Genomic Analysis*

The pan-genomic analysis was performed using the BPGA tool [58], the parameters for which were defined in our previous work [59]. The clustering of inputted FASTA files was performed using the USERACH algorithm [60], with defined cutoff values of 70% for homologous genes. The pan- and core genome dot plots were generated. Furthermore, the alignment of core, accessory, and unique genomes was performed using the MUSCLE tool [61] with default parameters. UPGMA based on maximum likelihood was used to infer the phylogenetic relationships between strains. Additionally, the resistome of the identified core, unique, and accessory genes was investigated via the Comprehensive Antibiotic Resistance Database (CARD) [62]. CARD employs the automated BLASTp algorithm and a cutoff of 70% for alignment. The homologous gene set was functionally annotated through the Clusters of Orthologous Groups of proteins (COG) database [63]. The genome fractions were annotated for gene functions.

#### *4.3. Drug Target Prioritization*

The obtained core genome was subjected to the standalone CD-HIT program [64] and homologous gene products with 60% similarity exclusion criteria were clustered. CD-HIT also reduces redundancy. Using an in-house script, all of the clusters were retrieved from the CD-HIT output file and further investigated for gene essentiality. The obtained data were BLAST-searched against the CEG [65] and DEG [66] databases (with an E-value of 10−10) and a bit score of 100. The proteins found essential in both databases were used for further target identification. The significantly preferable drug targets must be nonhomologous to the human genome in order to avoid the cross-reactivity of drug candidates. For this purpose, the translated products of the obtained sets of genes were subjected to BLASTp against the human proteome (conditions: E-value: 10−<sup>3</sup> ; gap penalty: 11; gap extension penalty: 1).

Furthermore, a comparative analysis was performed between obtained non-homologous proteins and the proteomes of the 83 useful species of human microbial gut flora. The acquired set of proteins was processed against the DrugBank database. The identified druggable targets were also filtered through the virulence factor database (VFDB) [67] using an E-value of 10−<sup>3</sup> .

#### *4.4. Structural Modeling*

The structure of the selected protein LpxC was modeled using SWISS-MODEL [68]. This tool uses a homology-based approach for deciphering protein structures. The final model was selected based on its percent identity and query coverage to its template LpxC protein from *Escherichia coli* (PDB ID: 4MDT). Furthermore, the validation of the modeled structure was achieved using a Ramachandran plot. Secondary structure evaluation, Zscore prediction, and stereochemical quality (via PROCHECK) were also assessed through the SWISS-MODEL server [69]. The secondary structure and the disordered regions were predicted using the Phyre2 server [70].

#### *4.5. Virtual Screening*

The interactions between the screened streptomycins (*n* = 737), postbiotics (*n* = 78), and LpxC zinc database inhibitors (*n* = 142) against the LpxC protein were appraised using a molecular docking-based screening approach [71]. Using MOE v2019 software, the pre-docking protonation of LpxC was carried out with the following parameters: flip: all atoms; atoms: all atoms; titrate: all atoms; solvent: 80, disconnected metal treatment: enabled; temperature = 300 K; salt = 0.1; pH = 7; van der Waals: 800R3, with cutoff (A): 10; dielectric: 1, 1; protection = none; electrostatics: GB/VI, with cutoff (A): 15. Correspondingly, the following parameters were set for energy minimization: forcefield: Amber99; gradient: 0.05; fix hydrogen and partial charges = yes. The 957 compounds from the mentioned libraries were taken and screened against LpxC. The screening parameters were: placement = Triangle Matcher; refinement = forcefield; rescoring 1 = London dG; rescoring 2 = affinity dG. The top docked compounds were chosen based on their S-values. The interactions of the docked protein–ligand complexes were examined to explore the hydrogen bonding and hydrophobic interactions among receptors and ligand atoms within a range of 5 Å [72].

#### *4.6. ADMET Profiling of the Shortlisted Drug Candidates*

After the molecular docking study, pharmacokinetic properties of the shortlisted lead drug compounds, such as absorption, distribution, metabolism, excretion (ADME), and toxicology, were determined using Swiss ADME [73]. Following this, the selected compounds were processed using the pkCSM tool (http://biosig.unimelb.edu.au/pkcsm/ (accessed on 28 December 2021)) to determine the optimal drug candidate with high penetration and the least number of side effects. SwissADME presented skin permeation values, and pkCSM provided drug tolerance values for a range of organisms, including humans. Drug safety evaluation is critical for new drugs. Early knowledge of medication toxicity and side effects is crucial for drug induction in the development pipeline [74,75]. With this in mind, the highest tolerated dosages, the impacts on various species, and the excretion of drug characteristics were all determined.

#### *4.7. Molecular Dynamics (MD) Simulation*

The stability and flexibility analyses of the identified compounds against the shortlisted drug targets were carried out through MD simulation studies. MM/PBSA values were calculated using MOE software [76], while the atom-scale MD simulations were carried out with the GROMACS 5.1.2 package24 using the gromos54a71 forcefield. The ligand topology parameters were generated using the Automated Topology Builder (ATB v3.0) server. Using periodic boundary conditions, the complex was placed at a distance of 1.0 from the box edge in a dodecahedron box, with the addition of solvent molecules from the SPC water model. The insertion of counter ions into the solvated system helps to neutralize the entire system. The neutralized system was minimized using the steepest descent method, with a maximum force of 1000 kJ mol−<sup>1</sup> . The minimized systems were passed for 100 ps of NVT and NPT equilibration before the production dynamics to raise the system temperature by 300K and maintain a constant pressure of 1 bar for the system utilizing thermostat and Berendsen barostat algorithms. The Linear Constraint Solver (LINCS) algorithm was used to constrain all the bonds, and the Particle Mesh Ewald (PME) method was employed to compute the long-range electrostatics with a cutoff value of 1.0 nm. Finally, a 50 ns production run was carried out, with coordinates and energies being saved every 10 ps in the output trajectory file, as per the procedure. Root Mean Square Deviation (RMSD), Root Mean Square Fluctuation (RMSF), and Radius of Gyration (Rg) were plotted to evaluate the system's stability.

#### **5. Conclusions**

To the best of our knowledge, the current study is the first to map the resistome of *C. ureolyticus* as well as drug targets for this species. Resistance profiling is not only useful for the scientific community working in this domain but also for clinicians and health policymakers. Our computational analysis will aid the scientific community in further investigating the proposed therapeutic drug targets and inhibitors using experimental methodologies. The identified therapeutic proteins might be studied further in experimental investigations to prevent *C. ureolyticus*-mediated gastroenteritis, along with the compounds that we screened against the LpxC enzyme. The prioritized compounds ZINC26844580, ZINC13474902, ZINC13474878, Notoginsenoside St-4, Asiaticoside F, Paraherquamide E, Phytoene, Lycopene, and Sparsomycin need to be tested further in vivo and in vitro for their druggability.

**Supplementary Materials:** The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/antibiotics11050680/s1, File S1: Details of genome data used in this study, File S2: Pan-genome statistics of the *C. ureolyticus* strains.

**Author Contributions:** Conceptualization, Z.B. and R.U.; methodology, Z.B. and R.U.; software, M.M.M., A.A. and S.A.; validation, Z.B., K.J. and K.K.; formal analysis, Z.B., K.K. and K.J.; resources, M.M.M., S.A. and A.A.; data curation, Z.B., K.J. and K.K.; writing—original draft preparation, K.J. and K.K.; writing—review and editing, Z.B.; visualization, M.M.M., S.A. and A.A.; supervision, R.U.; project administration, Z.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Publicly analyzed datasets used in this study can be found in NCBI GenBank by reference to the details provided in Supplementary File S1. Any other raw or generated data are available from the corresponding authors without reservation.

**Acknowledgments:** We would like to thank 'The SEARLE Company Ltd. (TSCL)', Karachi, Pakistan, for its capacity-building support of the Jamil-ur-Rahman Center for Genome Research, PCMD, ICCBS, University of Karachi, Pakistan.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

<sup>1.</sup> Burgos-Portugal, J.A.; Kaakoush, N.O.; Raftery, M.J.; Mitchell, H.M. Pathogenic potential of Campylobacter ureolyticus. *Infect Immun.* **2012**, *80*, 883–890. [CrossRef] [PubMed]

