1. Introduction
Biofilms are communities of microorganisms that can colonize both biotic and abiotic surfaces, thus playing a significant role in the persistence of bacterial infection. Microorganisms living in biofilms are embedded within a self-produced matrix of extracellular polymeric substances (EPS) [
1,
2]. The extracellular matrix supports cell-to-cell interaction in biofilms and contributes to several processes, including cell attachment, cell-to-cell connection, structural support, and antimicrobial tolerance. This matrix produced by bacteria is mainly composed of proteins, enzymes, polysaccharides, signaling molecules, and extracellular DNA [
3].
The change from free-living bacteria to biofilm involves the production of adhesins and extracellular matrix compounds that interconnect cells in biofilms [
4].
This microbial assemblage provides a protected mode of growth, with resistance to antimicrobials and to killing by the host immune system. Biofilm development enables pathogenic microorganisms to survive in hostile environments and to disperse and colonize other niches [
5].
In the medical field, the biofilm phenotype has been recognized as an active agent in the development of many infections, the most common being related to the use of medical devices [6]. The National Institutes of Health have reported that 65% of microbial infections and 80% of chronic infections are associated with biofilm formation [7]. A better understanding of the molecular and physiological mechanisms of biofilm formation will provide tools for its inhibition.
Cyclic diguanylate (c-di-GMP) is an important second messenger involved in the biofilm organization [
8]. This signaling molecule modulates bacterial growth phenotypes including, but not limited to, biofilm formation, virulence factor production, and motility [
8,
9].
Across bacteria, the enzymatic GG[D/E]EF and EAL domains found in the diguanylate cyclase PleD are highly conserved, and many bacterial genomes contain several copies of diguanylate cyclase (DGC) proteins with these domains [
10]. A consensus of all the GG[D/E]EF, EAL, and HD-GYP domains in bacterial genomes is available at
http://www.ncbi.nlm.nih.gov/Complete_Genomes/c-di-GMP.html [
11].
The synthesis of c-di-GMP depends on diguanylate cyclases (DGCs) which use two molecules of guanosine-5′-triphosphate (GTP) to obtain c-di-GMP in a two-step reaction; whereas phosphodiesterases (PDEs) hydrolyze c-di-GMP to linear di-GMP [
9,
12].
Inhibiting DGCs is a potential strategy for preventing biofilm formation and has been extensively studied using the PleD protein (EC: 2.7.7.65) as a model DGC, both in silico (using virtual screening) and in in vitro studies of biofilm formation [
8,
9,
12,
13,
14].
In this work, we used the DGC PleD from Caulobacter crescentus as a model, taking into account that the catalytic sites of DGCs are structurally conserved (Figure S1 in Supplementary Materials).
Caulobacter crescentus is a Gram-negative bacterium. It has been an important model organism, not only for investigating the regulation of the cell cycle, asymmetric cell division, and cellular differentiation, but also for studying the activation and synthesis of the second messenger involved in environmental adaptation [
15]. The first diguanylate cyclase activity related to the synthesis of the second messenger, c-di-GMP, as a response regulator of biofilm formation, was experimentally demonstrated with
Caulobacter crescentus [
16]. During the life cycle, the bacterium produces a sessile, surface-adherent stalked cell, and a motile swarmer cell [
17]. The stalked cell stage offers a fitness advantage by anchoring the cell to surfaces to form biofilms and/or to exploit nutrient sources [18]. During biofilm formation, high cellular levels of c-di-GMP promote exopolysaccharide production and surface adhesion; in contrast, in the motile swarmer stage, low c-di-GMP concentrations result in flagellar gene expression, increasing cellular motility [
10]. A similar response has been observed in the human opportunistic pathogen
Pseudomonas aeruginosa, in which changes in the intracellular concentration of c-di-GMP determine its physiology and pathogenesis. Thus, during biofilm formation, which begins with the attachment of planktonic bacteria to a surface, an increase in the intracellular c-di-GMP level was observed [
19].
The increase in infections caused by multi-resistant bacteria drives the need to discover new drugs. Bioactive molecules synthesized by microorganisms, plants, and animals are efficient natural compounds against microbes. These substances act as chemical defenses in various competitive environments [20]. One approach to selecting natural DGC inhibitors is to search natural product databases, which can supply substances (especially small compounds) with good binding affinity for the target enzyme.
Since the discovery of penicillin, natural products have played an important role in the discovery and development of new drugs [21, 22]. Between 1981 and 2014, 43.6% of approved anti-infective drugs and 40.7% of anticancer agents were based on natural products or derivatives thereof [22], and many natural products have become templates for drug design because they often present a ligand–protein binding motif [23]. Some natural products have shown antibacterial activity, as described by Nofiani et al. [
24] and Emiru et al. [
25].
In the present study, diguanylate cyclase, with the PleD protein from Caulobacter crescentus as a model, was evaluated against 224,205 molecules from natural sources in the ZINC15 database. Virtual screening (VS), molecular dynamics (MD) simulations, and binding free energy calculations with the Molecular Mechanics Poisson–Boltzmann Surface Area (MM/PBSA) method were used for the hit search. A summary of the complete procedure is shown in Figure 1. This work aimed to propose potential hit substances from a natural compounds database that can be optimized to generate promising lead candidates acting as competitive ligands for diguanylate cyclase proteins.
2. Results and Discussion
The diguanylate cyclase PleD of Caulobacter crescentus belongs to the response regulator family, and its activity is controlled by phosphorylation involving two cognate kinases, DivJ and PleC. PleD is required for polar differentiation in the bacterial cell cycle. Bacterial cells without functional PleD are hypermotile and fail to accomplish the swarmer-to-stalked cell transition [
16,
27,
28,
29]. PleD protein contains an intrinsic nucleotide cyclase activity which converts two molecules of GTP into 3′,5′-cyclic diguanylic acid (c-di-GMP) [
16], a molecule of great interest which regulates surface-adhesion properties and motility in bacteria [
15,
30]. In addition, c-di-GMP is involved in formation and persistence of bacterial biofilm [
31]. Given that c-di-GMP is exclusively found in bacteria, c-di-GMP can be a potential target for medicinal applications.
In the pseudo-active structure (PDB-ID: 2V0N), the substrate binds with its terminal phosphates positioned close to the P-loop, together with two Mg²⁺ ions, which are coordinated by the phosphates and two invariant carboxylates (ASP327 and GLU370). Chain A from an inactive structure (PDB-ID: 1W25) was chosen for MD and VS, and a GTP molecule and two Mg²⁺ ions were added to the catalytic site of the protein, keeping the position of the substrate (GTP) observed in the pseudo-active structure. In addition, ASP52 was phosphorylated, and parameters for the phosphate group were obtained following the procedure described elsewhere [32]. The objective of this work is to find compounds in a natural products library that compete for the active site and thereby prevent the formation of cyclic di-GMP (c-di-GMP). The c-di-GMP second messenger represents a signaling system that regulates many bacterial behaviors and is of key importance for driving the lifestyle switch between motile loner cells and biofilm formers [
19].
Activation of PleD proceeds via phosphorylation-induced dimerization. Upon modification of Asp53 of the Rec domain, the intramolecular packing of the Rec and (Rec-Rec’)2 “stem” is improved [
33]. The system was therefore prepared so that a possible transition from the inactive to the active state could occur during the molecular dynamics run, approximating a realistic scenario for protein dynamics.
Molecular dynamics was run for the PleD–GTP complex for 20 ns. The PleD monomer (chain A) was chosen, and a GTP molecule was added to the catalytic site of the structure, together with Mg ions and phosphorylation at ASP52. Charges and atom types for each ligand were assigned from the GAFF force field.
The system converges at 1.2 ns of simulation (
Figure S2 in Supplementary Materials). From 1.2 ns onward, an average structure over the rest of the trajectory was computed with the Chimera package for use in VS [
26] (
Figure 2). First, molecular docking was done with the GTP molecule, obtaining an ICM energy score of about −27.64 kcal/mol (
Table S1 in Supplementary Materials). VS was then performed with the natural compounds library from ZINC15, using the GTP molecule from the average structure as reference. The coordinates of the ligands were extracted from the SDF file of the ZINC15 natural compounds library. Bond orders, tautomeric forms, stereochemistry, hydrogen atoms, and protonation states were assigned automatically by the ICM package with default parameters [
34]. Atom types, charges, and parameters were obtained from Merck Molecular Force Field (MMFF) for each ligand in the VS procedure. Geometry optimization was carried out using the automatic converting procedure in ICM [
34].
ICM scores for the best 100 hit molecules are summarized in Table S1 in Supplementary Materials. Their ICM score energies range from −47.13 to −31.62 kcal/mol. Based on the results of the VS procedure, the first 100 natural compounds had a better ICM score energy than GTP (Figure 3). The main idea of this work is to find a compound from a natural product source that can block the active site of the PleD protein and thereby block the formation of c-di-GMP. To this end, a more accurate calculation of the binding free energy was carried out: MM/PBSA was used to compute the binding free energies of the 35 molecules with the best ICM energy scores from VS.
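The selection step described above (keep the compounds that score better than GTP, then carry the best-ranked ones forward) can be sketched as follows. This is only an illustration, not the ICM workflow itself: the GTP score and the two extreme hit scores come from the text, whereas the compound names and the remaining scores are hypothetical placeholders.

```python
GTP_SCORE = -27.64  # ICM score of GTP reported in the text (kcal/mol)


def select_hits(scores, reference=GTP_SCORE, top_n=35):
    """Return up to top_n (compound, score) pairs scoring below the reference.

    Lower (more negative) scores are treated as better, as in the text.
    """
    better = [(name, s) for name, s in scores.items() if s < reference]
    better.sort(key=lambda pair: pair[1])  # most negative score first
    return better[:top_n]


# Hypothetical mini-library: the names are placeholders, not ZINC15 IDs.
example_scores = {
    "cmpd_A": -47.13,  # best hit score reported in the text
    "cmpd_B": -31.62,  # score of the 100th hit reported in the text
    "cmpd_C": -25.00,  # hypothetical: worse than GTP, so it is discarded
    "cmpd_D": -40.00,  # hypothetical
}
hits = select_hits(example_scores, top_n=2)
```

With `top_n=35`, the same function mirrors the cut-off used to pass compounds on to the MM/PBSA stage.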
2.1. Molecular Mechanics Poisson–Boltzmann Surface Area (MM/PBSA)
The best 35 ligands from the VS simulations were selected for 10 ns of MD simulation each. From these results, the 6 molecules with the best MM/PBSA binding free energy were studied further. For the 35 ligands, the energy values ranged between −121.39 (ZINC04501392) and −12.84 kcal/mol (ZINC15956889). The complete list is shown in Table S2 in Supplementary Materials. By comparison, the GTP molecule has a binding free energy of about −178.09 kcal/mol. The difference in energy between the VS and MM/PBSA calculations lies not only in the method but also in the parameters and charges used for each type of calculation (see methods). The six ligands with the best binding free energy according to the MM/PBSA calculations are listed in Table 1 with their ZINC15 IDs. Figure 4 shows 2D structure representations of GTP and these molecules. Previous studies have reported antibacterial activity for some of these ligands or their derivatives; for instance, citrate can be used to functionalize and synthesize nanoparticles with antibacterial activity [
35,
36]. In searching for new nematicidal factors from
Bacillus thuringiensis against
Meloidogyne incognita (a plant-parasitic nematode), a component identified as trans-aconitic acid (TAA) was found. The acid showed high nematicidal activity, suggesting that TAA is specifically synthesized by the bacillus as a virulence factor. Moreover, TAA has shown nematicidal activity against
Bacillus thuringiensis bacterium [
37].
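The MM/PBSA calculation provides information on the strength of the binding between protein and ligand. For this, the algorithm considers the binding free energy equal to the free energy of the complex minus the sum of the free energy of the protein plus the free energy of the ligand, see Equation (1):

$$\Delta G_{\mathrm{binding}} = G_{\mathrm{complex}} - \left(G_{\mathrm{protein}} + G_{\mathrm{ligand}}\right) \tag{1}$$

Each $G$ component from Equation (1) can be calculated as:

$$G_{x} = \langle E_{\mathrm{MM}} \rangle - TS + \langle G_{\mathrm{solvation}} \rangle \tag{2}$$

where $\langle E_{\mathrm{MM}} \rangle$ is the average molecular mechanics potential energy in vacuum; $T$ is temperature; $S$ is entropy; and $\langle G_{\mathrm{solvation}} \rangle$ is the free energy of solvation. The term $E_{\mathrm{MM}}$ includes the energy of both bonded and non-bonded interactions ($E_{\mathrm{bonded}}$ and $E_{\mathrm{nonbonded}}$). $E_{\mathrm{bonded}}$ considers bond, angle, dihedral, and improper interactions. $E_{\mathrm{nonbonded}}$ considers electrostatic ($E_{\mathrm{elec}}$) and van der Waals ($E_{\mathrm{vdW}}$) interactions, see Equation (3):

$$E_{\mathrm{MM}} = E_{\mathrm{bonded}} + E_{\mathrm{nonbonded}} = E_{\mathrm{bonded}} + \left(E_{\mathrm{vdW}} + E_{\mathrm{elec}}\right) \tag{3}$$

$G_{\mathrm{solvation}}$ includes polar and non-polar free energies; the polar contribution ($G_{\mathrm{polar}}$) is calculated using the Poisson–Boltzmann equation, while the non-polar contribution ($G_{\mathrm{nonpolar}}$) includes repulsive and attractive forces between solute and solvent generated by cavity formation and van der Waals interactions, see Equation (4):

$$G_{\mathrm{solvation}} = G_{\mathrm{polar}} + G_{\mathrm{nonpolar}} \tag{4}$$

For more information about how the equation works and how it is calculated, consult the work of Kumari et al. [38], Baker et al. [39], Wisser et al. [40], and Konecny et al. [41].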
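As an illustration of how the terms described above combine into the binding free energy of Equation (1), a minimal sketch is given below. The class and function names are hypothetical, every numerical value is invented for the example, and the entropic term is set to zero purely for simplicity; this is not the implementation used to produce the reported energies.

```python
from dataclasses import dataclass


@dataclass
class FreeEnergyTerms:
    """Average energy terms for one species (same energy unit throughout)."""
    e_bonded: float    # bond, angle, dihedral, and improper terms
    e_vdw: float       # van der Waals part of the non-bonded energy
    e_elec: float      # electrostatic part of the non-bonded energy
    g_polar: float     # polar solvation free energy
    g_nonpolar: float  # non-polar solvation free energy
    t_s: float = 0.0   # entropic term T*S (zero here: an assumption of this sketch)

    def total(self) -> float:
        """Free energy of the species: E_MM - T*S + G_solvation."""
        e_mm = self.e_bonded + self.e_vdw + self.e_elec
        g_solvation = self.g_polar + self.g_nonpolar
        return e_mm - self.t_s + g_solvation


def binding_free_energy(complex_terms, protein_terms, ligand_terms):
    """Free energy of the complex minus that of protein plus ligand."""
    return complex_terms.total() - (protein_terms.total() + ligand_terms.total())


# Hypothetical values, chosen only to make the arithmetic easy to follow.
complex_terms = FreeEnergyTerms(100.0, -50.0, -300.0, 120.0, -10.0)
protein_terms = FreeEnergyTerms(80.0, -20.0, -150.0, 60.0, -6.0)
ligand_terms = FreeEnergyTerms(20.0, -5.0, -50.0, 30.0, -2.0)
dg_bind = binding_free_energy(complex_terms, protein_terms, ligand_terms)
```

Keeping the per-species terms in one small record makes it easy to see which contribution drives a difference between two ligands.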
A lower value of ΔG implies a better coupling between protein and ligand. Table 2 shows the values obtained for the 6 ligands with the best binding free energy. The results show that none of the molecules tested had a binding free energy equal or close to that of GTP; the differences between the GTP and ligand energies were: Ligand1 = 78 kcal/mol; Ligand2 = 74.77 kcal/mol; Ligand3 = 58.51 kcal/mol; Ligand4 = 56.76 kcal/mol; Ligand5 = 70.16 kcal/mol; and Ligand6 = 77.11 kcal/mol. The largest difference is exhibited by Ligand1, while the smallest difference is exhibited by Ligand4. To gain better insight into the residues that contribute most to the binding free energy, energy decomposition was performed (see methodology section for details); with this, the total energy is presented as a contribution per residue. With this information, residues can be classified into those with favorable and those with non-favorable interactions.
Figure 5 and
Figure 6 only considered residues between 300 and 457, because, as shown in
Figure 5a, the first 300 residues do not contribute to the binding free energy. For
Figure 5 and
Figure 6, negative values are considered favorable interactions. Interactions with two of the three magnesium ions present in the protein play a role in coordinating the carboxyl groups in all 6 ligands, as well as GTP, which helps to gain stability. Taking GTP as reference, a highly favorable interaction with ARG445 is observed that is not present in any of the 6 ligands evaluated, suggesting that this is an important residue in the binding pocket that helps the complex reach a more stable conformation. The ligands show common favorable interactions with LYS331, LYS441, PHE329, and PHE330, except for Ligand2, presumably because this ligand is the shortest one, which limits its interaction with the protein. Positive energies suggest non-favorable interactions with protein residues: all ligands show non-favorable interactions with ASP326, ASP328, GLU369, ILE327, and GLU370. These five residues are common to all ligands and have the highest positive values. All ligands (except Ligand4 and Ligand6) show a non-favorable interaction with ASP434. For GTP, the non-favorable interactions are mainly with ASP326 and GLU369, as shown graphically in
Figure 7.
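The sign convention used in this decomposition (negative contribution = favorable, positive = non-favorable) can be expressed in a few lines. The sketch below is illustrative only: the residue labels are taken from the discussion above, but the energy values are hypothetical placeholders, and `RES100` is an invented residue standing in for the many residues that contribute nothing.

```python
def classify_residues(contributions, tolerance=0.0):
    """Split per-residue energy contributions into favorable and non-favorable.

    Negative values are favorable and positive values non-favorable;
    values within +/- tolerance of zero are treated as non-contributing.
    """
    favorable = {}
    unfavorable = {}
    for residue, energy in contributions.items():
        if energy < -tolerance:
            favorable[residue] = energy
        elif energy > tolerance:
            unfavorable[residue] = energy
    return favorable, unfavorable


# Hypothetical per-residue contributions (values are NOT taken from the study).
example = {
    "ARG445": -20.0,
    "LYS441": -15.0,
    "ASP326": 12.0,
    "GLU369": 9.0,
    "RES100": 0.0,  # invented residue with no contribution
}
favorable, unfavorable = classify_residues(example)
```

A non-zero `tolerance` can be used to ignore residues whose contribution is within numerical noise.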
2.1.1. Root Mean Square Deviation (RMSD)
From the 10 ns molecular dynamics simulations used for the MM/PBSA calculation, a Root Mean Square Deviation (RMSD) analysis was performed. The RMSD was used to verify protein stability over the entire simulation [42]; smaller deviations in these values indicate a more stable protein. Table 3 shows the average RMSD for the 6 ligands and GTP. Figure 8 shows that the 6 ligands and GTP stabilize at around 7 ns and 4 ns, respectively. These values were obtained using the cpptraj tool [43], aligning to the first frame of the production part of the simulation. The results suggest a stable configuration of the protein with the ligands, as shown in Figure 8, where GTP shows less variation than the other ligands.
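For readers unfamiliar with the quantity, the RMSD of a frame relative to a reference can be sketched as below. This is a simplified illustration and not a substitute for cpptraj: it assumes that every frame has already been superposed on the reference (the fitting step is omitted), and the coordinates are toy values.

```python
import math
from statistics import mean, pstdev


def rmsd(coords, reference):
    """Root mean square deviation between two equally sized coordinate sets.

    Both arguments are sequences of (x, y, z) tuples. The frame is assumed
    to be already aligned to the reference (an assumption of this sketch).
    """
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(coords, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


# Toy trajectory: two atoms, two frames; the first frame is the reference.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
series = [rmsd(f, reference) for f in (reference, frame)]
average, spread = mean(series), pstdev(series)
```

The mean and standard deviation of such a series are the kind of summary values reported for each system.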
2.1.2. Hydrogen Bonds (H-Bonds)
Hydrogen bonds represent an important interaction in protein–ligand complexes and are regarded as the main reason for protein–ligand selectivity. This is why hydrogen bond analysis is often used to describe the affinity between protein and ligand [
44].
With the cpptraj tool, the fraction of simulation time during which each hydrogen bond was present was analyzed. From this information, the protein residues forming H-bonds with the ligands were identified. The H-bond analysis for each ligand (including GTP) was done for every frame of the production run, using a threshold of 3 Å around the ligand.
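The notion of the fraction of frames in which a hydrogen bond is present can be illustrated with a purely distance-based criterion. The sketch below is an assumption-laden simplification: it uses only the 3 Å cut-off mentioned above, ignores any angle criterion that an analysis tool may also apply, and works on toy coordinates.

```python
import math


def hbond_occupancy(donor_xyz, acceptor_xyz, cutoff=3.0):
    """Fraction of frames in which donor and acceptor lie within the cutoff (in Å).

    donor_xyz and acceptor_xyz hold one (x, y, z) tuple per frame.
    Only the distance is checked in this sketch.
    """
    if len(donor_xyz) != len(acceptor_xyz) or not donor_xyz:
        raise ValueError("need the same, non-zero number of frames for both atoms")
    present = sum(
        1 for d, a in zip(donor_xyz, acceptor_xyz) if math.dist(d, a) <= cutoff
    )
    return present / len(donor_xyz)


# Toy data: four frames with donor-acceptor distances of 2.8, 2.9, 3.5, and 3.0 Å.
donor = [(0.0, 0.0, 0.0)] * 4
acceptor = [(2.8, 0.0, 0.0), (2.9, 0.0, 0.0), (3.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
occupancy = hbond_occupancy(donor, acceptor)
```

An occupancy above 0.8 would correspond to a bond present in more than 80% of the production frames.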
Among all ligands evaluated, LYS441 was the most important residue in H-bond formation. In some cases, it is present in more than 80% of the production run (Ligand2). Residues PHE331, PHE330, and LYS331 were also relevant. For GTP, the residue with the most interactions was ARG445, which was also seen in the MM/PBSA decomposition results.
A second analysis was performed to observe the behavior of the best ligand obtained in the MM/PBSA calculation, Ligand4. For this, 100 ns of MD were run under the same conditions as the previous 10 ns. The starting point for this new simulation was the end of the previous one, giving a total of 110 ns of simulation. In this second analysis, the PleD protein without any ligand (to serve as a reference), the PleD–GTP complex, and the PleD–Ligand4 complex were simulated. MM/PBSA calculations were carried out using different time windows (20, 30, 40, 50, and 60 ns) of the simulation. In each window, the same number of snapshots (200) was taken.
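Taking the same number of snapshots from windows of different length amounts to changing the stride between sampled frames. A minimal sketch follows, under assumptions that are not stated in the text: each window is taken to start at the beginning of the trajectory, and frames are taken to be saved every 10 ps (a hypothetical interval).

```python
def snapshot_indices(window_ns, n_snapshots=200, frame_interval_ps=10.0, start_ns=0.0):
    """Return n_snapshots evenly spaced frame indices inside a time window.

    frame_interval_ps (time between saved frames) and start_ns are
    hypothetical parameters introduced for this illustration.
    """
    first = int(round(start_ns * 1000.0 / frame_interval_ps))
    n_frames = int(round(window_ns * 1000.0 / frame_interval_ps))
    if n_snapshots > n_frames:
        raise ValueError("window contains fewer frames than requested snapshots")
    step = n_frames / n_snapshots
    return [first + int(i * step) for i in range(n_snapshots)]


# The five window lengths mentioned in the text, each sampled with 200 snapshots.
windows = {w: snapshot_indices(w) for w in (20, 30, 40, 50, 60)}
```

Under these assumptions the 20 ns window is sampled every 10th frame and the 60 ns window every 30th frame, so each window contributes equally to the averaged result.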
2.2. Trans-Aconitic Acid (Ligand4)
2.2.1. Root Mean Square Deviation
RMSD showed high stability in the three cases (PleD, GTP, and Ligand4). In
Figure 9a, the average values with their standard deviations were as follows: PleD–protein = 2.75 ± 0.31; PleD–GTP complex = 2.57 ± 0.28; and PleD–Ligand4 complex = 3.20 ± 0.45. The PleD–protein had a higher value than PleD–GTP, indicating that GTP helped the protein reach a more stable configuration. The PleD–Ligand4 complex shows more variation, although it remains stable during the dynamics, as seen in the low standard deviation.
2.2.2. Root Mean Square Fluctuation
For RMSF, the same approach as for MM/PBSA was followed. The graph considers residues 300–457, where the interactions are established (Figure 9b). The same fluctuation was observed in the protein structure for the three cases, specifically in PleD–GTP. A major variation is present in residues 450–455, and the average variation and its standard deviation were: 1.65 ± 0.63 for PleD–protein; 1.53 ± 0.87 for PleD–GTP complex; and 1.53 ± 0.62 for PleD–Ligand4 complex. The RMSF analysis agrees with the RMSD analysis, in that PleD alone presents lower stability than PleD bound to GTP.
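Whereas RMSD summarizes a whole frame, RMSF is computed per atom (or residue) over all frames. The following sketch shows the calculation on toy data; it assumes the frames are already superposed and is meant only to make the definition concrete.

```python
import math


def rmsf(frames):
    """Per-atom root mean square fluctuation about the mean position.

    frames is a list of frames, each a list of (x, y, z) tuples in the
    same atom order; the frames are assumed to be already aligned.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over the trajectory.
        mean_pos = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from its mean position.
        msd = sum(
            sum((f[i][k] - mean_pos[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: atom 0 does not move, atom 1 oscillates along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
fluctuations = rmsf(toy_frames)
```

Averaging such per-residue values over a residue range gives a single summary number for each system.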
2.2.3. Radius of Gyration
The radius of gyration (Rg) can give an insight about the compactness of a protein structure [
45].
Figure 9c suggests structural stability in the 3 systems considered; however, the PleD–Ligand4 complex had a lower Rg than the other two structures. The average values and their standard deviations for the three systems were: 25.48 ± 0.27 for PleD–protein; 25.36 ± 0.26 for PleD–GTP; and 24.1 ± 0.17 for PleD–Ligand4.
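The radius of gyration used here as a compactness measure is the mass-weighted root mean square distance of the atoms from the center of mass. A minimal sketch on toy data (two atoms with hypothetical unit masses) is shown below.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of a set of atoms.

    coords is a list of (x, y, z) tuples and masses the matching atomic masses.
    """
    if len(coords) != len(masses) or not coords:
        raise ValueError("need one mass per atom and at least one atom")
    total_mass = sum(masses)
    # Center of mass of the structure.
    com = [
        sum(m * p[k] for m, p in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    weighted = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Toy structure: two equal masses placed symmetrically about the origin.
toy_coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_masses = [1.0, 1.0]
rg = radius_of_gyration(toy_coords, toy_masses)
```

A smaller value for the same protein indicates a more compact structure.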
2.2.4. Hydrogen Bond (H-Bond)
The H-bond analysis results for 110 ns production show some differences compared to the results for 10 ns (
Table 4 and
Table 5). GTP shows the same tendency as in the previous result, with residue ARG445 being the most persistent during the simulation. The order is also maintained, with residues LYS331, LYS441, and PHE330 coming after ARG445. For Ligand4, the order differs: residue PHE330 is the most persistent, followed by LYS331 and LYS441.
The H-bond formation shows that, during the production, GTP made nine H-bonds while Ligand4 made only four, as shown in
Figure 9d,e.
The GTP molecule forms about twice as many H-bonds as Ligand4 because of the number of electronegative atoms available to form hydrogen bonds. GTP contains (Figure 4) phosphate, amine, and hydroxyl groups, which provide more opportunities to form this kind of interaction than the hydroxyl and carbonyl groups present in Ligand4.
2.2.5. Molecular Mechanics Poisson–Boltzmann Surface Area (MM/PBSA)
The MM/PBSA values presented are averages over the windows established in this step. The values (with their standard deviations) are −169.93 ± 4.03 for PleD–GTP and −98.79 ± 0.43 for PleD–Ligand4.
Figure 10 shows the average values of residues from the different windows (20, 30, 40, 50, and 60 ns) compared to the previous results (Figure 5) for GTP. The two are similar, indicating a tendency for residues such as ASP326, ASP328, and GLU369 to have non-favorable interactions, while ARG445, LYS441, LYS331, PHE330, and PHE329 make the major favorable contributions to the interaction with GTP, which suggests the importance of these residues in the binding pocket. For Ligand4, the residues showed a tendency similar to that in the 10 ns simulation: ASP326, ILE327, ASP328, GLU369, and GLU370 showed non-favorable interactions, whereas PHE329, LYS331, and LYS441 showed favorable interactions. In both cases, the magnesium atoms play an important role in the binding free energy.
The low standard deviation of the results across the different windows is interpreted as high stability of the complexes. When comparing the two MM/PBSA results (first and second analysis), the variations in energy and the emergence or disappearance of residues, such as GLU370 or ASP343 for GTP and LYS332 for Ligand4, result from the better conformational sampling obtained with a longer simulation time (110 ns). The MM/PBSA result from the short dynamics gives an idea of the overall interaction, including the favorable and non-favorable interactions.
For GTP, the major contribution to the MM/PBSA binding free energy is the H-bond formed with ARG445, so this residue is suggested as a key residue in the ligand–protein interaction. The absence of this interaction in the rest of the ligands is consistent with their weaker binding free energy in comparison with GTP. Both MM/PBSA results, the one obtained from 10 ns and the average over the five windows, showed that Ligand4 has good stability, suggesting this molecule as a possible template for the design of new ligands that can surpass the GTP binding free energy and effectively inhibit the PleD protein.
Although the biosynthesis of trans-aconitic acid has been determined principally in soybean, maize, and wheat plants and in some bacteria such as Bacillus subtilis and Bacillus thuringiensis, enzymatic aconitate isomerase (AI; EC 5.3.3.7) activity has so far been reported in the soil bacterium Pseudomonas sp., where it has been associated with the assimilation of trans-aconitic acid into the bacterial tricarboxylic acid cycle as a carbon source [46]. It has not been reported whether other bacteria possess aconitate isomerase and could also use TAA as a carbon source. Our goal in this research, however, was to identify natural products as a potential starting point for hit-to-lead methodologies to obtain new DGC inhibitors, using the PleD protein as a model.
In this research, the binding free energy was calculated by MM/PBSA method and used to define a possible compound capable of inhibiting the PleD protein and, hence, blocking the formation of biofilms. TAA performed best among the 224,205 ligands studied. This compound is found in plants like
Zea mays,
Triticum aestivum,
Avena sativa,
Brachiaria plantaginea, and
Saccharum officinarum [
47]. Methods of extraction are described by Schnitzler et al. [
48] and Kanitkar et al. [
49]. In our results, TAA does not surpass the GTP binding free energy; however, this does not imply that it cannot be used. Our results suggest a good interaction with the PleD protein. Further optimization can be made to improve the results, using strategies such as Fragment-Based Drug Discovery (FBDD) [
50]. In addition, Fragment Molecular Orbital (FMO) [
51,
52] can be used to optimize the hit to make it more effective.