*Article* **Identification of Potential Allosteric Site Binders of Indoleamine 2,3-Dioxygenase 1 from Plants: A Virtual and Molecular Dynamics Investigation**

**Vitor Martins de Almeida and Osvaldo Andrade Santos-Filho \***

Laboratório de Modelagem Molecular e Biologia Estrutural Computacional, Instituto de Pesquisas de Produtos Naturais Walter Mors, Centro de Ciências da Saúde, Universidade Federal do Rio de Janeiro, Av. Carlos Chagas Filho, 373, Bloco H, Cidade Universitária, Rio de Janeiro 21941-599, RJ, Brazil **\*** Correspondence: osvaldo@ippn.ufrj.br

**Abstract:** Ligand and structure-based computational screenings were carried out to identify flavonoids with potential anticancer activity. Kushenol E, a flavonoid with proven anticancer activity and, at the same time, an allosteric site binder of the enzyme indoleamine 2,3-dioxygenase-1 (IDO1), was used as the reference compound. Molecular docking and molecular dynamics simulations were performed for the screened flavonoids with known anticancer activity. The following two of these flavonoids were identified as potential inhibitors of IDO1: dichamanetin and isochamanetin. Molecular dynamics simulations were used to assess the conformational profile of IDO1-flavonoids complexes, as well as for calculating the bind-free energies.

**Keywords:** cancer; immunology; flavonoids; IDO1; virtual screening; molecular docking; molecular dynamics; free energy

**Citation:** de Almeida, V.M.; Santos-Filho, O.A. Identification of Potential Allosteric Site Binders of Indoleamine 2,3-Dioxygenase 1 from Plants: A Virtual and Molecular Dynamics Investigation. *Pharmaceuticals* **2022**, *15*, 1099. https://doi.org/10.3390/ ph15091099

Academic Editor: Mary J. Meegan

Received: 13 July 2022 Accepted: 31 August 2022 Published: 2 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

#### **1. Introduction**

#### *1.1. Tryptophan Metabolism and the Kynurenine Pathway*

Studies show that tumor tissues within a chronic inflammatory context (presence of pro-inflammatory cytokines for a long time) catabolize in large proportions the amino acid L-tryptophan in the cell matrix and synthesize metabolites from one of the main degradation pathways of tryptophan, the pathway of kynurenine [1,2].

Almost all metabolites of the kynurenine pathway affect immune activity through various mechanisms. Figure 1 illustrates the effects of kynurenine pathway metabolites on the immune system. The determining step of the kynurenine pathway is the formation of N-formylkynurenine, regulated by the family of dioxygenase enzymes known as tryptophan 2,3-dioxygenase (TDO) and indolamine 2,3-dioxygenase (IDO), which may exist in the following two isoforms: IDO1 and IDO2 [3].

TDO, IDO1, and IDO2 are heme-bearing enzymes and catalyze the rate-limiting step of tryptophan catabolization. TDO is mainly expressed in the liver and oxidizes excess tryptophan in the body, generating ATP and especially NAD+. The expression of TDO is stimulated by the concentration of blood tryptophan as well as by the synthesis of the heme cofactor [4].

IDO1 can be expressed by many different cells, including antigen-presenting cells (APCs), such as monocyte-derived macrophages, dendritic cells (DCs), and fibroblasts. Its expression is mainly induced by inflammatory cytokines, such as IFN-γ, TNF-α, IL-1, and IL-2, secreted by Th1-type lymphocytes, as well as TGF-β, IL-10, and adenosine secreted by regulatory T lymphocytes [5]. IDO1 expression is further stimulated by its own product, kynurenine, via the aryl hydrocarbon receptor (AhR) [6,7].

**Figure 1.** Catabolization of tryptophan via the kynurenine pathway and its interactions with the immune system. Orange boxes indicate the effects of immune mediators on the kynurenine pathway and yellow boxes indicate the effects of tryptophan metabolites on the immune system. Adapted from Lanser et al. [4].

IDO2 is significantly less active than IDO1 [8]. Similar to IDO1, its expression is stimulated by the activation of AhR [9]. Although IDO2 is expressed by cancer cells, when compared to IDO1, it contributes little to the accumulation of kynurenine pathway metabolites [10,11].

In healthy patients, IDO1 expression is restricted to endothelial cells in the placenta, lung, mature dendritic cells in secondary lymphoid organs, and in epithelial cells scattered in the female genital tract [12]. Whereas, under inflammatory conditions, IDO1 expression is strongly induced by INF-γ [13]. Furthermore, it is highly expressed at inflammatory sites, where it may contribute to negative feedback against local immune response activity [12]. Inflamed tumors may also express IDO1 as an adaptive resistance mechanism, in which it is induced by IFN-γ produced by dendritic cells and tumor-infiltrated macrophages [14]. γ γ

Given the physiological importance of the IDO1 enzyme within the tumor microenvironment, the need for structural studies and the development of increasingly efficient inhibitors has become an attractive approach to treating cancer patients.

#### *1.2. IDO1: Structure*

The human IDO1 is a monomeric α-helical dioxygenase enzyme containing an N-terminal minor domain (residues 1–154) and a C-terminal major domain (residues 155–403), where the catalytic site and a cofactor heme are located (Figure 2). α

**Figure 2.** Structure of indoleamine deoxygenase 1 (IDO1) with highlighted regions: minor domain (green) and major domain (grey, magenta, red, blue, and cyan). Contained in the larger domain are: the DE fragment, composed of the DE hairpin and the DE loop; JK loop; EF loop; the heme cofactor (orange); the allosteric site.

α β The minor domain or N-terminal domain, shown in green in Figure 2, contains residues Gly11 to Asp158, consisting of six α-helices (involving amino acids Tyr36 to His45, His45 to Ser52, Gln54 to Lys61, Asp72 to Gly93, Pro104 to Glu119 and Val125 to Val130 or helix A, contributing to the active site), four 310-helices (Ser12 to His16, Pro33 to Phe35, Ser66 to Leu70 and Thr144 to Glu146) and a small antiparallel β-sheet formed by residues Lys135 to Lys136 and Met148 to Asp149. Such a domain makes up part of the upper region of the catalytic site and acts as structural support for the enzyme [15].

α The major domain or C-terminal domain, shown in grey in Figure 2, contains residues Cys159 to Gly403, being constituted by 9 α-helices [15]. This domain performs functions related to conversion, positioning, and displacement of substrates/products in the enzyme and is where the catalytic site is located. There are also regions with specific functionalities that are important for the enzymatic reaction to occur. Among them, there is the "access" tunnel for the entrance of small molecules from the external environment (ligand delivery

tunnel) and the JK loop (red), with the function of conducting and maintaining the substrate (tryptophan) inside the catalytic site [16].

Finally, there is the DE fragment, consisting of the DE-hairpin (Ser235 to Tyr249) and the DE loop (Glu250 to Gly262) (magenta), the EF loop (Gly278 to His287) (blue), and the JK loop (Gln360 to Gly380) [17].

IDO1 has the following two non-competitive natural substrates: tryptophan and molecular oxygen (O2). Moreover, the enzyme has the following four states:


Despite not being the active form of the enzyme, apo-IDO1 is a promising macromolecular target since cellular studies have shown that this form is the most abundant in the cell medium, about 85% [18]. Therefore, it is likely that there is a binding site where potential inhibitors of the apo form can interact with this form of the enzyme. The main objective of this project was to apply molecular modeling methods in order to investigate this possibility.

#### *1.3. IDO1: Inhibitors*

Over the past decade, the scientific academy has been making efforts for the development of IDO1 inhibitors [2]. Röhrig and collaborators classified existing test-phase inhibitors into the following four categories: type I, II, III, and IV inhibitors [19].

Type I inhibitors are classified as competitive inhibitors of tryptophan. In general, type I inhibitors target the molecular oxygen-bound holo-IDO1 form and do not form a direct bond with heme iron [17]. The 1-LMT and 1-DMT are classified within this category of inhibitors [20].

Type II inhibitors are classified as competitive inhibitors of molecular oxygen by binding to the iron atom in its ferrous state of the heme group, acting in the ferrous holo-IDO1 form. Inhibitors such as β-carboline [21], and Epacadostat [22] fall into this category. β

Type III inhibitors are noncompetitive inhibitors of tryptophan. New inhibitors in this class are being studied, such as 4PI, Navoximod, and MMG-0358 [19].

Type IV inhibitors are the most recently described class of inhibitors in the literature, including inhibitors such as the clinical compound BMS-986205, which targets the apo-IDO1 form [23].

The structures of the mentioned inhibitors are shown in Figure 3.

**Figure 3.** IDO1 inhibitors.

Kwon and collaborators [24] determined the amino acid residues that define an allosteric site for the action of apo-IDO1 inhibitors (Ser12, Tyr15, Ile17, Ile178, Lys179, Ile181, Pro182, Phe185, Lys186, Phe306, Ser309, Leu310, Ser312, Asn313, and Pro314), and found that the flavonoid obtained from plants of the species *Sophora flavescens*, known as Kushenol E, acts as an inhibitor of this site (Figure 4). The authors described the amino acids Pro182 and Phe185, located in the larger domain of the enzyme, as the main components of the allosteric site for inhibitory activity. Site-directed mutagenesis experiments were performed on these residues, and in vitro enzyme inhibition assays were performed. The results proved the contribution of these amino acids to enzymatic function and indicate this allosteric site as a target for new inhibitors [24].

**Figure 4.** Allosteric site described by Kwon and colleagues [24]. The docked flavonoid Kushenol E, as well as Pro182, and Phe185 are shown in detail.

Most IDO1 inhibitors are obtained from a natural source; however, as shown by Kwon, flavonoids may inhibit IDO1 [24]. The diverse variety of natural products can make the screening of these compounds a time-consuming and expensive task. Due to the development of more powerful computers, virtual screening of chemical compounds and pharmacodynamic validation of ligand-receptor interactions can accelerate the process of discovering potential new drugs. In this context, we focused our work on the identification of potential inhibitors of the naturally occurring apo-IDO1 allosteric site.

#### **2. Results and Discussion**

#### *2.1. Ligand-Based Virtual Screening*

In order to identify potential IDO1 inhibitors capable of interacting with its allosteric site, the following two virtual screening tools were used: the Mcule platform [25] and the ZINC 15 database [26]. At this stage of the work, the flavonoid Kushenol E (Figure 5) was used as the reference compound, due to its proven effectiveness as a non-competitive inhibitor that binds to the allosteric site of the enzyme [24]. Tanimoto's coefficient was used as the sampling criterium [27], and 172 compounds were sampled.

**Figure 5.** Kushenol E.

#### *2.2. Structure-Based Virtual Screening and Molecular Docking*

Besides the ligand-based virtual screening mentioned above, a structure-based virtual screening was also carried out (Figure 6). In this approach, the previously sampled compounds, including Kushenol E, were docked into the allosteric site of the IDO1.

**Figure 6.** Workflow of ligand-based and structure-based virtual approaches.

− − − − The calculated docking energies of each compound are shown in Supplementary Material (Table S1). The range of calculated energies varied from −8.1 kcal/mol to −5.3 kcal/mol. The docking energy for Kushenol E is equal to −6.6 kcal/mol. We realized that most of the sampled compounds were not from natural sources. Since the focus of this research was the identification of substances obtained from natural products with the potential ability to inhibit the biological response of the enzyme IDO1 through interaction at the allosteric site, a new virtual screening was carried out. The strategy was to use each compound that presented interaction energies equal to or lower than −7.1 kcal/mol (19 compounds; represented in red in Supplementary Materials, Table S1) as reference compounds in new ligand-based virtual screenings. The ZINC 15 database was the chosen tool. In this context, 28 natural products were selected, which were later docked to the allosteric site of the IDO1 enzyme. The result of this second-round structure-based virtual screening is shown in Table 1. While, in this investigation, Kushenol E (shown in bold red) works as the positive control, it is also important to consider a compound that would work as a negative control. In this context, as in the paper by Kwon [24], steppogenin was chosen as such a compound. As shown in the table, among all compounds, steppogenin shows the least favorable docking energy (shown in light red).

In this second molecular docking calculation, two additional criteria were used for the final screening as follows: (a) compounds that presented docking energies equal to or lower than −7.1 kcal/mol; (b) compounds that showed some anticancer activity (as described in the literature). Consequently, the following three flavonoids were selected: dichamanetin, isochamanetin, and chamaejasmin B, shown in bold black.

−


**Table 1.** Structure-based (second) virtual screening.

Dichamanetin is a C-benzylated flavanone that can be obtained from several Southeast Asian plant species, such as the following: *Uvaria alba*, *Uvaria chamae*, *Piper sarmentosum*, and *Xylopia pierrei* [28–30]. Isochamanetin, another C-benzylated flavanone, was obtained from several plant species such as *Sphaeranthus amaranthoides*, *Xylopia pierrei*, and *Ulvaria chamae* [28,31,32]. Chamaejasmin B is a dimeric biflavonoid formed by two isosacuranetin subunits linked by C3-C3 carbons, and obtained from Southwest Asian plant species, such as *Stellera chamaejasme L*. [33].

Table 2 and Figure 7 show the intermolecular interactions formed with specific residues of the allosteric site of IDO1.

As shown in Table 2 and Figure 7, the flavonoids interact with amino acid residues from the allosteric site, as defined by Kwon [24] (Ser12, Tyr15, Ile17, Ile178, Lys179, Ile181, Pro182, Phe185, Lys186, Phe306, Ser309, Leu310, Ser312, Asn313, Pro314). Since flavonoids are compounds that have several aromatic rings, the main interactions formed with the enzyme are of the pi and van der Waals types. The screened flavonoids (Table 2) have a significant number of oxygens; consequently, they can work as hydrogen bond donors and acceptors. Furthermore, according to Kwon [24], residues Pro182 and Phe185 are essential for the structural and functional integrity of the IDO1 enzyme system.

Dichamanetin and isochamanetin A rings interact via pi-sigma interaction with Pro182. According to Kwon [24], this residue would be important for the structural and functional integrity of the IDO1 system. In chamaejasmin B, which is a biflavonoid, one of the A rings interacts with Pro182 (pi-alkyl) and Phe185 (pi-pi) with the assistance of van der Waals interactions with neighboring residues; the second ring A interacts with Phe185 (hydrogen bond) and with Pro314 (pi-alkyl), in addition to van der Waals interactions.

Focusing on substituents attached to the A ring of flavonoids, some structural features must be considered. The hydroxyl group attached to carbon 7 of the A rings of isochamanetin, dichamanetin, and Kushenol E forms strong hydrogen bond interactions with Ser12. Phenyl groups linked to the A ring of dichamanetin and isochamanetin probably help in the stability of the interaction of these compounds with the allosteric site. As shown in Table 2, dichamanetin interacts more strongly with IDO1 than with isochamanetin (−7.8 kcal/mol and −7.5 kcal/mol, respectively), with the first compound forming two pi-alkyl interactions with Pro182. It is also noted that the participation of a weak hydrogen bond interaction with Gly11. Isochamanetin, on the other hand, forms only one interaction. As will be shown in the next section, referring to molecular dynamics simulations, both Ser12 and Pro182 residues contribute significantly to the energetic stabilization of the formed complex.



**Figure 7.** Intermolecular interactions formed with specific residues from the allosteric site of IDO1: chamaejasmin B, dichamanetin, isochamanetin, Kushenol E, and steppogenin, respectively.

Rings B of both dichamanetin and isochamanetin interact via pi-pi stacking with Phe185. Ring B of chamaejasmin B interacts with Pro182 and Lys186 (pi-alkyl interaction). The methoxyl group attached to the C4' carbon of the ring B of chamaejasmin B forms an alkyl interaction with Pro182. Of all the flavonoids, the only one that interacts with IDO1 from the C ring is chamaejasmin B. This occurs through a hydrogen bond with a carbonyl group on carbon 4 of the C ring. It is worthy of note the fact that steppogenin is the only compound that shows unfavorable bump and acceptor-acceptor interactions. α δ

Pro182 interacts through pi interactions or with the A ring of flavanones or with phenolic groups attached to the same ring. It is known that the proline ring faces are partially positively charged [34]. As shown in Figure 8, hydrogens adjacent to the carbonyl and nitrogen of the amide (Hα and Hδ, respectively) are the most partially positive. The proline side chain is also conformationally restricted, allowing interaction with aromatic residues with minimal entropic or steric penalty [34]. α δ

**Figure 8.** Hydrogens in proline.

Phe185, an aromatic residue (Figure 9), interacts with flavonoids through pi-pi and pi-pi/T stacking, and van der Waals interactions [35], which are very important for the stabilization of the structural energy of protein complexes with flavonoids.

**Figure 9.** Hydrogens in phenilalanine.

Comparing our docking modeling of Kushenol E (Figure 7) with the docking modeling proposed by Kwon [24] (Figure 10), we can see that our simulation shows this compound in a different orientation in the cavity of the allosteric site, with a deviation of the longitudinal axis towards the B direction for the prenyl group attached to the C6 carbon of the A ring. According to Kwon, the B ring of the flavonoid is positioned inside the cavity formed by the amino acids Leu178, Ile181, and Phe306 (Figure 10). Differently, we found the same ring located next to residues Leu178, Ile181, and Phe306. Furthermore, according to Kwon, A and C rings interact with Ser309 and Asn313. Instead, in our model, this interaction takes place with the Pro182, and there is assistance from a few van der Waals interactions. The Kwon model shows the prenyl group attached to carbon 6 of the ring interacting with Pro314, whereas our model shows interaction with Pro182. Since the structure of the IDO1-Kushenol E complex has not been experimentally determined to date, a definitive answer about the correct orientation of Kushenol E will only be inferred after the structural elucidation study by X-ray crystallography, nuclear magnetic resonance, or cryo-electron microscopy.

**Figure 10.** Kwon's docking model of Kushenol E—IDO1 complex [24].

#### *2.3. Molecular Dynamics Simulations and Free Energy Calculation*

To further refine and investigate the ability of molecular docking simulation to predict the conformational profile of the IDO1-flavonoid complexes, and to calculate the respective binding free energies, molecular dynamics simulations were carried out.

Three-dimensional structures of the IDO1-flavonoid complexes at 20 ns, 40 ns, 60 ns, 80 ns, and 100 ns are shown in Figure 11. With the exception of chamaejasmin B, which detaches from the allosteric site cavity after 40 ns, all other flavonoids remained docked. Interestingly, after detaching from the allosteric site, chamaejasmin B stabilizes next to the JK loop. Furthermore, only one of the isosacuranetin subunits interacts with the allosteric site, while the other subunit is exposed to the solvent. Moreover, Figure 11 also shows that after 80 ns, steppogenin is almost completely detached from the allosteric site.

It is worthy of note that although the interactions of chamaejasmin B were not strong enough to keep it docked in place during 100 ns, this flavonoid showed the "best" docking energy. It is possible that its isosacuranetin subunit (Figure 12) be a good candidate for future studies of docking and molecular dynamics.

**Figure 11.** Conformational profile of the complexes formed by the interaction of IDO1 with chamaejasmin B, dichamanetin, isochamanetin, Kushenol E, and steppogenin, respectively, up to 100 ns molecular dynamic simulations.

**Figure 12.** Isosacuranetin.

In order to verify how flavonoids could alter the conformational profile (packing/ folding) of IDO1, radii of gyration (Rg) plots were calculated not only for IDO1 in its apo form but also for each IDO1-flavonoid complex. The corresponding plots are shown

in Supplementary Material (Figure S1). In molecular dynamics, the radius of gyration is a measure of the stability of a system consisting of several particles and describes the variation in density as a function of the center of mass over time. According to Figure S1, the IDO1-steppogenin complex is the one with the highest degree of energetic-structural destabilization when compared to the other complexes. The IDO1-dichamanetin complex was the most stable. These stability data are supported by the values of relative molecular docking energies, shown in Table 1, and by the values of free energy of interaction between the flavonoids and IDO1, shown in Table 3.


**Table 3.** Free energy of interaction of each flavonoid with IDO1.

Based on Figure S2, one can assess how each flavonoid changes the conformation of IDO1 (its apo conformation). In this context, Kushenol E and isochamanetin distort the IDO1 structure more significantly. Whereas dichamanetin is the flavonoid that less intensely distorts IDO1.

In addition to the radius of gyration and RMSD plots, root means square fluctuation (RMSF) for Cα of each enzyme residue was calculated (Figure S3). This measure is described by means of a two-dimensional graph where the abscissa shows each Cα associated with a given residue and the ordinate shows the average value of the fluctuations. It is verified that helices B (residues Cys159 to Met190) and G (residues Pro300 to Asn313), important for the formation of the allosteric site, do not undergo significant fluctuations due to the interaction of flavonoids. The EF loop (residues Gly278 to His287), significatively fluctuates due to the interaction of Kushenol E, steppogenin, and isochamanetin. Such an effect is less significant when dichamanetin is the interacting flavonoid. As mentioned before, residues between 251 and 300 and the EF loop are part of the molecular oxygen access tunnel structure. Figure S3 indicates that, of the three flavonoids, only dichamanetin does not "stiffen" the tunnel. The JK loop did not fluctuate significantly under the action of flavonoids.

RMSF plots were also constructed for each docked flavonoid into the allosteric site of IDO1 along with the molecular dynamic simulations (Figure S4). As can be seen, dichamanetin and isochamanetin showed lower atomic fluctuations than Kushenol E and steppogenin do. These results reinforce, once again, the relative stability of the interactions of each flavonoid with the allosteric site.

The complexes were also analyzed for the frequency of hydrogen bonds formed during the molecular dynamics simulation (Figure S5). Dichamanetin has a higher frequency of hydrogen bonds than isochamanetin and Kushenol E. These data demonstrate that hydrogen bonds are important for the stabilization of the IDO1-flavonoid complexes, mainly for dichamanetin and isochamanetin. However, steppogenin shows an "anomalous" hydrogen bond frequency. Perhaps that should be due to the highest degree of freedom ("pose fluctuations"), since, as shown in Figure 11, that compound seems to be slightly bound to the allosteric site.

The MM-PBSA method evaluates conformational fluctuations as a function of the simulation trajectory and, consequently, it is a useful approach for inferring the entropic profile of the formed complexes, as well as for calculating the free energy of binding of each flavonoid with the IDO1. The results are shown, respectively, in Tables 3 and 4. According to Table 3, the affinity of each flavonoid with IDO1 increases in the following order: steppogenin (−5.73 kcal/mol); Kushenol E (−21.65 kcal/mol); isochamanetin (−22.09 kcal/mol); dichamanetin (−25.93 kcal/mol). It is worth noting the calculated affin-

ity value for steppogenin (the negative control), testifying once again, and in accordance with the experimental data [24], that steppogenin forms the least stable complex with IDO1. The free energy contributions of each constituent residue of the allosteric site are shown in Table 4.


**Table 4.** Distributed binding energies per amino acid for each modeled IDO1-flavonoid complex.

#### **3. Materials and Methods**

#### *3.1. IDO1*

The crystallographic structure of IDO1 in its holo form, with a resolution of 2.44 Å (PDB ID: 7A62) [15], was obtained from the Protein Data Bank [36]. The enzyme preparation steps were carried out as follows: (1) non-essential water molecules were removed; (2) polar hydrogens were added to the enzyme; (3) partial charges were calculated using both Kollman and Gasteiger's approaches [37,38].

#### *3.2. Virtual Screening*

For the virtual screening of compounds, Mcule platform [25], ZINC 15 database [26], and AutoDock Vina 1.1.2 program [39] were used.

#### *3.3. Molecular Docking*

Molecular docking simulations were performed with the AutoDock Vina 1.1.2 program [39]. This tool uses a hybrid scoring function, i.e., it is a combination of empirical and knowledge-based functions [39]. The binding energy is predicted as the sum of distancedependent atom pair interactions as follows:

$$E = \sum\_{\text{pair}} e\_{\text{pair}}\ (d)$$

Here *d* is the surface distance calculated as follows, where *r* is the interatomic distance and *R<sup>i</sup>* and *R<sup>j</sup>* are the radii of the atoms in the pairs:

$$d = r - \mathcal{R}\_i - \mathcal{R}\_j$$

Every atom pair interacts through a steric interaction given by the first three terms of the equation shown below. Moreover, depending on the atom type, there could be hydrophobic and non-directional H-bonding interactions, given by the last two terms.

$$e\_{pair}(d) = \begin{cases} +\ w\_1 \ast \, Gauss\_1(d) \\ +\ w\_2 \ast \, Gauss\_2(d) \\ +\ w\_3 \ast \, Repulsion(d) \\ +w\_4 \ast \, Hyproboic(d) \\ +\ w\_5 \ast \, HBond(d) \end{cases}$$

The combination of an attractive Gaussian function with a repulsive parabolic function reproduces the general shape of a typical Lennard-Jones interaction, provided the Gaussian term is negative and the parabolic positive.

#### *3.4. Molecular Dynamics Simulations and Free Energy Calculation*

All molecular dynamics simulations were performed with the CHARMM36 force field [40] implemented in the GROMACS 2018.1 program [41]. The parameterization of flavonoids was performed using the CHARMM General Force Field (CGenFF) program [42]. Subsequently, a dodecahedral simulation box, including solvent inside (TIP3P water model) was created. Sodium and chlorine ions were added to the system, aiming to electronically neutralize it. Periodic boundary conditions were used.

Each complex was submitted to energy minimization using the steepest descent algorithm, involving 50,000 steps and the convergence criterion of less than 2.39 kcal/mol. Subsequently, two 100 ps molecular dynamics simulations were performed, aiming to balance the IDO1-flavonoid complexes. In the first simulation, the NVT ensemble was used, and in the second one, the NPT ensemble was used. In both cases, the simulation temperature was kept constant at 300 K and, when performing the NPT ensemble, the pressure was kept constant at 1 bar.

After the equilibration of the complexes, 100 ns molecular dynamics simulations were performed, aiming to calculate the free energies of interaction of each flavonoid with IDO1 (production stage). The running conditions were the following: ensemble NPT, where temperature was maintained, using the V-rescale implementation of Berendsen's thermostat [43]. A molecular frame was sampled every 10 ps. To keep the pressure constant, the Parrinello-Rahman pressure coupling method was used [44]; for the longrange treatment, the PME method was used [45].

Free energy calculations were calculated with the g\_mmpbsa tool [46]. The binding energy can be individually decomposed as a function of the constituent residues of the macromolecular target. Initially, the EMM, Gpolar, and Gnonpolar energy components of the individual atoms of each residue are calculated in the bound, as well as the unbound form. Subsequently, the contribution to the interaction energy ∆*R BE <sup>x</sup>* of the residue *x* is calculated as follows:

$$\Delta R\_{\chi}^{BE} = \sum\_{i=0}^{n} \left( A\_i^{bound} - A\_i^{free} \right)^2$$

where, *A bound i* and *A f ree i* are the energy of atom *i* of residue *x* in bonded and unbonded states, respectively, and *n* is the total number of atoms in the residue. The energy contribution added to all residues is equal to the interaction energy, that is, ∆*Gbinding* = ∑ *m <sup>x</sup>*=<sup>0</sup> ∆*R BE x* , where *m* is the total number of residues that make up the ligand-protein complex [46].

#### *3.5. Visualization Tools and Plots*

Molecular Visualization Programs BIOVIA Discovery Studio 2021 [47], UCSF Chimera [48], and AutoDock Tools [49] were used. Grace plotting tool was used for graphical analysis [50].

#### **4. Conclusions**

IDO1, one of the enzymes that participate in the immune modulation process, is an interesting macromolecular target for anticancer action. From a known drug (Kushenol E), with proven anticancer activity, extensive virtual screening studies were carried out in digital databases. Both ligand- and structure-based approaches were performed. In this context, three natural products, isolated from plants, were identified as potential allosteric site binders of IDO1. Two of which, dichamanetin and isochamanetin, were shown to be promising IDO1 inhibitors and, in our opinion, should be considered in future studies of biological activity, molecular optimization, organic synthesis, and enzymatic assays. From the point of view of structural biology (conformational characteristics of IDO1), it was found that helices constituting the allosteric site did not undergo significant modifications; the same happened for the JK loop (located at the entrance of the catalytic site). In order to investigate, if the presence of the identified allosteric binders could weaken the binding of orthosteric binders (O2, tryptophan), as well as weaken the binding of the heme cofactor, further simulations, involving multiscale quantum mechanics/molecular mechanics (QM/MM) are being performed in the laboratory.

**Supplementary Materials:** The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/ph15091099/s1, Figure S1: Radius of gyration plots of apo IDO1 (black), IDO1-Kushenol E (red), IDO1-steppogenin (magenta), IDO1-dichamanetin (blue), and IDO1-isochamanetin (yellow) systems.; Figure S2: RMSD plots. Apo IDO1 (black), IDO1- Kushenol E (red), IDO1-steppogenin (magenta), IDO1-dichamanetin (blue), and IDO1-isochamanetin (yellow); Figure S3: RMSF plots. Apo IDO1 (black), IDO1- Kushenol E (red), IDO1-steppogenin (magenta), IDO1-dichamanetin (blue), and IDO1-isochamanetin (yellow); Figure S4: RMSF plots for the flavonoids: Kushenol E (red), steppogenin (magenta), dichamanetin (blue), and isochamanetin (yellow); Figure S5: Hydrogen bond frequency. IDO1 Kushenol E (red), IDO1-steppogenin (magenta), IDO1-dichamanetin (blue) and IDO1-isochamanetin (yellow); Table S1: Screened compounds using Mcule and ZINC 15 databases.

**Author Contributions:** Conceptualization, O.A.S.-F.; methodology, O.A.S.-F. and V.M.d.A.; software, O.A.S.-F.; validation, O.A.S.-F. and V.M.d.A.; formal analysis, O.A.S.-F. and V.M.d.A.; investigation, V.M.d.A.; resources, O.A.S.-F.; data curation, V.M.d.A.; writing—original draft preparation, O.A.S.-F.; writing—review and editing, O.A.S.-F.; visualization, V.M.d.A.; supervision, O.A.S.-F.; project administration, O.A.S.-F.; funding acquisition, O.A.S.-F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível superior—Brasil (CAPES)—Finance Code 001. Resources of the Laboratório de Modelagem Molecular e Biologia Estrutural Computacional (LMMBEC) were used in this research.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data is contained within the article and supplementary material.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Review* **In Silico Approaches: A Way to Unveil Novel Therapeutic Drugs for Cervical Cancer Management**

**Diana Gomes 1,2,3 , Samuel Silvestre 1,4,5,6 , Ana Paula Duarte 1,4,6 , Aldo Venuti 7 , Christiane P. Soares 8 , Luís Passarinha 1,2,3,4, \* and Ângela Sousa 1, \***


**Abstract:** Cervical cancer (CC) is the fourth most common pathology in women worldwide and presents a high impact in developing countries due to limited financial resources as well as difficulties in monitoring and access to health services. Human papillomavirus (HPV) is the leading cause of CC, and despite the approval of prophylactic vaccines, there is no effective treatment for patients with pre-existing infections or HPV-induced carcinomas. High-risk (HR) HPV E6 and E7 oncoproteins are considered biomarkers in CC progression. Since the E6 structure was resolved, it has been one of the most studied targets to develop novel and specific therapeutics to treat/manage CC. Therefore, several small molecules (plant-derived or synthetic compounds) have been reported as blockers/inhibitors of E6 oncoprotein action, and computational-aided methods have been of high relevance in their discovery and development. In silico approaches have become a powerful tool for reducing the time and cost of the drug development process. Thus, this review will depict small molecules that are already being explored as HR HPV E6 protein blockers and in silico approaches to the design of novel therapeutics for managing CC. Besides, future perspectives in CC therapy will be briefly discussed.

**Keywords:** cervical cancer management; computer-aided drug design; E6 inhibitors; in silico studies; human papillomavirus

### **1. Introduction**

Cancers are some of the deadliest pathologies, and according to Globocan, the global cancer burden in 2020 increased to 19.3 million cases and 10 million cancer deaths. With these new data, it is estimated that 1 in 5 people will develop cancer during their lifetime, and 1 in 8 men and 1 in 11 women will die from the disease [1]. Thereby, as logical consequence, novel pharmacotherapy approaches have increased in the recent years [2]. In particular, cervical cancer (CC) is considered the fourth cause of death among women worldwide with its establishment being associated with human papillomavirus (HPV) infection [3]. Considering the data available by Globocan regarding this pathology, there

**Citation:** Gomes, D.; Silvestre, S.; Duarte, A.P.; Venuti, A.; Soares, C.P.; Passarinha, L.; Sousa, Â. In Silico Approaches: A Way to Unveil Novel Therapeutic Drugs for Cervical Cancer Management. *Pharmaceuticals* **2021**, *14*, 741. https://doi.org/10.3390/ ph14080741

Academic Editor: Osvaldo Andrade Santos-Filho

Received: 1 July 2021 Accepted: 27 July 2021 Published: 29 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

were 341,831 deaths in 2020 with a higher incidence in low-income countries (LIC). In fact, in less developed regions such as Africa, Asia, and South America, CC is the primary cancer found in women, which can be due to the lack of screening programs, limited resources and access to health care, or even anti-vaccination movements [4,5]. In developed countries, the incidence of CC cases is lower due to better health services and the high availability of HPV prophylactic vaccines, which constitutes a great step in the prevention of HPV-associated cancers. However, the prophylactic vaccines have only been effective when administered in healthy patients, and they are not able to exert a therapeutic effect or treat an established infection [6].

The current treatments for CC are based on excisional or ablative procedures, surgical resection, radiotherapy, or chemotherapy, which do not specifically target the oncogenic properties of HPV, and therefore lesion recurrence can occur [7]. In addition, most of these procedures can affect normal tissues and can have potential side effects, including bleeding, which cause patient discomfort and can reduce life quality [5]. These constraints highlight the need to improve the current therapeutic approaches by combining strategies or proposing new compounds to offer more specific and less invasive treatments, without affecting healthy tissues. Hence, the scientific community has been focused on different ways to combat CC. A strategy with great potential consists of finding new anticancer agents by targeting the major oncoproteins responsible for HPV-driven carcinogenesis, E6 and E7. In fact, the discovery of the E6 protein X-ray crystal structure, available in protein data bank (PDB ID: 4GIZ and 4XR8, accessed on 20th May 2021), led to an increase in the use of in silico approaches to uncover potential E6 inhibitors [8].

Drug discovery and development is a very expensive and time-consuming process, which can take 10 to 15 years until a drug reaches the market. In the last decades, the pharmaceutical industry has been employing computer-aided drug design (CADD) techniques to accelerate drug development, intending to reduce time, costs, and failures, namely in the final stage [9,10]. This analysis is based on calculated properties and prediction models for drug therapeutic targets and identification of safety liabilities. Typically, CADD can be divided into three categories: structure-based, ligand-based, and hybrid methods [11,12]. The structure-based approaches, including docking and the application of molecular dynamics simulations, use the 3D structure of the target molecule to screen potential ligands. These methods evaluate ligand recognition by the target molecule and the prediction and characterization of binding sites as well as binding affinity [9,12]. For instance, molecular docking is one of the most applied techniques to select promising molecules from large libraries by predicting the orientation of a compound towards the target and characterizing ligand–target interactions. Molecular dynamics techniques entail the motion principles to molecules and are frequently used to perform binding mode studies and to predict the stability of a ligand–target complex, giving a deeper understanding to the researchers on the interaction of a ligand to a biomolecular target [9,11–13]. On the other hand, ligandbased methods, including similarity searching, pharmacophore modeling, and quantitative structure–activity relationship (QSAR) studies, use the information of groups of small molecules with different structures capable of interacting with the target to identify new and powerful compounds [12,14]. These methods are usually applied when the 3D structure of the target is not available and assume that analogous compounds show similar biological activity and interaction with the target. QSAR modeling allows understanding the effects of structural variables on biological activity to develop compounds with enhanced and optimal pharmacological profiles. Another ligand-based method is similarity searching, which is mostly applied in filtering compounds from big libraries based on the assumption that compounds with structural similarity can have similar bioactivity [9]. When the 3D structure of the target is available, as well as the ligands' structure, it is possible to use hybrid methods. This means combining structure-based and ligand-based methods to perform some types of pharmacophore modeling or to predict the activity considering the biological profile of tested compounds against several targets [12]. In fact, combining CADD methods can be more effective once their advantages complement

one another [11,12]. Given the importance of pharmacokinetics and toxicity properties of selected compounds, in silico ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) filters can also be applied to eliminate compounds with potential undesirable physiological qualities [9]. With the improvement of technology and bioinformatic tools, the use of in silico approaches for drug development, mainly in preliminary studies, has increased over the years.

Hence, in this review, we intend to summarize the current treatment used for CC stages induced by HPV persistent infection, present the small molecules that are already being explored as inhibitors/blockers for E6 protein, and retrospectively analyze the studies published in the last years that applied in silico approaches to the design of novel therapeutics for CC treatment. In addition, we will concisely discuss the future perspectives for CC management.

#### **2. HPV, CC Development, and Clinical Treatment**

HPV is the etiological agent associated with CC. There are about 200 HPV genotypes of high-risk (HR HPV) and low-risk (LR HPV) identified. HPV16 and 18 are responsible for more than 70% of invasive CC. The HPV genome presents tropism for epithelial cells, and the infections appear mainly in the anogenital tract [3,5,15] and head and neck anatomic sites. While HPV is the hallmark of CC development [15], the classic major risk factors to head and neck cancer are tobacco and alcohol, but in the past few decades, human papillomavirus (HPV) has emerged as a novel risk factor [16,17]. The biology and life cycle of human papillomavirus have been reviewed elsewhere [18]. However, we considered it helpful to readers to give a brief insight into the HPV life cycle that may lead to carcinogenesis.

The HPV genome encodes eight genes; the L1 and L2 structural proteins constitute the capsid that protects the viral genome, and the E1, E2, E4, E5, E6, and E7 proteins are involved in replication, transcription regulation, and oncogenesis. E6 and E7 oncoprotein expression disrupt the cell repair mechanisms by degradation or inhibition of the tumor suppressor proteins p53 and retinoblastoma protein (pRB), respectively, resulting in the immortalization and cellular transformation of infected cells [19–21]. The viral life cycle begins with viral particles arriving in the basal layer of the squamous epithelia via micro-abrasions. After reaching the nucleus the viral DNA is replicated, and low quantities of E1, E2, E6, and E7 proteins are produced early in the infection, halting normal keratinocyte development. Then, E2 recruits E1 to increase the number of viral episome copies, which continue to rise as the epithelium differentiates. E6 and E7 proteins are abundantly expressed in the top differentiating epithelial layers, resulting in uncontrolled cell proliferation, and the viral life cycle is completed when L1 and L2 proteins are expressed in the epithelium's highest layer. As a result, the viral genome is encapsulated, and mature virions are released [22,23]. Although many women contract HPV, most infections are eliminated or controlled by the immune system after 1–2 years [24]. The establishment of persistent infection is associated with the appearance of cervical lesions, since the accumulation of DNA damage caused by HR HPV E6 and E7 interactions with tumor suppressors, p53 and pRb, causes apoptosis suppression and uncontrolled proliferation. The chronically infected cells lead to the establishment of cervical intraepithelial neoplasia (CIN), as CIN grade I, which over time can evolve to CIN grade II, III, or invasive carcinoma [23,24].

CC is staged according to different systems. The most widely used is the FIGO system proposed by the International Federation of Gynecology and Obstetrics in collaboration with the International Union Against Cancer (IUCC) [7]. Table 1 summarizes the different stages of FIGO's system with the current treatment for managing CC.


**Table 1.** CC stages according to the FIGO system and the recommended treatment.

All surgical interventions can cause side effects or complications, including bleeding and damage in the tissue nearby, and simple or radical hysterectomy results in infertility or bladder/bowel dysfunction [25]. With the spread of cancer to other tissues (metastasis), surgery is no longer a viable possibility. Radiation therapy (RT) uses radiation to destroy cancer cells, and it is possible to affect mainly the zones with the tumor in the lower abdomen in CC cases. Some side effects can be infertility, discomfort, and menopause. Chemotherapy involves the administration of cytotoxic drugs to interfere with cell proliferation, killing rapidly dividing cells. The currently approved drugs by the FDA for CC treatment include, among others, bleomycin sulfate, topotecan, pembrolizumab, bevacizumab, cisplatin, paclitaxel, vinorelbine, ifosfamide, fluorouracil, and gemcitabine [26]. Nevertheless, these treatments are unable to effectively distinguish healthy from cancer cells, affecting several types of dividing cells throughout the entire body. This effect can result, namely, in a higher risk of anemia, bleeding, and infections [3,25]. Hence, the design and development of therapeutic agents that are efficient, non-toxic, and capable of distinguishing cancerous from healthy tissue are essential. In addition, for managing early-stage CC, the employment of less invasive methods to improve quality of life and decrease treatment-related sexual dysfunction is a necessity.

#### **3. E6 Protein**

As mentioned before, E6 and E7 proteins of HR HPV play a major role in CC development once its growth is dependent on sustained E6/E7 expression. The interest in HR HPV E6 protein as a potential therapeutic target has arisen in the past years. This protein is essentially found in the cell nucleus and is constituted by 150–160 amino acids and two zinc-binding motifs. In 2013, Zanier and co-workers published in PDB the 3D structure of HPV16 E6 protein bound to the E6AP interaction domain (ID: 4GIZ) [27]. More recently, the same research group also published the 3D structure of the E6-E6AP-p53 trimeric complex (ID: 4XR8) [28]. The wild-type (wt) HPV16 E6 protein forms intermolecular disulphide bonds, which makes it very difficult, if not impossible, to purify [29]. Thus, Zanier and colleagues used an E6 nonfull-sequence structure (4C/4S mutant) to achieve its crystallization, where four cysteines were replaced with four serine residues. The E6 4C/4S mutant maintains the ability to degrade p53 in a very similar way to that of HPV16 wt E6 [30]. These crystallographic findings determine the identification and design of novel

compounds to block the HR HPV E6 oncoprotein action with higher efficiency. As the E7 protein 3D structure is still unknown/resolved, this explains why researchers are mainly directing their efforts in developing strategies to block E6 to the detriment of E7.

The E6 protein binds to E6AP (E6-associated protein), a ubiquitin ligase required for the interaction with p53 protein. The formation of the trimeric E6-E6AP-p53 complex is responsible for the ubiquitination and degradation of p53, resulting in p53-dependent apoptosis blockage. The crystallization data revealed that HPV16 E6 is composed of an N-terminal and a C-terminal connected by an alpha helix, forming the hydrophobic binding pocket for E6AP [27]. E6 targets other structures that are involved in several molecular pathways, bringing down some cancer hallmarks. For example, HR HPV E6 protein is able to activate telomerase by transcriptional up-regulation of TERT (reverse transcriptase component of telomerase). There are two mechanisms proposed for this phenomenon, and both require the E6-E6AP complex formation. Briefly, one of the models proposes that the E6-E6AP complex can bind to NFX1-91 (repressor of TERT transcription) causing its degradation and releasing the transcriptional repression at the TERT promoter. On the other model, the E6-E6AP complex binds to c-myc, which may displace the inhibitory USF transcriptional repressor from the E box in the TERT promoter. Other mechanisms are possible once the telomerase activation is not well understood. Besides telomere elongation, TERT can participate in apoptosis inhibition and allow cell proliferation, which increases the probability of malignant conversion [31]. Besides E6AP binding, E6 only from high-risk HPV can bind to the PDZ-domain family of proteins through its C-terminal, although this region is not necessary for p53 interaction [28,32,33]. Thus, HR HPV E6 protein complexes, along with the different proteins that bind to E6, affect several biological functions such as cell survival, DNA damage, and cell cycle progression. Considering the problems associated with current treatments for CC and the important role of E6 oncoprotein in the progression of HR HPV infection towards cervical lesions, the inhibition/blockage of its function could be a promising and specific therapeutic strategy. Indeed, several authors have shown that decreased HR HPV E6 expression by RNA interference, flavonoids, and intracellular antibodies could restore p53 levels and induce apoptosis in HPV-positive cell lines [34,35]. Hence, a presentation of the advances made in anticancer drug design and development for the HR HPV E6 protein using computational methods will be performed. For that, three databases (PubMed, Web of Science, and ScienceDirect) were used to search articles published after 2015 with the following keywords: E6 inhibitors AND in silico; E6 inhibitors AND molecular docking; E6 protein AND in silico; E6 protein AND molecular docking.

#### *3.1. E6-E6AP Complex Inhibitors*

The HR HPV E6 protein has different binding pockets, including the hydrophobic pocket where the LxxLL motif of E6AP binds, the PDZ-binding domain, and the p53 binding site, Figure 1 [27,28]. Until recently, the most explored area to achieve inhibition has been the hydrophobic pocket, to the detriment of the HR HPV E6PDZ domain.

Researchers have been taking advantage of the E6/E6AP complex 3D structure available on PDB (ID: 4GIZ) to understand the interactions involved between both proteins and to apply in silico methods to design potential inhibitors. E6AP is required to induce E6-mediated p53 degradation by inducing structural changes in the E6 protein, which allow the formation of the trimeric complex E6/E6AP/p53 [28]. The hydrophobic pocket able to recognize the LxxLL motif is mainly constituted by the following amino acid residues: Leu100, Leu50, Cys51, Val53, Val62, Leu67, Tyr32, Tyr70, and Ile73 [27].

**Figure 1.** A map of different binding sites of the HPV16 E6 protein using PDB ID 4XR8 and 2KPL on Chimera software. Surface representation of E6 residues that participate in the binding with E6AP (**A**), p53 (**B**), and the PDZ domain (**C**). **A**—Orange residues participate in E6AP binding; **B**—The three interfaces of contact between E6 and p53 are colored in light, medium, and dark blue. **C**—The C-terminal PDZ binding motif is represented in green. E6N: N-terminal; E6C: C-terminal.

#### 3.1.1. Synthetic Compounds

Ricci-López and co-workers developed an in silico pipeline to propose potential inhibitors of E6/E6AP interaction [36]. For this, the HPV16 E6 sequence (UniProt ID: P03126) and the template retrieved from the ternary complex E6/E6AP/p53 (PDB ID: 4XR8) were used to obtain the full-length structure of the HPV-16 E6 protein by homology modeling once the E6 protein was crystallized by using a 4C/4S mutant, as mentioned earlier. Twenty-six compounds described in the literature as capable of binding to E6 were used as reference molecules and queries in the Zinc15 database to build a compound library with almost 35,000 molecules using a structural similarity approach. ZINC15 is a public access database provided by the Irwin and Schoichet Laboratories in the Department of Pharmaceutical Chemistry at the University of California, San Francisco (UCSF). It contains over 230 million purchasable compounds in ready-to-dock, 3D formats and over 750 million purchasable compounds, and it is possible to search for analogues. It is widely used for virtual screening, ligand discovery, and pharmacophore screens, among others. Compounds were then filtered considering ADME properties, and those with the highest potential were submitted to molecular docking studies considering the hydrophobic pocket as the binding site. After the binding free energy calculations of the best-docked complexes, compound **1** (ZINC111606147), compound **2** (ZINC362643639), and compound **3** (ZINC96096545) (Figure 2) were selected for further evaluation through molecular dynamics simulations. When compared to luteolin (reference ligand), these three compounds had higher affinity to the E6 protein, and all docked complexes were stable and capable of inhibiting E6/E6AP interaction [36].

‐ ‐ ‐

‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ **Figure 2.** Structures of studied synthetic compounds as E6/E6AP inhibitors. IUPAC name of compounds—compound **1**: (*S*)-*N*-((3-amino-1-(5-ethyl-7*H*-pyrrolo [2,3-*d*]pyrimidin-4-yl)pyrrolidin-3 yl)methyl)-2,4-difluorobenzamide; compound **2**: *N*-((6-methyl-1*H*-benzo[*d*]imidazol-2-yl)methyl)- 5-(thiophen-2-yl)-1*H*-pyrazole-3-carboxamide; compound **3**: *N*<sup>1</sup> -(5-chloro-2-cyanophenyl)-*N*<sup>2</sup> -(2 hydroxy-2-(1-methyl-1*H*-indol-3-yl)ethyl)oxalamide; compound **4**: (*E*)-6-((2*S*,3*S*)-2-((*E*)-5-((*tert*butyldimethylsilyl)oxy)-2-methylpent-1-en-1-yl)-5-methoxytetrahydrofuran-3-yl)-4-methylhex-3 en-1-ol; compound **5**: *tert*-butyl(((*E*)-5-((2*S*,3*S*)-5-methoxy-3-((*E*)-3-methylhexa-3,5-dien-1 yl)tetrahydrofuran-2-yl)-4-methylpent-4-en-1-yl)oxy)dimethylsilane; compound **6**: 2-(3,4 dihydroxyphenyl)-5,7-dihydroxy-8-methyl-4*H*-chromen-4-one; compound **7**: (*S*)-(*R*)-2-((2-amino-6-oxo-1*H*-purin-9(6*H*)-yl)methoxy)-3-hydroxypropyl 2-amino-3-methylbutanoate; compound **8**: 4-amino-1-((2*R*,3*S*,4*S*,5*R*)-3,4-dihydroxy-5-(hydroxymethyl)tetrahydrofuran-2-yl)pyrimidin-2(1*H*)-one; compound **9**: (*R*)-2-((2-amino-6-oxo-1*H*-purin-9(6*H*)-yl)methoxy)-3-hydroxypropyl 4-aminobenzoate.

Senthilkumar and colleagues described ansiomelic acid (AA), a compound isolated from the plant *Anisomeles malabarica*, as being able to inhibit E6 and E7 protein expression and induce apoptosis in HPV-positive cancer cells. Then, they decided to perform a structure–activity relationship study with several AA analogues to identify potent inhibitors of HPV E6 and E7 oncoproteins [37]. Molecular docking was used to predict the affinity of 26 AA-derived compounds towards the hydrophobic pocket of the E6 protein (PDB ID: 4GIZ) using AA as control. Of these, compounds **4** and **5** (Figure 2) showed the highest docking scores −65.74 Kcal/mol and −63.66 Kcal/mol, respectively, and lower IC<sup>50</sup> values (HPV-positive cell lines, SiHa and HeLa) were observed when compared to AA and other compounds. In addition, they displayed less toxicity towards fibroblasts than commercial drugs and were able to inhibit p53 degradation mediated by the E6 protein, according to the in vitro studies performed [37].

Rietz and his team followed a different approach. For this, small-molecule probes, which can modulate protein interactions, were used to evaluate the recognition features of the E6 protein by means of binding and functional assays [38]. In this context, authors intended to understand the contribution of the substituents in positions 2 and 6 on the benzopyranone scaffold explored and to determine which E6 features are involved in binding. Interestingly, it was observed that charged groups in position 6 and non-polar substituents in position 2 displayed higher activity. Molecular dynamics simulations with analogs of these small molecules and exploring a set of mutations in different amino acid residues allowed the authors to conclude that a group of arginines (Arg10, Arg55, Arg102, Arg129, and Arg131) play a major role in the shape of the E6 helical binding groove, as well as in molecular recognition of the binding partners. Therefore, these last results can be of relevance for structure-based targeting of HPV E6 [38].

Another strategy was developed by Kumar and colleagues using ligand-based and structure-based methods [39]. They built an e-pharmacophore model for virtual screening based on the amino acid residues involved in the interaction of E6 (ID: 4GIZ) and a peptide reported by Zanier and co-workers [35]. The ligands were selected based on literature information, and 2-aminobenzothiazole and the luteolin chromone moiety were used as query molecules to found potential compounds on the ZINC 15 database. Then, 6000 compounds were screened using the pharmacophore model developed, and molecular docking studies predicted the best compounds based on dock scores and interactions with amino acid residues of the binding site. In addition, ADME analysis and molecular dynamics simulations were performed to find the best hit. The chromone derivative, compound **6**, ZINC14761180 (Figure 2) showed the most relevant interactions, and molecular dynamics of the complex ZINC14761180-E6 protein evidenced good stability in the binding pocket [39]. These findings can be a starting point for further design and synthesis of new E6 inhibitors. Following this work, the authors decided to explore a different approach but with the same objective [40]. For this, a drug repurposing approach based on the FDA-approved drugs library was applied. This methodology consists of identifying a different therapeutic use for an available or approved drug. Approved drugs have acceptable ADMET properties; therefore, this is an effective method as it involves less time, lower costs, and reduces the probability of an undesirable ADMET profile in clinical trials [10]. After the compounds' screening through a pharmacophore model, docking, molecular dynamics simulations, and ADME analysis, the selected hits with the highest potential were compound **7** (ZINC000001543916—valganciclovir; anti-viral drug) and compound **8** (ZINC000003795098—cytarabine; anticancer drug) (Figure 2). Both drugs are purine or pyrimidine nucleoside analogues, which showed that these scaffolds can be applied to design novel E6 inhibitors. Interestingly, compound **9**, ASK4 (Figure 2), which is a valganciclovir derivative, also showed promising results. Molecular dynamics simulations indicated that these ligands could form a stable complex with E6 protein, and it was also evidenced that they have an acceptable ADME profile [40].

The zinc-finger motif was proposed as strictly necessary for E6 protein function. Therefore, mutations in zinc-fingers interfere with E6/E6AP complex formation and cellu-

lar transformation. Concerning these data, Choudhury and co-workers selected specific disulfide (C13, C14, C16, R2, R15, R19) and azoic (C4) compounds, as controls based on their ability to eject zinc from the zinc finger motif of E6 protein and inhibit E6/E6AP interaction, proved by in vitro studies. Then, derivatives of these compounds were generated, and their ADME properties were predicted to exclude undesired compounds. Molecular docking, using the 3D structure of the C-terminal zinc-binding domain of the E6 protein (PDB ID: 2FK4), was performed for the selected ligands. Afterward, ligands with the highest binding score, such as compound **10**, *(E)-N*-(2-amino-2-methylpropyl)-*N*- (thiophen-2-yl)diazene-1,2-dicarboxamide, and compound **11**, (*E*)-*N*-(2-amino-2-oxoethyl)- *N*-(4-chlorophenyl)diazene-1,2-dicarboxamide, both azoic derivatives (Figure 3), were used for pharmacophore modeling (ligand-based method). The predicted amino acid residues involved in the interaction with the ligands are Tyr15, Trp55, Asn50, Leu23, Leu33, Ser5, Ile24, Ile51, Arg47, Arg54, Arg25, Lys45, Lys38, Phe48, Cys26, and Ala1. Some of these residues are found to be conserved among HPV6, HPV11, HPV16, and HPV18 zinc fingers. This demonstrates the possibility of the zinc finger domain to be a drug target [41]. ‐ ‐ ‐ ‐ ‐ *‐* ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ *‐ ‐* ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ 

‐ ‐

‐

‐

‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ **Figure 3.** Structures of studied synthetic compounds as E6/E6AP inhibitors acting by ejecting zinc ions. IUPAC name of compounds—compound **10**: (*E*)-*N*<sup>1</sup> -(2-amino-2-methylpropyl)- *N*1 -(thiophen-2-yl)diazene-1,2-dicarboxamide; compound **11**: (E)-*N*<sup>1</sup> -(2-amino-2-oxoethyl)-*N*<sup>1</sup> -(4 chlorophenyl)diazene-1,2-dicarboxamide.

#### 3.1.2. Natural Compounds

‐

‐ ‐ ‐ ‐ ‐ ‐ − Plant-derived compounds have been used in the prevention and treatment of different clinical conditions, and several of these natural products possess beneficial effects against several types of cancers, including cervical, colon, skin, breast, and prostate cancers [42]. In addition, many natural products are abundantly available and therefore can constitute a cost-effective way to obtain active pharmaceutical ingredients, including for cancer treatment [43]. Therefore, some researchers have focused on investigating the potential of natural compounds as E6 inhibitors. Clemente-Soto and his research team studied the ability of quercetin, compound **12** (Figure 4), a flavonol belonging to the flavonoid group, to inhibit p53 degradation mediated by E6 through in silico and in vitro studies [44]. According to molecular docking results, quercetin was able to bind in three different sites of E6 (ID: 4GIZ). Interestingly, it was predicted that the lowest energy (−7.08 Kcal/mol) and interactions with crucial amino acids for E6-E6AP formation (such as L50, L100, R102, R131) occurred in site II (corresponds to the hydrophobic pocket). As reference ligands, CAF24, C170, and luteolin were used because it was previously demonstrated that they were able to bind the E6 protein and disrupt the E6AP association [34]. Additionally, in vitro studies demonstrated that quercetin was capable to reactivate p53 and induce G2 phase cell cycle arrest and apoptosis in HPV-positive cancer cell lines [44].

‐

‐

‐

‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ **Figure 4.** Structures of the studied natural compounds as E6/E6AP inhibitors. IUPAC name of compounds—compound **12**: 2-(3,4-dihydroxyphenyl)-3,5,7-trihydroxy-4*H*-chromen-4-one; compound **13**: 1',5-dihydroxy-3',7-dimethyl-[2,2'-binaphthalene]-1,4,5',8'-tetraone; compound **14**: (*S*)-5,7-dihydroxy-2-(3-hydroxy-4-methoxyphenyl)chroman-4-one; compound **15**: (*S*)-5,7-dihydroxy-2-(3-hydroxy-4-methoxyphenyl)chroman-4-one; compound **16**: ((2*R*,3*R*,4*S*,5*R*,6*S*)-4-hydroxy-3- (((2*S*,3*R*,4*S*,5*R*,6*R*)-5-hydroxy-6-(hydroxymethyl)-4-methyl-3-(sulfooxy)tetrahydro-2*H*-pyran-2 yl)oxy)-6-methoxy-5-(sulfooxy)tetrahydro-2*H*-pyran-2-yl)methyl hydrogen sulfate.

‐ ‐ Kumar's research group took advantage of both 3D structures of the E6 protein available on PDB (ID: 4GIZ and 4XR8) to perform a protein–protein alignment and identify the amino acid residues that suffer conformational changes when the E6/E6AP complex recruits p53 [45]. The residues Arg8, Tyr32, Cys51, Tyr70, Ser74, and Arg131 demonstrated conformational changes when the E6/E6AP complex bound to the p53 protein. Then, an e-pharmacophore model was built based on the amino acid residues present in the predicted cavity and screened against 27,354 compounds. Molecular docking and molecular dynamics approaches were used to study the hits previously selected, where compound **13**, diospyrin (Figure 4), was identified as the best hit. H-bond interactions with Tyr32, Cys51, Ser74, and Arg131 amino acid residues were predicted, which might be important for an inhibitor to interact with the E6 protein [45].

Kolluru and his research team stated that E6 can inhibit the apoptotic pathway by binding to adaptor molecule FADD (Fas-associated death domain); therefore, the inhibition of this interaction could be a promising strategy for CC treatment [46]. As there is no information about the FADD binding site on the E6 protein, the authors decided to identify the flavonol binding pocket on HPV 16 E6 (ID: 4GIZ) protein by studying six flavonols reported in the literature as E6 inhibitors. First, a blind docking was performed to identify different pockets, followed by the determination of amino acid residues involved in interactions with ligands. Amino acids Cys51, Leu50, Arg102, Arg131, Leu67, Val62, and Gln107 were the most common among binding pockets that have scores correlated with the previously described IC<sup>50</sup> (determined by the interaction of GST-E6 and His-caspase 8), indicating their importance in E6 inhibition. Compound **14**, myricetin (Figure 4), demonstrated the highest docking score in most of the binding pockets, which agrees with the IC<sup>50</sup> reported in the literature. The efficacy of these ligands may be increased by conjugation with multivalent glycocalixarenes, which are known to interact with biological macromolecules [46].

Other researchers, such as Prakash and his team, studied the anticancer effect of compound **15**, hesperetin (Figure 4), a plant-isolated flavonoid, on CC via in vitro and molecular docking studies [47]. First, a preliminary study of hesperetin's effective cytotoxicity towards HeLa cells was performed. Then, molecular docking with crystal structure PDB ID: 4XR8 was applied to understand the binding mode and interaction of the hesperetin–E6 protein complex. The binding energy score was −5.58 Kcal/mol, and different bonds were established with the active site of the E6 protein, specifically H-bonds with Trp132 and Asp98, carbon–hydrogen bond with Arg102 and Leu100, and hydrophobic pi-alkyl interaction with Arg102, Arg131, and Leu100. These results demonstrated that hesperetin can bind in the hydrophobic pocket of E6 and hinder the interaction of E6 with p53 [47].

Kamma and co-workers focused on ligands available on natural sources, including carrageenan, curcumin, and papain, to target the E6 protein (PDB ID: 6SIV) [48]. Molecular docking revealed minimal binding energy (−10.7 Kcal/mol) for compound **16**, carrageenan, being considered the best ligand of this group. However, this is a preliminary study, and it is fundamental to perform in vitro studies to confirm the carrageenan behavior towards CC and non-cancer cells.

Although plant-derived compounds present many advantages as anticancer agents, their usual low availability on tumor sites could represent a drawback. Thus, their administration using adequate drug delivery systems or combination with approved drugs could be a promising way to improve their bioavailability and allow their delivery to target cells, including for CC management [43,49].

The activation of interferon regulatory factor 3 (IRF3) is dependent on the kinase binding site within the autoinhibitory domain (AD), which is blocked by E6 protein [50]. The interaction of E6 with E6AP prevents apoptosis after p53 degradation causing cell cycle disruption. Both IRF3 and E6AP E6 have specific leucine-rich motifs. In the IRF3 case, it is the N-terminal that participates in E6 binding, where for the E6AP it is the C-terminal. Therefore, HPV escape from the antiviral response is suggested to be possible through the interaction of E6 protein with IRF3 [50]. However, the mechanism of IRF3 inactivation by E6 is not completely understood. Thus, Shah and colleagues used in silico approaches to explore this mechanism [51]. The N-terminal of IRF3 comprises two leucine-rich clusters that are assumed as E6-specific binding motifs. The binding affinity of these motifs towards the E6 protein was evaluated through protein–protein docking and molecular dynamics simulations using the E6/E6AP complex structure (PDB ID:4GIZ), as a control to corroborate the protocol described by the authors. After investigation of IRF3 stable residues and identification of E6 residues with high binding energy, the binding mode of E6 inhibitors reported in the literature was explored. Molecular docking with 20 ligands (natural and synthetic compounds) into the hydrophobic pocket of E6 was performed, and their binding affinities and behaviors were evaluated through computational mutagenesis and drug resistance scanning. Computational mutagenesis was applied to study the stability of each ligand-bound E6 complex, where the polar arginine residues of the E6

pocket were mutated into alanine residues. Then, the drug resistance of E6 was determined considering the difference of binding affinities of using wild-type or mutant residues. It was noticed that the stability was compromised when Arg131 was mutated into alanine, whereas Arg102 Ala mutation reduced the ligand-binding affinities. The change in the binding affinities suggests that E6 might become more resistant to drugs when Arg131 and Arg102 are mutated into neutral amino acid residues. The data obtained indicated that the LxxLL motifs of IRF3 bind within the hydrophobic pocket of E6 and the polar areas (Arg55, Arg102, and Arg131), significantly affecting the stability of the LxxLL-E6 complex. Despite the new findings where the E6 binding to IRF3 might inhibit the kinase-mediated protein activation, the fact that the polar patches are inconsistent among HR HPV species may compromise an unsuccessful treatment due to point mutation that could make drugs ineffective [51].

#### *3.2. E6-p53 Complex Inhibitors*

CC satisfies the criteria of "oncogene addiction", which means that tumor cell development occurs due to the activity of one or some genes [52]. As E6 is the main protein responsible for p53 degradation, several researchers recently focused their attention on designing and developing potential inhibitors of E6/p53 binding by employing in silico methods. In fact, the direct disruption of the p53 binding site to E6 can be useful in the development and optimization of specific inhibitors, improving the likelihood of preventing off-target effects.

#### 3.2.1. Natural Compounds

Kumar and co-workers explored natural compounds described in the literature as able to block HPV infection towards the E6 protein of HPV18 [53]. The 3D structure of E6 HPV18 was obtained by homology modeling using the amino acid sequence available in NCBI (GenBank ID: NP\_040310.1), and the E6 HPV16 protein (PDB ID: 4GIZ) was employed as a structure template. After structure validation, molecular docking with 12 ligands was performed to elucidate the interactions with E6 HPV18. The binding site of p53 to E6 HPV18 was revealed by sequence comparison with E6 HPV16 and set on the residues 108–117 (CQKPLNPAEK). All studied compounds interacted with the p53 binding site on the E6 protein. However, the lowest binding energy (−5.85 Kcal/mol) was predicted for compound **17**, withaferin A (Figure 5) interacting through H-bonds with four amino acid residues of E6 (Glu116, Asn113, Asn122, and Ser140) [53]. Following this work, Atabaki and colleagues performed in silico and in vitro studies to evaluate the behavior of phytochemicals of *Jurinea macrocephala* subsp. *Elbursensis* in CC cell lines and their interaction with E6 HPV18 [54]. From this plant, three compounds were isolated, namely, 4-hydroxypectorolide-14-*O*-acetate, 4-hydroxy pectorolide, and pinoresinol monomethyl ether-*β*-D-glucoside (PMG), and their toxicity toward cancer cells was investigated. Of these, compound **18**, PMG (Figure 5), displayed significant toxicity on HeLa cells; therefore, molecular docking was performed to understand if it could be a potential E6 inhibitor. Using doxorubicin as control, the PMG showed higher binding energy (−4.75 Kcal/mol) and interacted through H-bonds with three amino acid residues (Arg119, Leu112, and Tyr99) [54].

Mamgmain and co-workers explored five natural compounds—colchicine, curcumin, daphnoretin, ellipticine, and epigallocatechin-3-gallate—as potential E6 inhibitors [55]. The 3D structure of the E6 protein was built through homology modeling and the crystal structure PDB ID: 4GIZ as a template. The binding sites predicted were chosen for molecular docking studies. The highest binding affinity, −8.3 Kcal/mol, was attributed to compound **19**, daphnoretin (Figure 5), which interacts through H-bonds with E6 amino acid residues Tyr39, Cys58, Ser78, Gln114, and Arg138 [55].

‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ **Figure 5.** Structures of the studied natural compounds as E6/p53 inhibitors. IUPAC name of compounds—compound **17**: (4*S*,4a*R*,5a*R*,6a*S*,6b*S*,9*R*,9a*S*,11a*S*,11b*R*)-4-hydroxy-9-((*S*)-1- ((*R*)-5-(hydroxymethyl)-4-methyl-6-oxo-3,6-dihydro-2*H*-pyran-2-yl)ethyl)-9a,11b-dimethyl-5a,6,6a,6b,7,8,9,9a,10,11,11a,11b-dodecahydrocyclopenta[1,2]phenanthro[8a,9-b]oxiren-1(4*H*)-one; compound **18**: (2*S*,3*R*,4*S*,5*S*,6*R*)-2-(4-((2*S*,6*R*)-6-(3,4-dimethoxyphenyl)hexahydrofuro[3,4 *b*]furan-2-yl)-2-methoxyphenoxy)-6-(hydroxymethyl)tetrahydro-2*H*-pyran-3,4,5-triol; compound **19**: 7-hydroxy-6-methoxy-3-((2-oxo-2*H*-chromen-7-yl)oxy)-2*H*-chromen-2-one; compound **20**: 5,7-dihydroxy-8-(5-(5-hydroxy-7-methoxy-4-oxo-4*H*-chromen-2-yl)-2-methoxyphenyl)-2-(4 hydroxyphenyl)-4*H*-chromen-4-one.

‐ ‐ ‐ ‐ − ‐ ‐ ‐ ‐ ‐ ‐ ‐ Considering the different molecules/proteins (E6AP, p53, and c-Myc) that bind to E6 protein and can interfere with normal cell function, Nabati and co-workers investigated more than 100 already described plant-derived compounds with anticancer and antiviral properties for the different protein binding sites [56]. These compounds were filtered based on their ADMET properties, and twenty compounds were selected for molecular docking studies. The crystal structure of the E6 protein PDB ID: 4GIZ was used. Compound **20**, ginkgetin (GK, Figure 5), extracted from *Gingko biloba* leaves, was the most effective in binding to all sites (E6AP, p53, and Myc) on the E6 protein, and the lowest binding energy was determined. In particular, it was predicted that GK forms five H-bonds with Arg55, Cys51, Val53, and Tyr60 on the E6AP binding site; on the p53 binding site it should form H-bonds with the same amino acid residues of the E6AP binding site; on Myc binding site, it can interact with Pro5 and Arg8 by forming four H-bonds. Considering the GK bioactivities described in the literature, including antitumor and antiviral properties, and ability to induce apoptosis in cancer cells, this study could be a starting point for the development of potential E6 inhibitor in vitro studies [56].

‐ ‐ ‐ With the crystallization of the ternary complex E6/E6AP/p53 by Zanier and coworkers in 2016, it was possible to achieve a better structural understanding of the amino acid (aa) residues that participate in the binding interface of E6-p53. Actually, the aa residues described in the literature before 2016, or used as the binding site of p53 to E6 in some of the studies mentioned above, were predicted by computational methods since there was no information about the real binding site [28,57,58]. Even now, some

authors chose to use computational methods for this purpose, while other authors used the information of the p53 binding site discovered in 2016. According to this study, the E6AP C-terminal does not substantially contribute to contacts with p53, and the E6/p53 interface can be divided into three sub-interfaces. In sub-interface I the E6 residues Glu7 and Glu18 establish interaction with p53 residues, where mutations in Glu18 impair the ternary complex formation and p53 degradation. Likewise, the hydrogen bond formed through Gln104 and Gly105 of p53 to Arg8 and Gln6 of E6 alters the conformation of the E6 N-terminal. Mutagenesis in the amino acid residues Phe2, Pro5, Arg8, or Pro9 prevented the ternary complex formation and p53 degradation. Sub-interface II has the aa residues that mediate vital contacts to p53 and are important for p53 degradation. The residues consist of Phe47, Asp44, and Asp49, which correspond to the most conserved positions in HR HPV. In addition, Ile23, His24, and Tyr43 provide hydrophobic contacts with p53. Finally, sub-interface III involves hydrophobic interactions between Leu114 and Trp146 of p53 and Leu100 and Pro112 of the E6 C-terminal [28]. ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐

‐

#### 3.2.2. Synthetic Compounds

Celegato and colleagues used the crystal structure of the E6/E6AP/p53 complex (PDB ID: 4XR8) to perform an in silico screening of small-molecule libraries against the central region of the p53-binding cleft of E6 [58]. This region was chosen due to its significant role in p53 binding and degradation as well as the high conservation among HR HPVs. A structure-based screening of three databases of commercially compounds was performed using chain F of the complex, and a filter step was applied considering the interactions with the residues Asp49 and Phe47 of E6. Twenty-nine compounds were selected and tested through their ability to rescue p53 by in vitro studies. This strategy allowed the selection of three compounds, but only pyrimidinone 21 (Figure 6) affected the viability of CC cells without affecting healthy cells. It was also found that this compound could re-establish p53 intracellular levels and transcriptional activity, reduce the viability and proliferation of HPV-positive cancer cells, and block 3D cervospheres formation [58]. Molecular dynamics simulations with E6-compound 20 complex were performed to gain further insight into the binding mode of compound **20**. The complex showed stability by the formation of a double H-bond with Asp49 and by hydrophobic interactions with Phe47 and established other interactions with Leu12, Cys16, and Ile23. This study suggested that compound **21** can be a starting point for the development of specific anti-HPV drugs. ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐

‐ ‐

**Figure 6.** Structures of the studied synthetic compounds as E6/p53 inhibitors. IUPAC name of compounds—compound **21**: 2-((4-fluorophenyl)amino)-6-(((1-phenyl-1*H*-tetrazol-5 yl)thio)methyl)pyrimidin-4(3*H*)-one; compound **22**: *N*-(4-(benzo[*d*]thiazol-2-yl)phenyl)-5-chloro-2-methoxybenzamide.

Modi and co-workers explored the anticancer properties of 10 benzothiazole derivatives in CC cell lines. Later, by in silico and in vitro studies, they aimed to identify the molecular pathways involved in the apoptosis induced by *N*-(4-(benzo[d]thiazol-2-yl)phenyl)-5 chloro-2-methoxybenzamide (compound **22** or A-07, Figure 6) [59]. To perform in silico analysis in this context, the E6 and p53 crystal structures were obtained from PDB (ID: 4GIZ and 4XR8, respectively). Then, A-07 was evaluated by molecular docking against each protein where the active site was identified by the metaPocket server. It was predicted

that this benzothiazolyl derivative was able to interact with both proteins, specifically with 10 amino acids of E6 and 14 amino acids of p53. These results suggest that A-07 (Figure 6) can hinder ternary complex formation and prevent p53-mediated intrinsic apoptosis in SiHa cells, as demonstrated by in vitro studies [59].

The natural and synthetic compounds presented here as HR HPV E6 protein inhibitors by either target E6/E6AP complex or the E6/E6AP/p53 complex formation are a valid starting point for drug design and development in cervical cancer management. However, strategies that combine both targets, the hydrophobic pocket and the p53-binding cleft of E6 protein, could result in an effective way to disrupt the E6/E6AP/p53 complex to specifically affect the viability of HPV-positive cancer cell lines and proposing specific anti-HPV therapies.

#### **4. Future Perspectives**

CC management still needs improvements, as current therapies are mainly surgery, chemotherapy, or radiotherapy. Several drugs have been proposed for treating patients with CC, but most do not overcome clinical trials due to low efficacy [36]. Bearing in mind the role of HR HPV E6 protein in the development and progression of HPV infection to cervical lesion or even invasive carcinoma, the inhibition of E6 function could be useful for treating CC. Results reported in this review support the idea that combining in silico approaches and in vitro studies could lead to a rise in the number of molecules understudy to block/inhibit E6 protein. According to Franconi and co-workers and Duenas-Gonzalez and co-workers, there are no clinical studies using natural or synthetic compounds as E6 or E7 inhibitors yet [43,60], due to low affinity and/or potency on in vitro and in vivo studies. Thus, the employment of in silico methodologies in the drug development process can be a great help to quickly find potential inhibitors and circumvent possible undesirable properties of compounds. Therefore, identification and functional evaluation of proteins associated with E6 could provide an insight into CC carcinogenesis and thus allow the design of specific strategies towards tumor cells. Moreover, it is essential to search and find cost-effective treatment options that could bring better outcomes for the patients. A strategy in this context could explore the potential use of plant-derived compounds, usually associated with lower toxicity and side effects when compared with classic anticancer agents. In addition, considering that drug distribution to the tumor site is low, the use of suitable drug delivery systems (DDS) compatible with the anticancer agents could be explored to achieve better clinical response with lower toxicity, as DDS could be functionalized with ligands that are specifically recognized by cancer cells [49]. Thinking about the fact that HPV infection is localized in the anatomical regions that can be easily reached for topical treatment, the possibility of locally delivering small molecules or natural compounds could be a valid option. Indeed, in low-income countries where women only can reach health facilities a few times in a lifetime, the combination of HR HPV test positivity and treatment in a single visit could be fundamental. A "screen-and-treat" approach allows reducing travel time, minimizing the number of visits, transport, childcare needs, and reducing the cost [61]. In places where is difficult to reach people and a return visit is not an option, self-sampling for HPV screening and mobile treatment for precancer could be applied [62]. Moreover, visual inspection with acetic acid (VIA) can be applied as a triage method for LIC, once it is low-cost and offers the option of treatment immediately or shortly after diagnostic testing. One of the biggest problems of cervical neoplasia is the resistance of tumor cells to chemotherapy and radiotherapy. Thus, the combination of anticancer agents in DDS with conventional chemotherapy or radiotherapy could also be a solution. This represents another line of investigation that needs to be explored in the near future. In terms of CC management, it would be interesting to explore a combinatory approach targeting E6 and E7 oncoproteins. In this case, the aim consists of exploiting molecules able to interact with E6 and E7 proteins, once both have a relevant role on CC progression. From this point forward, the use of in silico methods would be the key to pursue this approach.

Overall, our research group aims to propose a more specific, efficient, and nontoxic/invasive therapeutic approach for cervical cancer management. Thus, in silico approaches, such as those described in the present manuscript, will be used to select promising compounds as E6 potential inhibitors. Additional studies will be conducted with HPV E6 recombinant proteins and the selected compounds to characterize the kinetic magnitude and affinity constants of the compound–protein interaction. Then, the most promising compounds will be applied in in vitro studies with HPV-positive and HPVnegative cell lines to confirm the inhibitor/blocker effects of the E6 oncoprotein. In addition, drug delivery systems can be developed to circumvent a possible high toxicity and low availability of the compounds and, for instance, to combine this approach of E6 inhibitors with gene therapy to supplement p53 content and induce the cancer cell apoptosis.

#### **5. Conclusions**

The high impact that CC has in developing countries is undeniable. This review has briefly discussed the role of HPV in CC carcinogenesis as well as its different stages and current treatments. Subsequently, the potential of natural and synthetic small molecules in HR HPV E6 protein inhibition, mainly by targeting the E6/E6AP complex or the E6/E6AP/p53 complex, was discussed. The drug development process is a very expensive and timeconsuming process until achieving regulatory agency approval. In silico methods represent a viable solution to these current problems by allowing a fast screening and identification of potential drugs and effective predictions. In terms of impact, in silico methodologies can help to increase the speed of acceptance of potential antiviral drugs with ADMET acceptable profiles for CC management. Moreover, in silico methods have been applied in the medical field, representing the therapeutic response of drugs on virtual organs and body systems and predicting patients' biological responses to the treatment, and this significantly improves outcomes. Computational-based approaches hold a great promise for improving drug development and revolutionizing clinical research by providing a specific treatment for women diagnosed with cervical cancer. Considering the costs of screening and treatment of CC, and knowing that the highest incidence occurs in developing countries, a more cost-effective treatment is needed. Thus, exploring natural compounds with the ability to impair E6-p53 interaction could be a specific and promising strategy for CC management in a more economical way.

**Author Contributions:** Conceptualization, D.G.; software, D.G.; validation, S.S., A.P.D., A.V., C.P.S., L.P. and Â.S.; writing—original draft preparation, D.G.; writing—review and editing, S.S., A.P.D., A.V., C.P.S., L.P. and Â.S.; supervision, C.P.S., L.P. and Â.S.; funding acquisition, A.P.D., L.P., Â.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Foundation for Science and Technology (FCT), through funds from the State Budget and by the European Regional Development Fund through the "Programa Operacional Regional do Centro (Centro 2020)—Sistema de Apoio à Investigação Científica e Tecnológica—Programas Integrados de IC&DT" (Project Centro-01-0145-FEDER-000019—C4—Centro de Competências em Cloud Computing) and the project ref: UIDB/00709/2020. This work was also supported by national funds from FCT—Fundação para a Ciência e a Tecnologia, I.P., in the scope of the project UIDP/04378/2020 and UIDB/04378/2020 of the Research Unit on Applied Molecular Biosciences—UCIBIO and the project LA/P/0140/2020 of the Associate Laboratory Institute for Health and Bioeconomy—i4HB. D. Gomes also acknowledges the doctoral fellowship from FCT ref: 2020.06792.BD.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data sharing not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Review* **Opportunities and Challenges for In Silico Drug Discovery at Delta Opioid Receptors**

**Yazan J. Meqbil <sup>1</sup> and Richard M. van Rijn 2,3, \***


**Abstract:** The delta opioid receptor is a Gi-protein-coupled receptor (GPCR) with a broad expression pattern both in the central nervous system and the body. The receptor has been investigated as a potential target for a multitude of significant diseases including migraine, alcohol use disorder, ischemia, and neurodegenerative diseases. Despite multiple attempts, delta opioid receptor-selective molecules have not been translated into the clinic. Yet, the therapeutic promise of the delta opioid receptor remains and thus there is a need to identify novel delta opioid receptor ligands to be optimized and selected for clinical trials. Here, we highlight recent developments involving the delta opioid receptor, the closely related mu and kappa opioid receptors, and in the broader area of the GPCR drug discovery research. We focus on the validity and utility of the available delta opioid receptor structures. We also discuss the increased ability to perform ultra-large-scale docking studies on GPCRs, the rise in high-resolution cryo-EM structures, and the increased prevalence of machine learning and artificial intelligence in drug discovery. Overall, we pose that there are multiple opportunities to enable in silico drug discovery at the delta opioid receptor to identify novel delta opioid modulators potentially with unique pharmacological properties, such as biased signaling.

**Keywords:** mutagenesis; artificial intelligence; computer-aided drug design; molecular dynamic simulation; biased signaling; G protein-coupled receptor

#### **1. Introduction**

The δ opioid receptor (δOR) is a Gi-protein-coupled receptor with a broad expression pattern both in the central nervous system and the periphery. The endogenous agonists for the δOR are pentapeptide enkephalins, particularly Leu 5 -enkephalin, but other peptides that originate from plants and other species, like frogs can also bind and activate the δOR [1,2]. Similar to the µ opioid receptor (µOR), which is activated by small molecules from natural products like opium and kratom and fully synthetic small molecules like fentanyl, the δOR can be activated by a variety of naturally occurring and (semi-) synthetic small molecules [3–6].

The δOR has been a potential candidate to treat a variety of diseases and disorders. Front and center have been the ability of δOR selective agonists to reduce chronic pain, be it inflammatory, neuropathic, or migraine [7]. δOR agonists have shown promise in preventing cardiac and cerebral ischemia [8], as a potential treatment for neurodegenerative diseases [9–13], and both δOR agonists and antagonists have been proposed as mechanisms for the treatment of alcohol use disorder [5,14–16]. Outside the central nervous system, δOR antagonism and positive allosteric modulation of δOR has been proposed as a treatment for gastrointestinal motility disorders, such as irritable bowel syndrome [17,18].

Early attempts by SmithKline Beecham to synthesize δOR agonists to suppress cough (SB 227122), or to treat inflammatory pain without causing seizure activity (SB 235863) [19–21]

**Citation:** Meqbil, Y.J.; van Rijn, R.M. Opportunities and Challenges for In Silico Drug Discovery at Delta Opioid Receptors. *Pharmaceuticals* **2022**, *15*, 873. https://doi.org/ 10.3390/ph15070873

Academic Editor: Osvaldo Andrade Santos-Filho

Received: 23 June 2022 Accepted: 13 July 2022 Published: 15 July 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

did not progress to clinical trials. While δOR-selective agonists, including ADL5859 and AZD2327, have previously entered clinical trials for the treatment of pain and depression, no δOR selective drugs have ultimately been approved for human use. [22–24] Both ADL5859 and AZD2327 failed to advance beyond phase II clinical trials, which are designed to establish efficacy in patients within the therapeutic indication space. ADL5859 and AZD2327 are part of a class of diethylbenzamides that include the prototypical δORselective agonist SNC80. However, SNC80 and multiple other δOR agonists have been reported to reduce seizure threshold and induce convulsions [25–27], and this has reduced enthusiasm for δOR agonists as a therapeutic area of research.

Around the same time that ADL5859 and AZD2327 were in clinical trials, Johnson and Johnson developed the anti-hyperalgesic δOR agonists, JNJ-20788560 and RWJ-394674 [28,29], but they did not take these into clinical trials, potentially due to the failure of the aforementioned clinical trial compounds. Prior to becoming insolvent, Ardent Pharmaceuticals, also produced multiple δOR agonists, with mixed µOR activity in their DPI series (DPI-221, DPI-125, DPI-289) in hopes of producing an analgesic drug with fewer adverse effect liabilities relative to the clinically used µOR agonists [12,30,31].

Recent studies suggest that β-arrestins, multifunctional proteins that can promote receptor desensitization and intracellular signaling, are involved in the mechanism of seizure activity of SNC80 [26]. This insight has spurt efforts to develop G-protein-biased δOR agonists to reduce adverse effects including seizures, paralleling similar efforts for increasing the therapeutic window through G-protein-biased agonism at other GPCRs, including the µOR. These endeavors have generated a multitude of peptides with reduced β-arrestin recruitment potency/efficacy [3,32–35]. Similarly, small molecule biased agonists have also been developed including TAN-67, KNT-127, TRV250, and most recently PN6047. Indeed, these G-protein-biased δOR agonists appear to suffer less from detrimental side effects including no seizure activity, no hyperlocomotion, and no rewarding effect [36–39]. Positive and negative allosteric modulators (PAM, NAM) and bitopic opioids that act as 'Ago-PAM' or 'Ago-NAM' have been identified and modeled in the δOR binding pocket. The benefit of pure allosteric modulators is that they are inert in the absence of enkephalin and only amplify (PAM) or inhibit (NAM) δOR signaling and/or binding when, for example, enkephalins are synaptically released. Allosteric modulators may promote a particular signaling conformation and bias the endogenous peptide; such a strategy could reduce the risk of tachyphylaxis, off-target effects, and even on-target side effects. Less than ten years ago now, Bristol Meyers Squibb identified a number of µOR and δOR Ago-PAMs [40–42] and more recently, novel δOR agonists lacking a basic nitrogen, including a novel chemotype and bitopic ligand, were also identified [43,44]. As these molecules are very recent their clinical utility has not been explored in much depth, but δOR PAM activity may, for example, aid the treatment of irritable bowel syndrome [18].

Still, except for TRV250, which has undergone phase I clinical trials for migraine, no real progress has occurred towards the production of δOR-based clinical candidates. Thus, there both remains a need and a large opportunity for discovering and developing novel δOR agonists as potential therapeutic agents. Here, we present a summary of current structural data that has been generated for the δOR; we first will provide an overview of available resolved structures and insights gained from mutagenesis studies and MD simulations. We will then discuss the limitations of the current structural knowledge. We conclude this review by presenting exciting opportunities for computer-aided drug design at the δOR.

#### **2. Current Structural Insight in** δ**OR Binding Pocket and Activation Mechanism**

Early mutagenesis studies following the cloning of the δOR [45,46] hypothesized the involvement of certain amino acids in ligand recognition, selectivity, and overall receptor activation. One of the earliest mutagenesis studies investigated the role of Asp2.50 (Ballesteros-Weinstein numbering, [47]) in regulating ligand binding at the δOR where they demonstrated that Asp2.50Asn diminishes binding of peptide agonists with minimal

effects on alkaloid agonists and antagonist [48]. Additionally, the authors hypothesized that Asp2.50 is in close proximity to the Na<sup>+</sup> binding site [48]. This was followed by another study, where Asp3.32 was investigated due to its conservation across many GPCRs that are activated via cationic neurotransmitters (protonated amines) [49]. The removal or replacement of that residue affects the binding of several δOR modulators. Unlike the alanine scanning mutations, the replacement of Asp3.32 with Asn3.32 resulted in modifications to the receptor's pharmacology and affected the binding potency of alkaloid and peptide-agonists. The authors hypothesized that the main reason for such dramatic change may be attributed to the increase in the size of the Asn3.32 side chain compared to the WT residue. Asp3.32 acts as a proton donor only whereas Asn3.32 can act as a hydrogen donor or acceptor. Additionally, the authors noted that the "Na<sup>+</sup> -induced low-affinity state" lowered the affinity of δOR peptide agonists such as DTLET and DADLE but did not affect the binding affinity of SNC-80 or BW373U86 (SNC-86) in agreement with the work by Kong et al. [48]. Interestingly, when the authors tested the δOR agonists DTLET, DADLE, and SNC-86 in the presence of sodium chloride, the binding affinity was reduced dramatically indicating a role for Asp2.50 in receptor activation to counteract the negative allosteric effects of the Na<sup>+</sup> .

The same group later used single-point mutagenesis to investigate the involvement of aromatic amino acid residues (Tyr, Trp, and Phe) in transmembrane helices III–VI in ligand recognition [50]. To identify which aromatic residues to target, they used computational modeling to construct a 3D homology model for the δOR based on the human rhodopsin and hamster β2-adrenergic receptors. They showed that mutations Tyr mutations (Tyr3.33, Tyr7.42) had the most impact on the binding of deltorphin II [50]. They concluded that each ligand-receptor complex has unique binding and conformation where mutations do not have the same effect equally across various δOR ligands [50].

Another research team created a chimeric protein, DMDD, by replacing the area around the 1st extra-cellular loop 1 (ECL1) with the corresponding residues of µOR which significantly enhanced DAMGO binding to δOR [51]. A subsequent study replaced seven non-conserved residues in transmembrane domain (TM2) and TM3 with the corresponding residues in µOR and found that one residue only, Lys2.63 (replaced by Asn2.63) showed a high affinity for DAMGO. Replacement of Lys2.63 with nineteen different amino acid residues resulted in fourteen mutant receptors that could bind to DAMGO with comparable affinity to the DMDD chimera indicating a role for Lys2.63 to act as a recognition switch for δOR agonists [52]. A similar approach of using chimeric constructs for δOR in a different study demonstrated the importance of the ECL3 in the binding of selective peptide and small molecule agonists to the δOR. In the same study, the authors showed that three residues, Trp6.58, Val7.30, and Val7.31 are necessary for the binding of δOR agonists [53].

These mutagenesis studies have provided valuable insight into ligand recognition and receptor selectivity of the δOR some of which have been verified in recent structural studies (discussed below). However, these studies did not provide insight into the effect that the investigated mutations have on downstream signaling cascades at δOR. More importantly, there is a gap in knowledge with respect to the impact of most of these mutations on biased agonism.

Over the past decade, several moderate to high-resolution structures of the δOR have been produced and have confirmed older evaluations of the δOR binding pocket performed by mutagenesis and computational modeling (Figure 1, Table 1) [49,54,55]. In the first such structure (3.4Å, PDB:4EJ4), δOR was bound to the antagonist naltrindole [56]. The structure confirmed the important interaction between Asp3.32 with the protonated amine of naltrindole, which mimics the amine of Tyr<sup>1</sup> in endogenous OR peptides. As mentioned, this Asp3.32 was already known to be crucial for opioid affinity from mutagenesis studies [49]. In this structure, anchors are provided by hydrogen bonds (potentially including water molecules) with His6.52 and Tyr3.33 [56]. Additional amino acids surrounding the binding pocket were Met3.36, Trp6.48, Ile6.51, Val6.55, Trp6.58, Leu7.35, and Tyr7.43 [56]. A higher resolution structure (1.8Å, PDB:4N6H) of naltrindole-bound δOR was able to

‐

δ

δ

resolve an allosteric binding site for a sodium ion (Figure 2B) [57]. The sodium site consists of Asp2.50, Asn3.35, Asn7.45, and Asn7.49 and sits below the conserved 'message' site ['message' = part of the molecule that recognizes ORs, 'address' = part that renders the drug subtype selective] of the opioid receptor binding pocket [58]. In 2015, a third δOR structure (2.7Å, PDB:4RWD) was resolved, but this time bound to the δOR peptide antagonist Dmt<sup>1</sup> -Tic<sup>2</sup> -Phe<sup>3</sup> -Phe<sup>4</sup> (DIPP-NH2, Dmt = 2,6,-dimethyl-l-tyrosine) (Figure 2D) [59]. ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐

‐ δ

δ ‐ ‐ ‐ ‐ ‐ ‐ ‐

‐ ‐

‐

‐

 δ ‐ ‐ δ ‐ ‐ **Figure 1.** Resolved structures of the δOR in complex with small molecules and peptides. Schematic depiction of the small molecule agonist DPI-287 and antagonist naltrindole and the peptide agonist KGCHM07 and antagonist DIPP-NH2 bound to the δOR (Top panels; active-like structures in yellow and inactive structures in sea green). The difference in TM domain positions between the antagonistand agonist-bound structures (Lower panels; antagonist-bound in grey and agonist bound in hot pink). TM domain positions produced using the structure comparison tool from GPCRdb.

**Table 1.** Overview of resolved x-ray crystal structures of the δOR. Table produced using the GPCRdb.


\* 4RWD structure was obtained using the XFEL method.

‐ ‐ ‐

‐ δ ‐ δ ‐ δ ‐ δ ‐ δ ‐ ‐ **Figure 2.** Receptor-ligand interactions at δOR deduced by X-ray crystallography. (**A**) δOR-DPI287 (PDB: 6PT3) (**B**) δOR-NTI (PDB: 4N6H (**C**) δOR-KGCHM07 (PDB: 6PT2) (**D**) δOR-DIPP-NH2 (PDB: 4RWD). Figures made in ChimeraX 1.1.

‐ ‐ ‐ ‐ δ ‐ μ The DIPP-NH2 binding pocket utilized the same 'message' binding pocket residues as naltrindole with slight movements of Val6.55 and Trp6.58. As expected, the Dmt<sup>1</sup> residue interacted with Tyr3.33, Ile6.51, and Val6.55 with the N-terminal amine forming a salt bridge with Asp3.32 [57]. The Tic<sup>2</sup> side group resided in a hydrophobic pocket made up of Ile6.51, Val6.55, Trp6.58, Leu7.35, and Ile7.39. Importantly, the larger surface of DIPP-NH2 extended further into the 'address' portion of the δOR binding pocket, with Phe<sup>3</sup> interacting with Leu3.29, Asp3.32, and Tyr3.33. Phe<sup>4</sup> interacted with a Met and Leu in ECL2. In the µOR, the corresponding amino acids are charged/polar, making this a possible region important for selectivity [59,60] (Table 2).

‐ ‐ δ ‐ δ ‐ ‐ ‐ δ ‐ ‐ ‐ In 2019, two agonist-bound X-ray crystal structures produced novel insight into δOR in the active-like state: the δOR bound to the peptide KGCHM07 (PDB:6PT2) and bound to the SNC80-like small molecule DPI-287 (PDB:6PT3) [61] (Figure 2A,C). Relative to the naltrindole-bound structure, the agonist structure shows the movement of TM6 (Figures 1 and 2), particularly, Phe6.44, Cys6.47, and Trp6.58, all of which had been previously linked to δOR activation [55,61]. Arg291 in ECL3 changes location in the KGCHM07 structure and forms a lid on the binding pocket and is part of a hydrophobic pocket that also includes, Ile6.51, Phe6.54, Val6.55, Trp6.58, and Leu7.35, which fits the benzyl moiety of KGCHM07 (Figure 2). Three water molecules interacted with Dmt<sup>1</sup> of KGCHM07 through Tyr3.33, Lys5.39 and His6.52. Water molecules also aid in forming a water-mediated salt-bridge between D-Arg<sup>2</sup> and Asp5.35. Slight differences were observed between the peptide and small molecule structures particularly in relation to the polar network, involving Thr2.56, Glu2.60, and Tyr7.43 the latter being part of a hydrophobic pocket that fits Phe<sup>3</sup> in the peptide-bound structure. Tyr7.43 stabilizes the primary amine of KGCHM07 but does not

interact with DPI-287. On the other hand, Thr2.56 stabilizes the polar network for DPI-287 but not KGCHM07 [61]. Overall, the antagonist structures confirmed many predicted and experimentally established key amino acids within the δOR binding pocket. Paired with the addition of the agonist-bound structures this provides new avenues and opportunities for in silico drug discovery at the δOR.

**Table 2.** Receptor-ligand interactions of the δOR in complex with peptide and small-molecule agonists and antagonists. Table produced in part using the GPCRdb. [62,63]. This table does not reflect the full extent of receptor-ligand interactions, especially with regards to the involvement of the amino acid residues forming hydrophobic sub-pockets of the orthosteric site that are necessary for ligand binding. Additional amino acid residues such as Asp2.50, Asn3.35, and Ser3.39 which form the sodium binding site are also not included in this table.


#### **3. Limitations of Current** δ**OR Structures**

All the current δOR structures have been resolved using X-ray crystallography (Table 1). The nature of X-ray crystallography relies heavily on producing a receptor that is stable and does not show a lot of movement. On the other hand, cryo-electron microscopy (cryo-EM) is more forgiving in this regard, and thereby provides more opportunities to generate a structure of a wild-type/non-thermostabilized receptor to overcome the current hurdle that the available δOR agonist structures are mutated. Importantly, the Sexton group at Monash University has made significant improvements in the workflow for generating cryo-EM structures, such that structures with resolutions below 3Å can now be routinely resolved [64].

Another limitation for in silico drug discovery at the δOR is that none of the δOR structures were co-crystallized with an effector protein. The ability of cryo-EM to determine the structures of large complexes of macromolecules gained attraction over the past decade following advancements in electron detectors and data software used to reconstruct the 3D structures from the 2D images [64]. For GPCRs in complex with downstream effector proteins, such as G-protein, β-arrestin, or GRKs, cryo-EM is increasingly becoming the method

of choice for structure determination. This is in part because protein structures obtained using cryo-EM overcome some of the limitations such as thermostabilizing mutations and fusion proteins which are commonly introduced in X-ray crystallography structures [65]. The increasing number of cryo-EM structures that are being obtained in complex with downstream effector proteins can provide valuable insight into the molecular basis of GPCR signaling, potentially biased signaling, which then informs the structure-based drug discovery process.

Thus far, three cryo-EM structures of the µOR and one structure for the κ-opioid receptor (κOR) have been resolved [66–68]. Nonetheless, with respect to the δOR, the absence of X-ray crystal structures or cryo-EM structure of the δOR in complex with downstream effector proteins represents a challenge for structure-based drug discovery. Overcoming this hurdle requires careful and extensive molecular modeling that integrates the available crystal structures of the δOR in their inactive- and active-like states (Table 1) with the structures of other opioid receptors that are in complex with Gi-proteins, β-arrestins, nanobodies. This is especially crucial when using the active-like crystal structures of δOR due to the presence of thermostabilizing mutations. This method of computational structure determination was applied at the κOR where molecular dynamics (MD) simulations and an enhanced sampling method called meta-dynamics simulations were used to determine the structure of the κOR in complex with the Gi-protein. To obtain the optimized active structure, the authors started with optimizing the nb39 stabilized crystal structure of the κOR in complex with the agonist MP1104 (PDB: 6B73) then used the µOR-DAMGO-nucleotide free Gi-protein cryo-EM structure to couple the Gi-protein to the κOR [66,67,69]. Then, they used meta-dynamics simulations to optimize the κOR-Gi complex interactions before examining its stability using MD simulations. Their approach was applicable to the crystal structure of the µOR (5C1M) which they used to construct µOR-BU72-Gi and confirm the structural determinants of G-protein coupling.

To obtain the active-like δOR x-ray crystal structures bound to a peptide agonist and to a small molecule agonist, the δOR was thermostabilized by nine point mutations that negatively impacted the native pharmacology [61]. This was unsurprising, as the impact of mutations on receptor function can compound with increasing numbers [55]. Particularly, the mutations impacted the allosteric sodium binding pocket that has been implicated in β-arrestin signaling [57,61]. Thus, the crystallized conformation may not be optimal to identify signal-biased agonists reducing the utility of the current agonist-bound δOR structures.

#### **4. Opportunities for Computer-Aided Drug Discovery at the** δ**OR**

Over the last ten years, an increasing number of opioid receptors structures have been elucidated in unbound (apo-state), antagonist bound, or agonist bound states, either stabilized with thermostabilizing mutations, G<sup>i</sup> -protein, or nanobodies. These structures, even the antagonist-bound ones, have proven useful for performing docking studies on large virtual libraries. For example, a screen of 3M molecules on the inactive µOR (4DKL) led to the identification of a hit that was optimized in three steps to the novel G-biased agonist PZM21 [70]. Recent advances in docking have enabled the screening of libraries of nearly two magnitudes larger in size; thus far, this approach has been successfully employed to screen 138 million compounds using an antagonist-bound dopamine D<sup>4</sup> receptor [71], and 150 million compounds at a thermostabilized agonist-bound melatonin MT1 receptor [72]. These screens relied on the ZINC database [73] to provide decoys and screening molecules. The ZINC database is an ever-increasing repository for accessible molecules, currently holding about one billion compounds [74,75]. The increase in the size of the ZINC database is largely supported by the increase in catalog size of commercially available compounds from Enamine (https://enamine.net/news-events/press-releases/807-enamine-expandscollaboration-with-ucsf, accessed on 12 July 2022). Indeed, commercially available make-on demand chemical libraries such as Enamine's REAL Space collection which comprises 22.7 billion compounds as of the time of writing this review, make it feasible to identify

novel chemotypes that could induce novel pharmacology. It comes as no surprise that high throughput virtual screening campaigns are expected to further expand the utilized ligand chemical space which will result in the identification of an increasing number of novel hit compounds. This expansion in the utilized chemical space can be effectively leveraged for structure-based drug discovery using currently available, and future, structures of δOR to virtually dock and screen more compounds than ever before.

Efforts in virtual screening of chemical libraries have used structure-based drug discovery to expand the available chemical space for various GPCR targets [72,76]. This approach has yielded many novel chemotypes across several targets that could provide useful starting points for medicinal chemists to modify and improve selectivity among other properties. However, in screening campaigns that aim to identify biased agonists in-silico, this approach represents a potential bottleneck to the discovery process and might not be sufficient to identify functionally selective ligands. Thus far, the method of choice in identifying biased agonists at the δOR and other GPCRs has relied on the functional characterization of known and novel binders using cell-based assays to establish pharmacological and SAR profiles for future hit identification and lead optimization campaigns. However, the recent advances in computational tools and enhanced sampling methods such as molecular dynamics simulations which enable the dynamic modeling of the δOR and other GPCRs should be leveraged to unravel the structural determinants of biased signaling. In other words, the structural and conformational changes induced by agonist binding at δOR that appear in an MD simulation could be correlated with pharmacological data and mutagenesis analyses. Such an approach could be used to generate structural models or snapshots of the δOR in different conformational states which would increase the ability of docking campaigns to identify novel and biased agonists (Figure 3).

Until recently, GPCRs' structures in general, including those of the opioid receptors, still do not appear to provide a clear picture of the underlying signaling mechanisms given the complexity of the involved signaling network, the limited number of available structures bound to G-proteins, β-arrestins, or GRKs, and the diversity of the chemical space interacting with these receptors [77]. The tremendous success of cryo-EM structure determination and the increasing number of high-resolution structures when coupled with MD simulations could provide insights into GPCRs in action. Hence, developing and utilizing computational methods and workflows to model ensembles of structural conformations and then combining such methods with resolving GPCR structures in complex with G-proteins, β-arrestins, and GRKs should provide a strong approach to alleviate current limitations. This could also minimize the misinterpretations that stem from comparisons and analyses that are based on static GPCR structures that suffer from the limitations mentioned above.

A similar approach was applied recently at the µOR, where Wang et al. highlighted how the implementation of molecular dynamics could be helpful in lead optimization campaigns [68]. The authors implemented a structure-based lead optimization approach to generate PZM21 analogs with improved CNS penetration and higher G-protein bias with lower β-arrestin recruitment compared to fentanyl. In the study, the authors resolved a high-resolution cryo-EM structure of PZM21 bound to the µOR in a complex with the trimeric Gi-protein (PDB: 7SBF). The resolved cryo-EM structure showed that PZM21 forms a salt bridge between its basic amine and Asp3.32 of µOR which confirmed previous findings [67,78]. To confirm the stability of the PZM21′ s binding pose, and characterize the water-mediated interactions with µOR, the authors performed all-atom MD simulations which showed that PZM21 formed water-mediated interactions with His6.52 and Lys5.39. Intriguingly, the authors used insights from structural and dynamic analyses for lead optimization, guided by MD simulations and the cryo-EM structure of µOR in complex with Gi-protein. The resulting PZM21 analogs were µOR-selective with improved functional selectivity. Translating such an approach at the δOR will allow for more successful hit-to-lead and lead-optimization campaigns. Recently resolved cryo-EM structures of mitragynine pseudoindoxyl, which has very low arrestin recruitment, and lofentanil, a potent arrestin

recruiter demonstrate the involvement of distinct orthosteric sub-pockets in determining arrestin recruitment at the µOR. The authors demonstrated that each agonist has distinct moieties that bind in two distinct sub-pockets while sharing a central binding pocket with DAMGO [79]. Such findings support previous predictions and lead optimization strategies that have been applied to modulate biased agonism at the µOR and κOR [80,81].

Another promising area that has been on the rise recently is using machine learning and artificial intelligence in protein structure prediction and drug discovery. These advances present an exciting avenue for the discovery of novel and potential therapeutic agents at the δOR. Undeniably, the integration of machine learning and deep learning with current computational and pharmacological approaches presents a valuable opportunity to accelerate drug discovery campaigns at δOR. Machine learning models could be trained using high-quality datasets to predict drug properties, toxicity, target selectivity, and potentially ligand-receptor interactions. This has been made possible in part due to the GPCR community's efforts to provide access to curated datasets such as GPCRdb.org [62,82] and open-source machine learning packages such as DeepChem and AMPL, which allows researchers to build, train and deploy machine learning models for drug discovery [83]. Additionally, the rapid increase in high-performance cloud computing, improved Graphics Processing Units (i.e., GPUs), and increasingly efficient machine learning algorithms provide an opportunity to expand the screening of ultra-large chemical libraries or the de-novo drug design in a more efficient manner.

The significant improvement in structure prediction provided by AlphaFold 2 [84], may provide avenues for obtaining a wild-type thermostable δOR structure, that could be used for docking studies. There are significant limitations, in particular the current lack of Alphafold to predict how a protein will change conformation upon binding of a specific ligand [85]. Future machine learning algorithms may learn how to do this and further reduce the initial requirement of wet-lab science to obtain potent and selective novel molecules for a receptor target including the δOR.

MD simulations are another area that is expected to benefit from recent advances in machine learning and artificial intelligence. In fact, work is already underway to develop machine learning force fields that could increase the accuracy of MD simulations while reducing the computational cost [86]. Deep learning frameworks such as TorchMD [87] and neural networks such as graph convolutional neural networks (GCNNs) have been used for geometry optimization [88], acceleration of MD simulations, or even in improving force fields [89]. Such advances should be carefully utilized and expanded to complement the available structural and experimental data and accelerate the identification of therapeutic agents at the δOR.

To accelerate future large-scale drug discovery efforts at the δOR, well-trained and validated machine learning models should be combined with physics-based scoring functions to reduce the computational cost of docking and screening ultra-large chemical libraries. In this instance, a machine learning model could serve as a filter that could be incorporated into a docking workflow to prioritize which molecules move on to the next phase and ultimately which molecules are to be tested pharmacologically. Moreover, this approach allows the incorporation of structural and pharmacological parameters to build, train, and deploy multi-task learning models that could increase the selectivity and novelty of the identified compounds.

Despite the optimistic outlook and promising advances in computer-aided drug design, it is important to know the limits of the utilized tools. Furthermore, it is worth noting that there are numerous challenges and potential pitfalls associated with the incorporation of machine learning or deep learning models, especially with respect to the availability of large and high-quality datasets for the drug target [90,91]. For the δOR and other GPCRs, the diversity of cell-based assays that are used to characterize their pharmacology and the inconsistent practices between different research labs, or in some instances within the same lab, are two major limitations that need to be addressed before we can reliably use machine learning in GPCR drug discovery. Hence, future efforts should also focus

on the standardization of experimental data collection and computational data curation and modeling.

‐ ‐ δ ‐ β‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ ‐ **Figure 3.** Proposed workflow for screening large chemical libraries to identify G-protein biased agonists at the δOR and other GPCRs. A similar workflow could be applied to identify GRK- or βarrestin-biased small molecules given that high-quality crystal or cryo-EM structures are available. In cases where distinct interactions or sub-pockets specific to biased agonists or in cases where we know that an allosteric site/pocket could lead to biased effects, we could restrict ligand docking to that specific site to screen a given chemical library. The most accurate way to confirm such interactions would be to resolve high-quality structures and/or perform mutagenesis studies. Alternatively, enhanced sampling computational modeling to model receptor-effector complexes could be useful if computational cost is not a limiting factor.

μ

‐

#### **5. Conclusions**

For the identification of novel, functionally selective, and potentially therapeutic agonists at δOR, future efforts should aim to expand our understanding of the effect of various mutations on the structure and function of δOR utilizing the constantly improving *in-vitro* and *in-silico* approach. Additionally, producing multiple wild-type agonist-bound high-resolution structures of δOR will allow for a more efficient expansion of its chemical space in virtual screening campaigns or in computer-aided lead optimization. Consequently, these efforts will provide high-quality datasets that could allow for the incorporation of ML and DL tools in opioid drug discovery. Overall, we think there are more opportunities than that there are challenges to carry out a high yield in silico screen at the δOR to generate novel chemical matter that hopefully can be translated into meaningful therapeutics.

**Author Contributions:** Conceptualization, Y.J.M. and R.M.v.R.; writing—original draft preparation, Y.J.M. and R.M.v.R.; writing—review and editing, Y.J.M. and R.M.v.R.; visualization, Y.J.M.; funding acquisition, R.M.v.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Institute on Alcohol Abuse and Alcoholism, grant number R01AA025368 (R.M.v.R.).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** van Rijn is currently employed as a Principal Scientist at Septerna Inc. and holds stock options in the company. van Rijn holds a US patent (10,954,224) describing novel delta opioid receptor agonists. YM has no conflict of interest to declare. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**

