1. Introduction
Human immunodeficiency virus (HIV) is still a major global public health issue. In 2014, 36.9 million people were living with this virus. Every year, 2 million people become infected with HIV and 1.2 million people die from HIV-related causes globally [
1].
HIV infects helper T cells, macrophages, and dendritic cells, the major cells of the human immune system [
2]. When CD4
+ T cell numbers declines below a critical level, cell-mediated immunity is lost, and the body becomes progressively more susceptible to opportunistic infections. The late symptoms of the infection are referred to as acquired immunodeficiency syndrome (AIDS) [
3].
The global epidemic has prompted virologists to understand the pathogenic processes caused by the virus, determine its structure, as well as search for preventive vaccines and antiviral drugs. Knowledge of the viral enzymes responsible for the replication of the virus (reverse transcriptase, protease, integrase) allows for synthesis and fast introduction of drugs, which are inhibitors of these active proteins, into therapy.
Reverse transcriptase inhibitors prevent the HIV enzyme reverse transcriptase (RT) from converting single-stranded HIV RNA into double-stranded DNA. There are two types of RT inhibitors. Nucleoside/nucleotide RT inhibitors (NRTIs) are faulty DNA building blocks that stop HIV DNA synthesis. Non-nucleoside RT inhibitors (NNRTIs) bind to RT, interfering with its ability to convert HIV RNA into HIV DNA.
Nucleoside reverse transcriptase inhibitors are derivatives of 2′,3′-dideoxynucleosides. They are active only in the triphosphate form [
4]. Human kinase-mediated phosphorylation occurs at the 5′ position in three stages [
5]. NRTIs are substrates for the enzyme and are incorporated into the DNA chain. Due to the lack of a hydroxyl group at the 3′ position, formation of a phosphodiester bond with another nucleoside is impossible, and further elongation of the nucleic acid cannot occur, thereby inhibiting viral replication. Lamivudine and emtricitabine are structurally remote from 2′-deoxynucleosides. Instead of deoxyribose, they contain oxathiolane, which also prevents further elongation of the DNA.
Fullerene is an allotrope of carbon. The fullerene molecules are composed of even number of carbon atoms that form a spherical polyhedron. C
20 is the smallest possible fullerene. It is composed of 12 pentagons and has a structure of a regular dodecahedron with a diameter of about 3.1 Å. The C
20 molecule has 30 C-C bonds of about 1.46 Å. Next to C
60, C
20 has the most regular geometry of all fullerenes [
6].
The latest research shows that fullerene derivatives could be novel, great antiviral compounds to treat the human immunodeficiency virus infection [
7,
8]. HIV protease and reverse transcriptase enzymes are the two targets for anti-HIV fullerene derivatives. Molecular modeling studies have demonstrated that C
60 derivatives could act as HIV RT inhibitors [
7,
9,
10]. In 2005, Mashino et al. [
11] synthesized various malonic acid- and amino acid-type C
60 derivatives and investigated the inhibitory activity of those compounds towards HIV RT. Machino continued his fullerene research, and in 2015, showed that pyridine/pyrimidine-type fullerene derivatives could inhibit the enzyme without cytotoxic effects [
12].
The fullerene derivatives should therefore be considered new chemical entities with new biological activity as a result of overall structure rather than composition of two subcomponents. Here we performed in silico molecular modeling studies on compounds formed with C20 fullerene and lamivudine, zidovudine, and emtricitabine.
Studied ligands were designed on the grounds of nucleoside reverse transcriptase inhibitors’ structures. The key difference in our studies is the replacement of dideoxyribose ring or oxathiolane with a C
20 fullerene. Three leading structures are proposed. They are built by pyrimidine or purine base and hydroxymethyl group attached in various positions to one side of C
20. This is illustrated in
Figure 1. The first type of substitution is based on the structure of most NRTIs, such as zidovudine. The second type is based on the structures of lamivudine and emtricitabine. Additionally, there is proposed also a third type of connection. Three bases were used: adenine, thymine, and cytosine. Additionally, non-substituted carbon atoms of the substituted fullerene rings were replaced with atoms of nitrogen, sulfur or boron. We also propose compounds with a methylene group connecting fullerene with the nitrogenous base. All the combinations used give 180 compounds in total. Replacement of one carbon atom in C
20 with a nitrogen atom causes a loss of a double bond and addition of one hydrogen atom. In order to increase the comparability of the results, sulfur and boron derivatives as well as derivatives without heteroatoms have one double bond saturated, adding two hydrogen atoms. As a result, the fullerene core in every designed compound has exactly 8 double bonds.
In order to evaluate the binding affinities of designed compounds towards RT active site, an in silico study was performed. The fullerene derivatives were docked to multiple reverse transcriptase models. Selected molecules had their binding free energies calculated. Then, the best protein-ligand complexes were analyzed, giving valuable insight into interactions between studied compounds and RT active site.
2. Results and Discussion
One hundred eighty designed compounds were docked to four RT models. Docking protocol used in this study includes ligand flexibility, yet the protein remains rigid. In order to imitate flexibility of amino acids of the active site, rigid docking was performed for different RT conformations. This procedure minimized the occurrence of false negative results. It is worth mentioning that among these models are two generated by us. During the validation of the docking protocol, four new RT models were prepared. For best of them (Model 3) the Pearson correlation coefficient between Ki values and scoring function values of test compounds proved to be superior than for the crystallographic structures. In
Tables S13–S16 all docked fullerene derivatives are presented along with potential of mean force (PMF) and PLP1 function values.
Figure 2 and
Figure 3 show how studied compounds are placed in RT.
Ligands that achieved best scores had their orientations in the binding pocket evaluated and then those with orientations similar to orientations of ligands in the crystallographic structures were chosen. Orientations of NRTI and well-placed ligand are shown in
Figure 4 and
Figure 5. The necessity of the above action results from the mechanism of action of NRTIs. They are built into the DNA strand, which requires not only strong affinity towards binding sites, but also adequate orientation allowing formation of a phosphodiester bond between ligand molecule and DNA strand nucleotide. After considering both factors, these are the compounds most promising to be active towards RT (The ordinal numbering from
Tables S13–S16).
3KK2: 6, 78, 2, 25, 102, 106, 174, 28, 7, 107, 83, 48
Model 2: 29, 47, 5, 83, 66, 78, 82, 30, 4, 94, 144, 100
Model 3: 6, 78, 102, 84, 60, 114, 72, 107, 76, 96, 105, 83
3V4I: 18, 3, 27, 25, 132, 40, 39.
PMF scoring function values for 3KK2, Model 2 and Model 3 indicate that the best designed ligands have a higher affinity towards RT than test compounds, including zidovudine and stavudine triphosphates. On the other hand, the PLP1 function values for 3V4I model show that best designed compounds have similar affinity towards RT compared to the most active test ligands. Among the above-mentioned compounds a few derivative groups occur most frequently. It allows for the selection of favorable structural features. These are mainly the derivatives with the hydroxymethyl groups and nitrogenous bases attached to the fullerene core in positions analogical to NRTI structures. Compounds designed that are not based on existing NRTI structures, proved to have weaker affinity towards the RT active site. Presence of the heteroatom in the fullerene core turned out to be another important feature. Depending on the substitution position of the two aforementioned groups to the fullerene, different heteroatoms and different places of their insertion are preferred. In many cases the addition of the methylene group between the fullerene core and the nitrogenous base proved to be beneficial. Groups of the best ligands are shown in
Figure 6.
In order to further evaluate and confirm the docking results, binding free energies of above-mentioned compounds and test ligands were calculated. The RT model with the best correlation for Ki values and docking scores—Model 3 was used in this part of the research. The results are depicted in
Figure 7. Designed ligands with best docking scores proved to have on average more negative binding energy values when compared to test ligands. While the test ligands’ set contains also those with low binding affinities, binding energies of designed ligands were additionally compared with those of test ligands with Ki values below 100 nM. Once again, fullerene derivatives managed to receive on average superior results. Also, minimum binding energy for designed compounds turned out to be more negative than the test one. This means that fullerene derivatives should have a higher affinity towards HIV RT. However, the maximum value so far of the best designed compounds was less negative than some of the test ligands’ binding energies. Therefore, the set of potential RT inhibitors was further reduced to include compounds: 2, 3, 4, 5, 6, 18, 29, 47, 72, 76, 78, 82, 83, 84, 94, 96, 100, 102, 106, 114, and 144. Their structural formulas along with scoring function values and binding energies are shown in
Table 1.
Molecular docking allows also to analyze the interactions between ligands and amino acids in binding sites. Docking of the ligand base to four RT models allows the enhanced examination of these interactions, because they can differ with respective proteins conformations. In the case of RT active site, ligands interact also with DNA nucleotides. Most of the interactions are common for all four used RT conformations, therefore they were investigated together for all RT models. In the case of the occurrence of significant differences, they were specified and explained.
Nitrogenous bases in the studied compounds interact primarily with both strands of DNA nitrogenous bases. It is worth mentioning that scoring function values and orientations of the ligands were not strongly influenced by the type of the substituted nitrogenous base. Ligands were placed in the binding pocket well, regardless of whether the used nitrogenous base was complementary to the base in the template DNA strand or not. It could be the result of the presence of the fullerene core. It makes a significant difference in comparison to deoxyribose rings of the RT natural ligands. It certainly affects the geometry of the compound as well as its orientation in the active site, which may improve the nitrogenous base’s ability to obtain favorable position, regardless of its complementarity. For derivatives with a methylene group between the nitrogenous base and the fullerene core, this modification can also alter the orientation of the nitrogenous base. There are hydrophobic π-π stacking interactions between base rings of the ligands and nucleotide base rings of DNA primer strands. Hydrogen bonds occur between ligands’ bases and bases to be complementary. There are also π–π stacking interactions between base rings of ligands and nucleotide base rings of the template strand, that are situated near the nucleotides supposed to be complementary. Additionally, some ligands tend to create π–π T-shaped interactions. They are formed by π electrons of the base ring of the ligand and tyrosine 115 ring, that are oriented perpendicularly to each other. In some cases, hydrogen bonds between glutamine 151 hydrogen atom and π electrons of the nitrogenous bases rings can be observed. The last occurring interaction type is an electrostatic cation–π interaction, formed by electrons of nitrogenous bases and protonated imine group =NH
2+ of arginine 72. The interactions of the nitrogenous bases are illustrated in
Figure 8.
The key difference between the designed compounds and the currently used NRTIs is the replacement of the dideoxyribose ring with a fullerene. For this part of the compound, the biggest differences in interaction with amino acids at binding site occur when compared to nucleosides or NRTIs. Interaction of hydrophobic π electrons of fullerene cores with alanine 114 methyl group is most frequently observed. It occurs in all ligand-protein complexes, for all RT models. The interaction between fullerene core and tyrosine 115 ring has to be mentioned as well, as it belongs to the most significant ones. It is of hydrophobic nature π–π stacking group and occurs for all of the prepared RT models. However, in the case of Model 3 it is observed rarely, only for a limited portion of the studied ligands. This suggests the dependence of the occurrence of the aforementioned interaction on amino acid side chains’ conformations in the β3-β4 flexible loop in p66 fingers subdomain. Also, for some ligands hydrogen bonds occur between the fullerene part and a hydrogen atom in the peptide bond between alanine 114 and tyrosine 115. The above-mentioned interactions of the fullerene core are only the main ones. In the case of certain orientations of specific ligands interactions also occur with other amino acids in the binding pocket. Relatively often electrostatic cation-π interactions with lysine 65 amine group can be observed. They have partially hydrogen bond character. There are also cation-π interactions with =NH
2+ group and hydrogen bonds with amine group of arginine 72. Additionally, hydrophobic interactions with two aforementioned amino acid side chains occur. Another type of fullerene core’s interaction is that of its π electrons with free electron pairs of glutamine 151 oxygen atom as well as with oxygen atom in the peptide bond between valine 111 and glycine 112. The other observed interaction is the one between the fullerene core and aspartic acid 185 which belongs to the anion-π type. Interactions of the fullerene core are depicted in
Figure 9.
The fullerene core’s numerous interactions with the active sites amino acids are probably the main cause of positive results in form of scoring function values. Natural RT ligands and NRTIs form usually only two to three interactions with their deoxyribose or dideoxyribose ring and function groups attached in its position 3. They are shown in
Figure 10. These are mainly hydrophobic interactions of the carbohydrate ring with tyrosine 115 ring as well as hydrogen bonds and electrostatic interactions of the hydroxyl or azide groups. It is definitely less than in the fullerene part of the designed compounds.
The third important group of interactions in the studied compounds that occurs in ligand–protein–DNA complexes includes the triphosphate group interactions. This part of the molecule was not modified. Its interactions do not differ from those commonly observed in NRTIs and RT natural ligands. They depend mostly on orientation of the triphosphate part in the active site. These are primarily hydrogen bonds and electrostatic interactions. All of the oxygen atoms in the triphosphate group can take part in the formation of these interactions. The hydrogen bonds’ donor atoms are located in the peptide bonds between glycine 112 and aspartic acid 113, lysine 65 and lysine 66 as well as lysine 66 and aspartic acid 67. Additionally, bonds of this type can be formed due to the hydrogen atoms present in glycine 112, in the side chains of lysine 219 and lysine 66, and in arginine 72 amine group. Hydrogen bonds also are present, formed by the electrostatic interactions between oxygen atoms of the triphosphate group and lysine 65 and lysine 219 amine group and arginine 72 =NH
2+ group. It is worth mentioning that the interactions of the triphosphate group depend not only on its orientation in the active site, but also on the conformations of the amino acid side chains in the binding pocket. For example, in the case of Model 3, in which Lys65 and Arg72 conformations were altered, there are no interactions of the triphosphate group with arginine 72. This is caused by translation of this amino acid side chain in the direction of the DNA template strand and increasing the distance from amino acids important for triphosphates binding, such as Lys219. This change prevents the interaction of Arg72 with part of the ligand, regardless of its orientation in the binding pocket. In the Model 3V4I lysine 66 and aspartic acid 67 are placed far from the binding site and do not contribute to the creation of the ligand-protein complex at all. Interactions of the triphosphate group are depicted in
Figure 11.
An important factor for achieving by the designed compounds the correct orientation in the RT binding site is the methylene group connecting the fullerene part with the triphosphate group. The hydrogen atom forms a hydrogen bond with the oxygen atom in the deoxyribose of DNA primer strand to which the ligand is supposed to be attached. This is illustrated in
Figure 12. Hydrogen atoms of the methylene group sometimes form also the hydrogen bond with oxygen atom of the carboxyl group of aspartic acid 185.
4. Conclusions
We designed 180 new derivatives of nucleoside reverse transcriptase inhibitors with a C
20 fullerene. Computational chemistry methods used in this study allowed us to evaluate the possible affinity of designed compounds towards HIV-1 RT binding site and select the most active ones (
Table S1: 4, 6, 29, 47, 72, 76, 82, 84, 96, 10, 106, 114). One hundred eighty designed compounds were docked to four chosen RT models. The best of the designed compounds had potentially superior or similar affinity to RT active site in comparison to most active test compounds, including drugs used in anti-HIV therapy. We plan to synthesize selected C
20 derivatives and examine them in vitro.
This is the first study that indicates potential affinity of fullerene derivatives towards HIV RT active site. While the methods used in this work allow only for preliminary conclusions, it may be an important groundwork for both more accurate studies of the studied compounds and other related research in that area. This is also a pioneering attempt to show a potential application of C20 as an active pharmaceutical ingredient. Presented results can serve as starting points for further studies on other chemical groups of compounds containing fullerenes. We intend to extend our research and study other fullerene derivatives, paying special attention to their potential affinity towards HIV-1 RT.