*Article* **In Silico Antiprotozoal Evaluation of 1,4-Naphthoquinone Derivatives against Chagas and Leishmaniasis Diseases Using QSAR, Molecular Docking, and ADME Approaches**

**Lina S. Prieto Cárdenas 1 , Karen A. Arias Soler 1 , Diana L. Nossa González 1 , Wilson E. Rozo Núñez 1 , Agobardo Cárdenas-Chaparro 1 , Pablo R. Duchowicz <sup>2</sup> and Jovanny A. Gómez Castaño 1, \***


**Abstract:** Chagas and leishmaniasis are two neglected diseases considered as public health problems worldwide, for which there is no effective, low-cost, and low-toxicity treatment for the host. Naphthoquinones are ligands with redox properties involved in oxidative biological processes with a wide variety of activities, including antiparasitic. In this work, in silico methods of quantitative structure–activity relationship (QSAR), molecular docking, and calculation of ADME (absorption, distribution, metabolism, and excretion) properties were used to evaluate naphthoquinone derivatives with unknown antiprotozoal activity. QSAR models were developed for predicting antiparasitic activity against *Trypanosoma cruzi*, *Leishmania amazonensis*, and *Leishmania infatum*, as well as the QSAR model for toxicity activity. Most of the evaluated ligands presented high antiparasitic activity. According to the docking results, the family of triazole derivatives presented the best affinity with the different macromolecular targets. The ADME results showed that most of the evaluated compounds present adequate conditions to be administered orally. Naphthoquinone derivatives show good biological activity results, depending on the substituents attached to the quinone ring, and perhaps the potential to be converted into drugs or starting molecules.

**Keywords:** chagas; leishmaniasis; naphthoquinones; antiprotozoal evaluation; QSAR; molecular docking; ADME

#### **1. Introduction**

Chagas and leishmaniasis are two parasitic infectious diseases endemic to Latin America, considered by the World Health Organization (WHO) as neglected tropical diseases, for which there is currently no effective, safe, and economical chemotherapy treatment. For chagas, the WHO estimates that between 7 and 8 million people are infected, with 12,000 deaths and 70 million people at risk of contracting it per year [1,2], while for leishmaniasis, there is an estimated figure of 12 million infected people, 1.6 million new cases each year, and 350 million people at risk of acquiring it [3–5]. In response to this problem, a growing number of investigations, mainly in Latin America, are being carried out to find new powerful anti-chagas and anti-leishmaniasis agents with low toxicity.

Chagas disease is a zoonosis caused by the protozoan *Trypanosoma cruzi*, which affects around 150 species of mammals, including humans, and is mainly vector-borne by hematophagous insects of the *Tritominidae* subfamily (popularly known as kissing bugs, bed bugs, and whistles, among other names). Since 1970, the treatment against this pathology

**Citation:** Prieto Cárdenas, L.S.; Arias Soler, K.A.; Nossa González, D.L.; Rozo Núñez, W.E.; Cárdenas-Chaparro, A.; Duchowicz, P.R.; Gómez Castaño, J.A. In Silico Antiprotozoal Evaluation of 1,4-Naphthoquinone Derivatives against Chagas and Leishmaniasis Diseases Using QSAR, Molecular Docking, and ADME Approaches. *Pharmaceuticals* **2022**, *15*, 687. https://doi.org/10.3390/ ph15060687

Academic Editor: Osvaldo Andrade Santos-Filho

Received: 30 April 2022 Accepted: 27 May 2022 Published: 31 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

has been based mainly on the antiparasitic drugs Nifurtimox (NFX) and Benznidazole (BNZ). These medicines have an effectiveness of 70% during the acute phase, where the prevailing parasitic form is blood trypomastigotes, and decreases to 20% in the chronic phase, where the predominant form corresponds to the intracellular phase of amastigotes [6,7]. It has been proven that NFX and BNZ have a high toxicity for mammalian cells, with serious side effects, such as peripheral neuropathies, anorexia, nausea, and vomiting, as well as neurological reactions such as anxiety and disorientation, among others [8]. On the other hand, leishmaniasis encompasses a group of infectious diseases caused by at least 20 species of the *Leishmania* genus, including *L. braziliensis*, *L. amazonensis*, *L. infantum*, *L. guyanensis*, *L. panamensis*, and *L. mexicana*, which are the species that affect humans. Its transmission is mainly through vectors via insect bites of the *Psychodidae* family. This disease presents three different clinical forms in humans, canines, and several wild vertebrates: cutaneous, mucocutaneous, and visceral [3]. Current treatment against leishmaniasis is based on the use of pentavalent antimonials, such as Miltefosine, Amphotericin B (AmB), Paromomycin, and Pentamidine; however, there are restrictions on the use of these drugs due to their side effects, which include hepatic, cardiac, and renal toxicity [5]. Additionally, they present limitations, such as high production cost and increased resistance by the parasite [9].

Naphthoquinones stand out among the most studied natural compounds and synthetic derivatives for their anti-chagas and anti-leishmaniasis (mainly against *L. infantum* and *L. amazonensis*) activity. β-lapachone derivatives have exhibited potential anticancer, antiviral, antiparasitic (including anti-chagas and anti-leishmanial activity), antimicrobial, anti-inflammatory, anti-obesity, antioxidant, and neuroprotective activity, while showing low levels of toxicity to normal cells [10]. Naphthoimidazoles derived from β-lapachone have been prepared, and their trypanocidal activity has been evaluated using electron microscopy, flow cytometry, and biochemical techniques, indicating that some compounds lead to an oxidative imbalance, which generates the production of ROS and the death of the parasite [10–13]. De Silva et al. carried out molecular docking studies on two cysteine proteases: cruzin and rhodesain, which are fundamental in the metabolism of the *T. cruzi* parasite [14]. This allowed for the identification of 14 naphthoquinone derivatives with potential antiprotozoal activity, which were synthesized and tested in vitro. On the other hand, 2,3-diphenyl-1,4-naphthoquinone (DPNQ) is considered a potential chemotherapeutic agent against *T. cruzi* due to its high trypanocidal activity in phenotypic screening and experimental murine infection by *T. cruzi*. Treatment with DPNQ in infected female mice promoted a halving of the parasite load, and ensured a 60% survival rate of the animal [15]. In 2021, Becerra et al. used pharmacophoric models based on different studies to design and subsequently synthesize nine phenolic derivatives, which were tested against *T. cruzi* strains, most of which showed good activity compared to BNZ, and the best prospects showed low toxicity [16]. Plumbagin is a plant-derived naphthoquinone metabolite (5-hydroxy-2-methyl-1,4-naphthaquinone) that inhibits trypanothione reductase, and has been validated as a drug which is responsible for promoting oxidative stress in *Leishmania* [9]. Similarly, naphtherocarpanquinone (LQB-118), which has been evaluated against cutaneous leishmaniasis, showed activity against intracellular amastigotes of *L. amazonensis*. It is proposed, through an anti-leishmanial model, that a good level of administration could counteract this clinical form of leishmaniasis [17]. The leishmanicidal activity of a series of substituted bis-2-hydroxy-1,4-naphthoquinones prepared from lawsone with aliphatic and aromatic aldehydes has also been evaluated [18]. Overall, four of the bis-lawsone analogs showed results similar to Pentamidine, but without cytotoxicity to host cells. Other studies demonstrated the high activity and low toxicity for host cells of derivatives of nor-α-lapachone and α-lapachone fused with triazole; despite them, however, resistance similar to that presented by the reference drug (Pentamidine) was observed [19]. Likewise, α-lapachone has shown a similar inhibition to sodium stibogluconate (Pentostam) [20]. Acetylisolapachol showed greater activity against *L. amazonensis* and does not exhibit a risk to the host [21].

Continuing with our search for novel structures with high antiparasitic activity and low toxicity [22], in this work, we have carried out an extensive in silico study of substituted derivatives of 1,4-naphthoquinone as potential anti-chagas and anti-leishmanial candidates. QSAR modeling of anti-*T. cruzi* in trypomastigotes, anti-*L. amazonensis*, *anti-L. infantum,* and toxicity has been carried out, based on the antiparasitic derivatives of naphthoquinone and other related structures reported in the literature. The best QSAR models were used to evaluate the predictive antiparasitic and toxicity activity of 68 1,4-naphthoquinone derivatives with hitherto unknown biological activity. This study was complemented by the evaluation of the 68 candidates against 5 macromolecular targets related to vital processes for the survival of these protozoa. In the case of *T. cruzi*, trypanothione reductase (*Tc*TR) and lanosterol α-demethylase (*Tc*LαD) proteins were selected. For the *Leishmania* genus in general, trypanothione reductase (*L*TR) protein was selected, while for the *L. amazonensis* and *L. infantum* strains, arginase (*La*A) and tyrosine aminotransferase (*Li*TA) were selected, respectively. Additionally, the ADME (absorption, distribution, metabolism, and excretion) properties of the 68 naphthoquinone derivatives were evaluated. The best anti-chagas and anti-leishmanial candidates derived from this in silico study are currently in the organic synthesis stage at our lab for their subsequent in vitro evaluation.

#### **2. Results and Discussion**

#### *2.1. QSAR Modeling*

In total, four QSAR models were performed based on in vitro reports of naphthoquinone derivatives and related structures with anti-*T. cruzi* (in trypomastigotes) [23–31], anti-*L. amazonensis* [18,32–39], and anti-*L. infantum* [40–45] activities, as well as toxicity [30,46–51] evaluation. For each QSAR study, the top seven models were generated, which are shown in Tables S1 and S2 in the Supplementary Materials. In turn, the best model (in bold) among the seven was selected, which was performed considering high homogeneity (fitting, R<sup>2</sup> ) between the calibration and validation groups with the lowest root mean square error (RMSE), thereby avoiding over-adjusted models. For selecting the best model, the Ockham's principle of parsimony was also considered, which states that if there are models with statistical parameters of equal quality, the simplest one should be selected [52].

#### 2.1.1. QSAR Model for Anti-chagas Activity

Equation (1) shows the best QSAR model for the anti-chagas activity predicted as LogIC50, which was developed based on 153 derivatives of 1,2-naphthoquinone and 1,4 naphthoquinone (Figure A1) evaluating in vitro blood trypomastigotes of strain Y of *T. cruzi*. A description of the molecular descriptors that define this model is listed in Table 1.

$$\text{LogIC}\_{50} = 4.5112 + 0.3637a\_1 + 1.1416a\_2 - 0.1003a\_3 + 0.9440a\_4 + 0.1929a\_5 - 1.8425 \times 10^{-4}a\_6 \tag{1}$$


**Table 1.** Molecular descriptors of the best anti-chagas QSAR model as defined by Equation (1).

<sup>ଷ</sup> <sup>ସ</sup> <sup>ହ</sup>

<sup>ସ</sup> <sup>ହ</sup> <sup>ଷ</sup>

<sup>ଷ</sup> <sup>ସ</sup> <sup>ହ</sup>

<sup>ସ</sup> <sup>ହ</sup> <sup>ଷ</sup>

ସ

ସ

<sup>ଷ</sup>

<sup>ଷ</sup>

ହ = 0.26 − 1.00<sup>ଵ</sup> + 0.24<sup>ଶ</sup> + 0.42<sup>ଷ</sup> − 0.05<sup>ସ</sup>

ହ = 0.26 − 1.00<sup>ଵ</sup> + 0.24<sup>ଶ</sup> + 0.42<sup>ଷ</sup> − 0.05<sup>ସ</sup>

ହ

ହ

<sup>ସ</sup> <sup>ହ</sup>

<sup>ସ</sup> <sup>ହ</sup>

<sup>ଵ</sup> <sup>ଶ</sup>

<sup>ଵ</sup> <sup>ଶ</sup>

The descriptors *a*<sup>1</sup> and *a*<sup>2</sup> are fingerprints calculated in Fragmentor [53], and refer to the presence of oxygen followed by an unsaturation; while the second was calculated in PaDeL [54], and belongs to the MACCS keys, a group of 166 free access fragments [55], which involves an oxygen–oxygen arrangement at a distance equal to three bonds and two intermediate atoms of any type. According to this QSAR model (Equation (1)), these molecular descriptors contribute negatively to the evaluated activity (i.e., they decrease the activity). The other descriptors of the model were calculated in the program QUBILS-MAS [56]. The descriptors *a*3, *a*4, and *a*<sup>5</sup> are calculated from the Kurtosis, which is a statistical invariant, while *a*<sup>6</sup> belongs to the Minkowski indicators. *a*<sup>3</sup> is a quadratic index, *a*<sup>4</sup> and *a*<sup>5</sup> are bilinear, and *a*<sup>6</sup> is linear. The descriptor *a*<sup>3</sup> is a global index that contributes to the biological activity. This is related to molar refractivity, which can sometimes be used to model London dispersion forces or attractive van der Waals interactions; these are factors related to the presence of strong interactions between a ligand and the active sites of a given macromolecular receptor. However, refractivity is the consequence of repulsive nonbonding interactions, and is highly dependent on the flexibility of the ligand [57]. The descriptor *a*<sup>4</sup> describes the electronegativity and polarizability of the fragment of carbon atoms in aliphatic chains, while *a*<sup>5</sup> is associated with the refractivity and charge of the subset of heteroatoms in the molecule [4]. Both descriptors contribute negatively to the antichagas activity. Considering the nature of descriptors *a*<sup>4</sup> and *a*5, it is hypothesized that the anti-chagas activity could be related to the ability of the ligands to dock and inhibit essential macromolecules in the metabolism of *T. cruzi*. For its part, the descriptor *a*<sup>6</sup> provides a positive contribution to the biological activity, and is a function of the mass of aromatic carbon atoms, which suggests that the presence of aromatic systems improves activity.

Table S3 in the Supplementary Materials shows the intercorrelation relationship between each pair of descriptors included in the anti-chagas QSAR model, while Table S4 lists the values of each descriptor for each molecule. Table S5 shows the experimental values together with those predicted by the anti-chagas activity model as log IC50.

#### 2.1.2. QSAR Model for Anti-*L. amazonensis* Activity

For the anti-*L. amazonensis* (cutaneous and mucocutaneous leishmaniasis) activity (expressed as LogIC50), the QSAR model with four descriptors was selected (Table S1), which is described according to Equation (2). This model was developed based on 60 ligands (Figure A2), with in vitro anti-leishmanial activity reported as IC50:

$$\text{LogIC}\_{50} = 0.26 - 1.00b\_1 + 0.24b\_2 + 0.42b\_3 - 0.05b\_4 \tag{2}$$

<sup>ଶ</sup> <sup>ଷ</sup> <sup>ସ</sup>

ସ

ହ = −1.51 + 6.83<sup>ଵ</sup> + 0.29<sup>ଶ</sup> − 3.88 × 10ି<sup>ଷ</sup> + 0.04<sup>ସ</sup> − 3.72<sup>ହ</sup> − 3.68 × 10ି

ଷ

ସ

where *b*1, *b*2, *b*3, and *b*<sup>4</sup> are described as shown in Table 2.

<sup>ଶ</sup> <sup>ଷ</sup> <sup>ସ</sup>

ଵ


**Table 2.** Molecular descriptors in the anti-*L. amazonensis* QSAR model, as described by Equation (2).

ଶ

( = 1 to 6)

ଵ ଶ ଷ

ସ

ହ  ଵ

The *b*<sup>1</sup> descriptor refers to the Klekota–Roth fingerprint KRFP2, and it comprises part of six of the seven anti-*L. amazonensis* QSAR models considered (Table S1). This descriptor provides a measure of chemical similarity related to a bond between two carbon atoms, where each carbon atom has, as its substituents, a hydrogen atom and two nonhydrogen atoms [54,58]. On the other hand, *b*2, *b*3, and *b*<sup>4</sup> refer to 2D type descriptors. The minHBint9 (*b*2) descriptor describes the topological state of the atom's environment, such as the electronic interactions present in the atoms of the molecule at a topological distance of nine with each atom. According to Equation (2), this characteristic decreases the IC50, that is, the less topological distance there is between the atoms that make up the structure, the better its antiprotozoal activity [59]. The IC3 descriptor (*b*3) is based on the three-order neighborhood symmetry. This means that molecules with lower symmetry will have a better predictive activity [60]. Finally, the descriptor MDEC-23 (*b*4) refers to the information regarding the edge of the molecular distance among all secondary and tertiary carbons. The presence and value of *b*<sup>4</sup> enhance the anti-leishmaniasis activity predicted by the model, indicating that the presence of secondary and tertiary carbon atoms in these molecules is related to their biological activity [61–63].

#### 2.1.3. QSAR Model for anti-*L. infantum* Activity

For predicting anti-*L. infantum* (visceral leishmaniasis) activity such as LogIC50, the model with six descriptors was selected (Table S1), which is represented according to Equation (3). The molecules used for constructing this QSAR model are shown in Figure A3:

$$\text{LogIC}\_{50} = -1.51 + 6.83c\_1 + 0.29c\_2 - 3.88 \times 10^{-6}c\_3 + 0.04c\_4 - 3.72c\_5 - 3.68 \times 10^{-6}c\_6 \tag{3}$$

where *cn*(*n* = 1 to 6) are 2D descriptors described as shown in Table 3.


**Table 3.** Molecular descriptors in the anti-*L. infantum* QSAR model, as described by Equation (3).

The MATS3c (*c*1) descriptor refers to Moran's autocorrelation, which expresses the partial charge values of atoms separated by three distances, that is, it estimates the correlation of charges divided into three bonds [64,65]. The descriptor nHBint7 (*c*2) counts resistance E-state descriptors for hydrogen bonds with a path length of seven. The *c*3–*c*<sup>6</sup> descriptors are part of the QuBiLS-MAS program [66]. The *c*<sup>3</sup> descriptor refers to the partition algorithm, which is used to calculate estimates of most neutral organic compounds that have C, H, O, N, S, Se, P, B, Si, and halogen atoms. This descriptor is based on AlogP, which can also estimate local hydrophobicity, visualize molecular hydrophobicity maps, and evaluate hydrophobic interactions when protein–ligand complexes are formed [67,68]. The *c*<sup>4</sup> descriptor considers bilinear indices and aliphatic chain carbon atoms, and correlates the topological area of the polar surface and the van der Waals volume. The polar surface area (PSA) is a molecular descriptor widely used in the study of drug transport properties, related to the penetration of the blood–brain barrier and its intestinal absorption. Additionally, the descriptor *c*<sup>4</sup> alludes to the sum of the contributions of polar atoms such as oxygen,

nitrogen, and hydrogen to the molecular surface area [69,70]. The descriptor *c*<sup>5</sup> correlates the carbon atoms of aliphatic chains with the van der Waals volume and charge. Molecular volume is defined as a measure of the space around electron-filled atomic nuclei, and is geometrically interpreted as the combined volume of the superimposed spheres centered on the nuclei, similar to a space-filling molecular model [71,72]. Finally, the descriptor *c*<sup>6</sup> presents a nonstochastic matrix of order 15; in this case, the descriptor considers the heteroatoms different from C and H, correlating them with the partition algorithm and estimating the different atoms. This descriptor includes most of the zwitterionic compounds that have amine, carboxylic acids, and ammonium halide salts. The *c*<sup>6</sup> descriptor is also based on an intrinsically atomistic model, which is useful for drug design, since it makes estimates of the local or general hydrophobicity of a molecule [66,68].

Table S7 in the Supporting Materials shows the intercorrelation relationship between each pair of descriptors included in the anti-leishmanial QSAR models (Equations (2) and (3)), while Table S8 (*L. amazonensis*) and Table S7 (*L. infantum*) show the values of each descriptor calculated for each molecule used in the development of these QSAR models. Tables S10 and S11 list the experimental and model-predicted values of anti-leishmanial activity as log IC50.

#### 2.1.4. QSAR Model for Toxicity

From the QSAR study for toxicity (expressed as LogIC50), a model with five descriptors was selected (Table S2), which is represented in Equation (4). This model was developed based on 76 naphthoquinone derivatives, as shown in Figure A4.

$$\text{LogIC}\_{50} = 0.37 + 0.11d\_1 - 0.21d\_2 + 0.36d\_3 - 0.05d\_4 - 0.17d\_5 \tag{4}$$

where *dn*(*n* = 1 to 5) are 2D descriptors described as shown in Table 4.


**Table 4.** Molecular descriptors in the QSAR model of toxicity as described by Equation (4).

The toxicity model descriptors were calculated using the QUBILS-MAS program [66]. The descriptors *d*1, *d*2, *d*4, and *d*<sup>5</sup> are calculated from the Kurtosis, which is a statistical invariant of distribution. The *d*<sup>1</sup> descriptor correlates with physicochemical properties, polarized topological surface area, and refractivity. As indicated above, the topological polar surface area of a molecule depends on the sum of the surface area of polar atoms, such as oxygen, nitrogen, and hydrogen, and facilitates the ability of a molecule to penetrate cells. According to this, the greater the value of the polar topological surface in a molecule, the greater its probability of being transported [73]. Meanwhile, refractivity is a measure of the volume occupied by an atom or a group of atoms [69]. These last two properties allow for a theoretical prediction of the pharmacological potential of a molecule in a biological environment [74]. The descriptor *d*<sup>2</sup> correlates the polarization with the charge of aromatic carbon atoms because it facilitates the distortion of the atomic or molecular charge in electromagnetic fields. This descriptor refers to an electronic parameter, which impacts chemical–biological interactions [75]. The descriptor *d*<sup>3</sup> correlates the polar surface area with the polarization of the carbon atoms of the aromatic moiety and the heteroatoms attached to this moiety. The *d*<sup>4</sup> descriptor is associated with the partition algorithm,

heteroatoms, and electronegativity. This descriptor is based on the tendency of a heteroatom or functional group to attract electrons and estimate the local or general hydrophobicity of a molecule [67–74,76,77]. Finally, the descriptor *d*<sup>5</sup> refers to the charge and the carbon atoms in the aliphatic chains.

Table S13 in Supporting Materials shows the intercorrelation relationship between each pair of descriptors included in the QSAR model of toxicity, while Table S15 shows the experimental and predicted values in log IC<sup>50</sup> used for constructing of the QSAR model of toxicity.

#### 2.1.5. Validation of QSAR Models

Table 5 contains a compilation of the statistical parameters used for the internal and external validation of the four QSAR models developed. From Y-randomization, it was found that *RMSEcal* < *RMSErand*, thus indicating that all QSAR models are robust due to the absence of random correlation [78] Internal validation by the Leave-One-Out (LOO) method yielded a value of the squared correlation coefficient (*R* 2 *loo*) ≥ 0.5 [78,79], which ensures the statistical stability of each model. In addition to presenting good internal validation parameters, all of the QSAR models are predictive, since they meet the following requirements: the slopes *k* and *k'* of the plots of observed and predicted values are in the range 0.85–1.15 (with *k* corresponding to the case when the predicted values are plotted on the x-axis and the experimental values on the y-axis, while *k'* is the inverse graph). The CCC statistically evaluates the models, this parameter verifies the difference between the experimental and predicted values. The squared correlation coefficients have values greater than 0.5, which validate the models [80].


**Table 5.** Internal and external validation parameters for each best QSAR model.

Figure 1 shows the dispersion diagram of the experimental values of anti-chagas and anti-*Leishmania* (*L. amazonensis* and *L. infantum*) activity and toxicity expressed as *Log*(IC50) as a function of the values predicted by each model. In all cases, it is observed how points adopt a linear trend around the line of perfect fit (in green), which confirms a multivariate linear relationship.

ோௗ ଶ ோௗ

> ிଵ ଶ ிଶ ଶ ிଷ ଶ

> > ଷ

**Figure 1.** Linear correlation plots between the experimental and the predicted values obtained using the QSAR Equations (1) (**a**), (2) (**b**), (3) (**c**) and (4) (**d**).

=0 Figure 2 shows the plots of the residuals for the four QSAR models developed. It was observed that for anti-chagas (Figure 2a) and anti-leishmaniasis (Figure 2b,c) activity, no residual is greater than three times the standard deviation (3S) of the model (outliers), while the toxicity model (Figure 2d) presented one outlier of the test group. In all QSAR models, the points follow a random distribution around the line at *y* = 0, which suggests that these properties are modeled using of multiple linear regression.

Due to the low diversity of molecules considered in the development of the QSAR models, the acceptable predictions are restricted to structural analogs derived from naphthoquinone, whose influence value is less than the critical influence value (*h\**) in each model (Figure 3). As shown in Figure 3a,c, for anti-chagas and anti-*L. infantum* activity, all molecules of the validation group and of the test group are within the domain of applicability, and molecules of the calibration group are considered outside the value of *h\**, a fact that reinforces the predictive capacity of these models. For the anti-*L. amazonensis* activity model, Figure 3b shows how a molecule of the test group is rejected, thus demonstrating its ability to reject molecules with large differences. For its part, the toxicity model shows that all molecules considered for its development and validation are within the domain of applicability.

#### 2.1.6. Molecular Design and Applicability of QSAR Models

Both *ortho*-naphthoquinone (1,2-substituted) and *para*-naphthoquinone (1,4-substituted) are recognized as highly active cores due to the synergy between its acid base and oxidation reduction properties [81], whose derivatives have exhibited several tunable antiparasitic effects according to the substitutions made in their fused rings [82]. As can be verified from the structural compilation used for constructing our QSAR antiprotozoal models (Figures A1–A3 in Appendix A), a high in vitro anti-chagas or anti-leishmanial effect is achieved when substitutions with heterocyclic, aromatic, or aliphatic groups are made at

the 2 and 3 positions, for the case of the *para*-naphthoquinone, or at the 3 and 4 positions, for the case of the *ortho*-naphthoquinone nucleus.

The molecular design of our selection group was based on the following principles: (i) restriction to only derivatives of the 1,4-naphthoquinone nucleus at positions 2 and 3; (ii) use of open-chain and heterocyclic substituents at these positions; (iii) use of nitrogenous derivatives in position 2; (iv) variation in position 3 with electro-withdrawing and electrodonor groups; and (v) substitution in the amino nitrogen, located in position 2, with both highly functionalized aromatic and heterocycle rings. Due to this strategy, four families of 1,4-naphthoquinone derivatives (Figure 4) were selected for in silico evaluation: (a) 2-chloro-3-arylamino family (NQ–Chlorine); (b) 2-amino-3-arylamino family (NQ–Amine); (c) 2 amino-3-triazolamino family (NQ–Triazole); and (d) phenazine family (NQ–Phenazine). In particular, the use of triazoles and phenazines as substituents was established based on the high intrinsic activity of these groups, as well as their enhancing effect when incorporated into other active nucleus [23,82–84].

**Figure 2.** Dispersion of residues according to QSAR Equations (1) (**a**), (2) (**b**), (3) (**c**), and (4) (**d**).

QSAR models established according to Equations (1)–(4) were used to predict antichagas (IC<sup>50</sup> in trypomastigotes), anti-*L. amazonensis* (cutaneous and mucocutaneous leishmaniasis), and anti-*L. infantum* (visceral leishmaniasis) activity, as well as toxicity, of the 68 derivatives of 1,4-naphthoquinone shown in Figure 4, which are structures that do not present experimental anti-chagas or anti-leishmanial activity (in vitro or in vivo) reported so far.

Figure 5a shows the prediction values of anti-chagas activity (IC50) for the 68 1,4 naphthoquinone derivatives evaluated using QSAR Equation (1). As a result, 47 (69%) of the 68 evaluated molecules had a better predicted IC<sup>50</sup> than the experimental value of the reference drug BNZ. Overall, 10 of these derivatives (structures 17, 43, 53–60) were not

within the applicability domain of the model, so their predicted activity should be considered unreliable. High anti-chagasic activity was found for the NQ–Phenazine derivatives substituted with the isopropyl group (structures 61 and 62), as well as for the NQ–Chlorine and NQ–Amine derivatives containing the same substituent (structures 24–26 and 51–52, respectively). Additionally, the derivatives coupled to the triazole ring (NQ–Triazole family) show a slightly better predicted activity than BNZ. According to this QSAR model, those amino derivatives with electron-withdrawing substituents (nitro and fluorine, structures 34–36 and 40–42, respectively) present the less parasitic activity toward *T. cruzi* trypomastigotes.

The predicted IC<sup>50</sup> values of anti-*Leishmania* activity (*L. amazonensis* and *L. infantum*) for the 68 naphthoquinone derivatives evaluated with QSAR Equations (2) and (3) are presented in Figure 5b,c respectively. According to the influence value (*h\**), 63 (93%) of the 68 naphthoquinone derivatives showed reliable anti-*Leishmania* activity, and five molecules (42, 43, 64, 67 and 68) did not show reliable activity. It was found that all molecules belonging to NQ–Phenazine family, as well as some derivatives of the NQ– Chloro (structures 1, 20, 24, and 26), and NQ–Amine families (structures 27, 36, 39, 44–47, 50–51), presented a better anti-*L. amazonensis* effect than the reference drugs Miltefosine and Glucantime (see Figure 5b), unlike the all of NQ–Triazole derivatives that had lower results than the reference drugs. According to Equation (3), 51 derivatives presented a reliable predictive anti-*L. infantum* activity, while the other 17 molecules were not found within the influence parameter (0.24). Among the molecules that presented a reliable activity, the best activity was that of molecule 6 together with several other NQ–Chloro derivatives (structures 1, 3, 6–7, 9–10, 12, 14–17, and 19–26), one NQ–Phenazine derivative (structure 56), and four of the NQ–Triazole derivatives (structures 63–65 and 67), all of which presented better activity than the reference drug. All of the NQ–Amine derivatives presented lower anti-*L. infantum* activity than the two reference drugs.

**Figure 3.** *h lim* values for (**a**) anti-chagas (Table S5); (**b**) anti-*L. amazonensis* (Table S10); (**c**) anti-*L. infantum* (Table S11) activity, and (**d**) toxicity (Table S15).

**Figure 4.** Molecular design of 1,4-naphthoquinone derivatives with potential antiparasitic activity (chagas and leishmaniasis). (**top-left**) (NQ–Chlorine family), (**top-right**) (NQ–Amine family), (**bottomleft**) (NQ–Phenazine), and (**bottom-right**) (NQ–Triazole family).

**Figure 5.** Prediction of anti-chagas (IC50) (Table S6) (**a**), anti-*L. amazonensis* (IC50) (**b**), and anti-*L. infantum* (LogIC50) (Table S12) (**c**) activity, as well as toxicity (LogIC50) (Table S16) (**d**), for the 68 1,4-naphthoquinone derivatives (*x*-axis) evaluated using QSAR Equations (1)–(4), respectively. Only structures with reliable activity, as determined by the influence value (*h\**), are presented. The predicted values for anti-*L. infantum* and toxicity are presented in LogIC<sup>50</sup> because of the wide dispersion of the data.

The toxicity values (LogIC50) predicted by QSAR Equation (4) for the 68 naphthoquinone derivatives are shown in Figure 5d. A total of 64 molecules presented a reliable predicted activity, while 4 of them (structures 41–42, 53 and 54) were outside of the influence parameter (*h\**). Among the 68 molecules evaluated, all of the structures belonging to the NQ–Phenazine family presented the lowest toxicity values.

#### *2.2. Molecular Docking*

Ligand–protein docking simulations were carried out to determine the most effective binding mode of each of the 68 1,4-naphthoquinone derivatives within the catalytic sites of the 5 chosen molecular targets (*Tc*TR, *Tc*LαD, *L*TR, *La*R, and *Li*TA). These macromolecular receptors were selected due to their relevance in the processes of survival, metabolism, reproduction, and proliferation of the parasites *T. cruzi*, *L. amazonensis,* and *L. infantum*. Conformational flexibility was allowed in all rotational bonds of the ligand, while the protein was used as a rigid structure. The best poses were selected according to the MVD scoring function, which helped to elucidate the electronic and structural aspects of the binding mode of the ligands in the active site of each protein. α

2.2.1. Docking in Trypanothione Reductase and Lanosterol α-Demethylase Proteins of *T. Cruzi*

α

For the *Tc*TR protein a 496 Å<sup>3</sup> cavity was used, while for the *Tc*LαD protein, a 481 Å<sup>3</sup> cavity was used (Figure S1). The results of the ligand–protein coupling are shown in Figure 6a,b, where the 68 1,4-naphthoquinone derivatives show a good interaction (from −128.75 to −79.68 kcal/mol) with the two selected molecular targets. In this study, as the interaction energy decreases, the affinity of the compounds with the enzyme improves. α − −

α **Figure 6.** Ligand–protein docking energies of 1,4-naphthoquinone derivatives against (**a**) *Tc*TR and (**b**) *Tc*LαD receptors.

− − As shown in Figure 6a, the triazole-fused naphthoquinone derivatives had better interaction energy (from −128.75 to −120.03 kcal/mol) with the active site of the *Tc*TR. Among these, NQ–Triazole structure 64 had the best affinity, presenting 4 hydrogen bonds: 1 with the Asn 340 A residue, another with the Arg 355 A residue, and 1 with the Gly 459 B residue (Figure 7a), with the latter 2 not present in its triazole analogs; as well as steric interactions with the amino acids His 461, Thr 335, Ile 339, Glu 466, Glu 19, Pro 336, and Ser 470. Chacón et al. [85] also reported that the hydrogen bond interaction with the Gly 459 residue provided the best affinity energy to the quinoxaline derivative of the group they evaluated; and reported for this compound the same hydrophobic interactions. Similarly, the natural substrate co-crystallized in the active site of the *Tc*TR protein presented interactions with residues Glu 19 A, Gly 459 B, Pro 336 A, Ile 339 A, and His 461 B [86]. NQ–Triazole 66 presented an energy close to that of molecule 64, and in addition to the same hydrophobic interactions, it presented unions with the amino acids Leu 18 A, Tyr 111 A, and Val 54, which are also part of the *Tc*TR-trypanothione union (Figure 7b). The only hydrogen bond that this derivative presents together with the other NQ–triazole molecules is with the Ser 15 A residue, which is a solvent-mediated hydrogen

α

bond interaction [86]. Note that this interaction occurs only in this series of compounds. Table S17 in the Supporting Materials shows the interaction energy of each ligand, as well as the hydrogen bonds with the different amino acid residues of the active site of the *Tc*TR and *Tc*LαD proteins.

− − − − − α NQ–Chlorine derivatives 6 (−108.09 kcal/mol), and 17 (−111.59 kcal/mol), as well as NQ–Amine derivative 43 (−110.87 kcal/mol), also presented high ligand–protein interaction energies. Although these compounds have hydrogen bonds in the *Tc*TR active site (Figure 7c), the strength of these hydrogen bonds is comparable to those in compounds that have the least favored interaction energies, i.e., NQ–Phenazine structures 57 (−79.68 kcal/mol) and 58 (−81.93 kcal/mol). From the above, it follows that the compounds 6, 17, and 43 achieve their highest affinity and stability through other types of interactions.

− − The molecular docking evaluation against the *Tc*LαD protein showed that molecule 66 was again the ligand with the best affinity (−121.60 kcal/mol), followed by molecule

68 (−121.18 kcal/mol). Both structures belong to the family of triazoles substituted with fluorine atoms in the terminal aromatic ring, in this case, in the *meta* and *ortho*-*para* positions, respectively. Derivative 66 presents three hydrogen bonds between the three nitrogen atoms of the triazole ring with the donor oxygen atom of the Tyr 103 (B) residue; this type of interaction was also found in the binding site of Fluconazole and Posaconazole with a single hydrogen bond [87]. Ligand 66 also exhibits hydrophobic interactions with residues Tyr 116 (B), Ala 291 (B), Ala 287 (B), Met 460 (B), Phe 290 (B), Met 106 (B), and Met 40 (B). A capture the pose of compound 66 at the active site of the *Tc*LαD target is presented in Figure 7c. Structure 68 presents two hydrogen bonds, the strongest with the iron protoporphyrin IX enzymatic cofactor (HEM\_1450 or Heme) co-crystallized in the amino acid chain B, and the other with the Tyr 103 residue (B) (Figure 7d), which is consistent with the interactions reported for this receptor with Fluconazole and Posaconazole [87]. The steric interactions that stabilize ligand 68, also present in the two azoles mentioned [87], occur with residues Ala 291 (B), Ala 287 (B), Tyr 116 (B), and Phe 110 (B). Cardoso et al. reported a hydrogen bond between their furan–naphthoquinones with the Tyr 116 residue in the binding site of this protein [88]; however, for molecules 66 and 68, the interactions with this residue are of the steric type. These same authors report cation-π interactions with the Fe of the Heme group, which were not observed in any 1,4-naphthoquinone derivative tested here.

Other structures that presented favorable interaction energies with the active site of the *Tc*LαD protein were the NQ–Amine derivatives 32 (−107.97 kcal/mol), 41 (−105.71 kcal/mol), and 46 (−106.12 kcal/mol), and the NQ–Chlorine derivatives 6 (−113.43 kcal/mol), 14 (−108.92 kcal/mol), 15 (−109.63 kcal/mol), 24 (−107.88 kcal/mol), and 25 (−108.50 kcal/mol). Among these, the lowest energy was found for the derivative substituted with an ethoxide group in the *meta* position of the phenylamino aromatic ring (structure 6), which did not present hydrogen bond interactions. The other 30 molecules did not present formation of hydrogen bonds with any amino acid residue of the active site.

2.2.2. Docking in Trypanothione Reductase, Arginase, and Aminotransferase Proteins of *Leismania* Genus

The docking evaluation of the 68 1,4-naphptoquinone derivatives in the catalytic sites of the proteins trypanothione reductase (*L*Tr), arginase (*La*A), and aminotransfera (*Li*AT) was carried out using cavities of 705, 305, and 1171 Å<sup>3</sup> , respectively (Figure S2).

The ligand–protein interaction energies for the best pose of each of the 68 derivatives evaluated are shown in Figure 8. In all cases, the most effective interactions against the three macromolecular receptors of the *Leishmania* genus were obtained for derivatives belonging to the NQ–triazoles. Molecules 65, 66, and 67 showed the best interaction energies (−158,024, −121,516, and −121,426 kcal/mol) against *L*Tr, *La*A, and *Li*At active sites, respectively, while derivative 64 showed the second-best interaction energy (−155,221, −116,708, and −119,504 kcal/mol) in all three docking studies. Only in the case of the docking evaluation against the *La*A site (Figure 8b) did derivative 17, belonging to the NQ–Chlorine family, and derivatives 32, 33, 40, 41, and 43 of the NQ–amino family, show a better affinity interaction than member 63 of the NQ–triazole family.

These results conform with some studies of drugs used to combat parasitic diseases based on triazoles or azoles. These compounds lead to alterations in the mitochondria and accumulation of lipid bodies, thus interfering with the biosynthesis of the cell membrane [89], and leading to cell death of the parasite [90,91]. Thus, triazole-substituted naphthoquinone derivatives have potential antiparasitic activity against *Leishmania*, since they have also been shown to be a type of compound tolerable by patients [89].

**Figure 8.** Ligand–protein docking energies of 1,4-naphthoquinone derivatives against (**a**) *L*Tr (Table S18), (**b**) *La*A (Table S19), and (**c**) *Li*AT (Table S20) receptors.

The two best poses adopted within the active site for each docking evaluation are presented in Figure 9. In the case of the poses of derivatives 64 and 65 against the *L*Tr protein (Figure 9a,b), both present interactions by hydrogen bonds with the amino acid residues Ser 14, Thr 335, Cys 52, and Lys 60 of *L*Tr. The two cysteine residues (Cys52 and Cys57) present in the active site form the disulfide bond in the oxidized form of the protein [92], which is a bond critical in the parasite's defense mechanism, while the interactions with the residues Thr 335 and Lys 60 are destined exclusively to the binding domain of FAD [93]. In this way, the binding of the ligands to these last residues prevents the binding of FAD and its orientation toward the active site during the reduction, which inhibits the *L*Tr enzyme [93].

The best positions of structures 64 and 66 in the active site of the *La*A protein (Figure 9c,d) presented both interactions by hydrogen bonds with residues Ser 150, Thr 148, and Val 149. Structure 64 presented hydrogen bonds with residues Gly151 and Asn 152, while structure 66 formed an interaction with residue Asn 143. In similar studies [94,95], some flavonoid ligands showed interactions with amino acid residues Ser150, Asn 152, and Asn153 within the active site, exhibited high in vitro activity against L. *amazonensis* cultures, and low toxicity in mammalian cells. Residues Ser150 and Asn143 are involved in the Mn(II) metal bridge in coordination with the active site of arginase, which is required to conduct its catalytic activity [95,96]. According to this, molecules 64 and 66 inhibit the coupling of the metallic bridge, and would be good inhibitors of the arginase protein.

(**a**) (**b**)

The poses of structures 64 and 67 facing the *Li*At receptor are presented in Figure 9e,f. Both structures present interaction by hydrogen bonds with the residue Tyr 256. It has been shown that this residue prevents the rotation of the pyridine ring of the pyridoxal phosphate cofactor until the entrance of the amino acid, avoiding its stability [97].

#### *2.3. ADME Analysis*

Theoretical early estimation of ADME properties for the series of 68 1,4-naphthoquinone derivatives was performed in the free SwissADME web tool [98]. The data of lipophilicity, solubility, pharmacokinetic properties, and skin permeability parameters are collected in Table S21.

Figure 10 shows the estimated lipophilicity values of the 68 ligands, expressed as the logarithm of the octanol–water partition coefficient (Log Po/w). This parameter makes it possible to determine hydrophobicity, which is a parameter considered in the initial phases of drug development, and which allows inferring how the compound will behave in biological fluids and its possible diffusion through biological membranes [77,99]. The lipophilic results show that the 68 derivatives have a Log *p* < 5 [100], so according to the Lipinski Rules, they can be administered orally [101]. NQ–Amine derivative 43 presented the lowest value (log Po/w = 0.49), that is, it is more hydrophilic than the rest.

**Figure 10.** Lipophilicity values of the 68 naphthoquinone derivatives.

− − − − − − − − − − Figure 11 shows the SWISSADME solubility values (expressed as log S) of the 68 derivatives according to the ESOL (Estimated SOLubility), solubility adapted by Ali et.al., and SILICO-IT [102] methods. The ESOL method estimates the solubility in water directly from the molecular structure, followed by its weight [103]; the Ali method incorporates the effect of the topological polar surface area [104]; and the SILICO-IT method is calculated using a fragmented procedure [102]. The solubility (Log S) scales used are insoluble < −10, poor < −6, moderate < −4, soluble < −2, and high < 0. The calculated solubility data presented ESOL values of −5.42 < Log S < −3.57, Ali values of −7.04 < Log S < −3.9 and SILICO-IT values of −8.33 < Log S < −4.28, indicating that all 68 compounds have moderate to poor water solubility.

− − − −

− − − −

− −

**Figure 11.** Solubility data (in Log S) calculated for the 68 1,4-naphthoquinone derivatives using the ESOL (blue), Ali-adapted (orange), and SILICO-IT (magenta) methods.

The evaluation of pharmacokinetic parameters of gastrointestinal (GI) absorption, P-gp substrate, and cytochrome P450 (CYP) inhibition or interaction for all 68 derivatives is presented in Figure 12.

**Figure 12.** Pharmacokinetic properties calculated for the 68 naphthoquinone derivatives.

The GI barrier has a complex structure given the characteristics of a semi-permeable membrane, which allows fat-soluble molecules to penetrate it through a diffusion process, as is the case with most drugs [105]. As Figure 13 shows, passive human gastrointestinal (GI) absorption data estimate that 97% of the 68 compounds present high absorption in the digestive tract, while the remaining 3%, corresponding to structures 17 and 43, would present low intestinal absorption. For its part, P-glycoprotein (PGP) is a permeability protein that pumps substances out of the body and prevents their absorption, which causes drug resistance to form [105,106]. It was found that none of the compounds behaved as a substrate for P-gp; therefore, they could conduct the function without any resistance [107].

**Figure 13.** BOILED-Egg (adapted with permission from Ref. [108]) evaluation of gastrointestinal absorption for the 68 1,4-naphthoquinone derivatives. Molecules inside the oval indicate that they have GI permeability.

Another form of drug administration is through the skin (transdermal distribution), which allows the transport of substances through the epidermis. This parameter is related to the skin permeability coefficient (Kp), whose values indicate that the more negative the log Kp (with Kp in cm/s), the less permeant the molecule is [109,110]. Figure 14 shows the skin permeability of the 68 evaluated compounds, where it is evident that the best candidates to be administered transdermally are NQ–Chloro derivatives; among them, the best are 25, 26, and 27, while the ligand with the lowest permeability is 43.

− ≤

**Figure 14.** Skin permeability values in Log Kp (Kp in cm/s) of the 68 derivatives.

≤ ≤

≤

The main enzyme isoforms CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4 [98] are involved in drug metabolism, and inhibitors block their metabolic activity in a dosedependent manner. Structure 43 was found to be the only derivative unable to inhibit CYP1A2, while 13 derivatives (36–38, 40–43, and 54–58), 4 derivatives (35–37, and 57), and 31 derivatives (11–16, 24–30, 40–43, 55–68) projected a negative inhibitory response of CYP2C19, CYP2C9, and CYP2D6, respectively. In each of these cases, the activity of the remaining molecules would be affected by their metabolism, due to their ability to bind to these enzymes, which may generate adverse effects such as high toxicity [111–113]. None of the 68 molecules reported the inhibition of CYP3A4.

Based on a similarity to drugs, filters based on Lipinski (Pfizer) [101], Ghose (Amgen) [114], Veber (GSK) [115], Egan (Pharmacia) [116], and Muegge (Bayer) [117] rules were also evaluated, as well as the bioavailability score. These filters qualitatively define the feasibility of a compound to become an oral drug candidate. The results show that 100% of the 1,4-naphthoquinone derivatives comply with the Lipinski and Ghose rules, and 98.5% comply with the Veber and Egan rules. For these last two cases, molecule 43 presented a violation, caused by a value of TPSA = 163.83 Å<sup>2</sup> that is above the established ranges (TPSA ≤ 140 and TPSA ≤ 131.6, respectively). Finally, the Muegge filter indicated that 90% of the derivatives present valid conditions to become oral drugs. The remaining 10% presented a violation since their XLOGP value is outside of the allowed range (−2 ≤ XLOGP ≤ 5) (structures 21–26), while the TPSA value of structure 43 is again above the range settled down. All compounds had a bioavailability value of 0.55, indicating oral bioavailability based on Lipinski's rules. Additionally, it was established that an AB score of 0.55 is required to be considered a sufficiently absorbable molecule orally [118].

#### **3. Materials and Methods**

#### *3.1. QSAR Modelling*

#### 3.1.1. In Vitro Anti-Chagas and Anti-Leishmaniasis Data

All of the in vitro data used for constructing of the antiparasitic and toxicity QSAR models were taken from the literature. For constructing the anti-chagasic activity model, 153 structures derived from naphthoquinone were used, some of which were fused with triazoles and oxanes, among other heterocycles (Figure A1). All of these derivatives showed anti-*T. cruzi* activity, evaluated in vitro under the following experimental conditions. The stock solutions of the compounds were prepared in DMSO. Strain Y trypomastigotes were obtained at the peak of parasitemia from albino mice, then isolated through differential centrifugation, and resuspended in Dulbecco's Modified Eagle Medium (DME), with a concentration of 107 parasite cells/mL in the presence of 10% mouse blood. A total of 100 µL of this solution was added to 100 µL of the compound solutions. The cell count was determined in a Neubauer chamber, and the trypanocidal activity was expressed as IC50, which corresponds to a concentration that allows the lysis of 50% of the parasites [23–31].

For the QSAR modeling of cutaneous anti-leishmaniasis (*L. amazonensis*) activity, 60 molecules (Figure A2) containing functional groups such as quinone, triazole, indole, and amine with high trypanocidal activity reported as IC<sup>50</sup> were used. The anti-leishmanial activity of these structures was evaluated by tests with 3-(4,5-dimethylthiazol-2-yl)-2,5 diphenyltetrazolium (MTT) bromide. In these bioassays, promastigotes were seeded in RPMI medium supplemented with 10% FBS, and cultured in a 96-well plate with a concentration of 10 5 cells/plate at 37 ◦C. After the seeding period, the MTT solution was added, the formed complex was dissolved in DMSO, the supernatant was removed, and the cells were incubated under the same seeding conditions in the presence of various concentrations of the compounds. The concentration that inhibited 50% of parasite growth was determined as IC<sup>50</sup> by linear regression [18,32–39].

For constructing the QSAR model of anti-visceral leishmaniasis (*L. infantum*) activity, 90 molecules (Figure A3) derived from quinone, triazole, and indole with activity reported as IC<sup>50</sup> were used. The anti-leishmanial activity evaluation for these molecules was conducted using incubation in RPMI medium with 12% fetal calf serum (FBS), at a

concentration of 105 cells/mL, for 48 h at 25 ◦C. After the incubation process, promastigote growth was estimated by counting the parasites with a Neubauer hemocytometer. The 50% inhibitory concentration (IC50) was defined as the drug concentration required to inhibit 50% of the parasite growth [40–45].

The QSAR model of toxicity was built from 76 structures (Figure A4) derived from naphthoquinone, some fused with triazoles, pyrazoles, imidazoles, and aromatic rings, with experimental data measured according to the following parameters. Mouse fibroblasts (L929) were used and determined by the reduction of 3-(4,5-dimethyl-2-thiazol)-2,5 diphenyl-2H-tetrazolium bromide (MTT), expressed as IC<sup>50</sup> [30,46–51].

#### 3.1.2. Development of Antiprotozoal QSAR Models

All molecules were processed in MDL mol (V2000) format using the ACD/I-Lab software version 11.0 13 [119]. The calculation of 83,180 fingerprint-type, one-dimensional, and two-dimensional molecular descriptors was performed in the programs PaDEL-Descriptor [54], Mold2 [120], QuBiLS-MAS [56], and Fragmentor [121] (all free access). By using the balanced subsets method, the data sets were divided into calibration, validation, and test groups, using 70, 15, and 15%, for the anti-chagas activity model, 80%, 10%, and 10% for the anti-leishmanial activity models, and 70, 15, and 15% for toxicity, respectively. Subsequently, the Replacement Method (RM) [122], which is available in the MatLab programming language, was applied to explore the set of descriptors. RM is an unequivocal algorithm based on multivariable linear regression (MLR), which searches for the best subsets of descriptors in large databases. By using the RM algorithm, multivariable regression models of up to seven descriptors were generated for each model, which was carried out in the MATLAB software version 7.12.0 [123].

#### 3.1.3. Validation of the QSAR Models

Validation of all QSAR models was performed through both internal and external validation processes. Internal validation was performed using the LOO (Leave-One-Out) validation method. For the internal validation of each QSAR model, the statistical parameters R2 LOO (square correlation coefficient of LOO) and SLOO (standard deviation of LOO) were determined. Additionally, the statistical robustness of the models was determined using randomization Y [124,125]. For its part, the external validation verified the predictive capacity, applying the methodology proposed by Golbraikh and Tropsha [78], where the correlation coefficients between the predicted and experimental properties of the compounds in the test set are estimated.

#### *3.2. Molecular Docking*

The structures of the macromolecular targets used for molecular docking were extracted from the Protein Data Bank (PDB). In the case of *T. cruzi*, trypanothione reductase (*Tc*TR) [86] and lanosterol α-demethylase (*Tc*LαD) [87] proteins were selected, with 1BZL and 2WUZ coding and a resolution of 2.40 Å and 2.45 Å, respectively. For the *Leishmania* genus in general, the protein trypanothione reductase (*L*TR) [126,127] with code 2JK6 and a resolution of 2.95 Å was selected, while for the strains *L. amazonensis* and *L. infantum* arginase (*La*A) [128] and tyrosine aminotransferase (*Li*TA) [129] were selected, with codes 4IU0 and 4IX8 and resolutions of 1.77 Å and 2.35 Å, respectively.

Before the molecular docking simulations, the proteins were prepared in the Molegro Virtual Docker (MVD) program [130], where bond assignment, bond order, hybridization, missing hydrogens, charge assignment, and atom tripos were performed. To validate the software and optimize the molecular docking parameters, a re-docking procedure of the co-crystallized ligands was carried out. For each protein, a cavity was created by delimiting the coupling space of the co-crystallized ligand, and the best pose (Figure 15) was selected based on the lowest RMSD value (<2) [131].

α α

α **Figure 15.** Re-coupling, white: co-crystallized substrate. Red: Substrate coupled using MVD. (**a**) *Tc*TR; (**b**) *Tc*LαD, (**c**) *Li*TA, (**d**) *La*A, and (**e**) *L*TR.

For the preparation of the ligands, the structures of the naphthoquinone derivatives (68) were built in the ACD-Labs program, and saved as .mol files. These structures were then optimized in the Gaussian 09W program [132] by simultaneously relaxing all geometric parameters to the theoretical level B3LYP/6–31 + G(d).

For the molecular docking simulations, 5 poses (conformation and orientation) were generated for each of the 68 ligands evaluated, using the search algorithm MolDock Optimizer, which was executed with 10 repetitions. The data analysis was performed in the Molegro Virtual Modeller (MVM) program, where the best pose in each case was selected, and the data were plotted based on the energy calculated using the scoring function (Equation (5)).

$$E\_{\text{score}} = E\_{\text{inter}} + E\_{\text{intra}} \tag{5}$$

where *Einter* is the intermolecular interaction energy of the ligand–protein, and the *Eintra* is the internal energy of the ligand. The *Einter* is shown in Equation (6):

$$E\_{\text{inter}} = \sum\_{i \in ligand} \sum\_{i \in ligand} \left[ E\_{PLP}(r\_{ij}) \right] + 332.0 \frac{q\_{i}q\_{j}}{4r\_{ij}^{2}} \tag{6}$$

where the EPLP term represents the PLP (piecewise linear potential) energy, which consists of the use of two different parameter sets, described as follows: one for approximation of the steric term (van der Waals) among atoms, and the other potential for the hydrogen binding. The second term (332.0 *<sup>q</sup>iq<sup>j</sup>* 4*r* 2 *ij* ) is related to the electrostatic interactions among overloaded atoms. It is a Coulomb potential with a dielectric constant dependent on the distance (D(r) = 4r). The numerical value of 332.0 is responsible for the electrostatic energy unit to be given in kilocalories per mol. The *q<sup>i</sup>* and *q<sup>j</sup>* terms represent the charges of the atoms *i* and *j*, respectively. The *rij* term indicates the interatomic distance between the atoms *i* and *j* [133].

The *Eintra* is shown in Equation (7).

$$E\_{\rm inter} = \sum\_{i \in \text{ligand}} \sum\_{i \in \text{ligand}} \left[ E\_{\rm PL}(r\_{i\bar{\jmath}}) \right] + \sum\_{f \in \text{explable bonds}} A \left[ 1 - \cos(m\theta - \theta\_0) \right] + E\_{\rm Calh} \tag{7}$$

The first part of the equation (double summation) is among all pairs of atoms in the ligand, taking off those which are connected by two bonds. The second term characterizes the torsional energy, where *θ* is the torsional angle of the bond and *θ*<sup>0</sup> is its corresponding

value in the equilibrium. The average of the torsional energy bond contribution is used if several torsions could be determined. The last term, *Eclash*, assigns a penalty of 1.000 if the distance between two heavy atoms (more than two bonds apart) is shorter than 2.0 Å, not considering infeasible ligand conformations [134]. The docking search algorithm that is applied in the MVD program considers an evolutionary algorithm, the interactive optimization techniques which are inspired by Darwinian evolution theory, and a new hybrid search algorithm called guided differential evolution. This hybrid combines the differential evolution optimization technique with a cavity prediction algorithm during the search process, thereby allowing for fast and accurate identification of potential binding modes (poses) [133–135].

#### *3.3. ADME Analysis*

The ADME study was conducted using the SWISSADME web tool of the Swiss Institute of Bioinformatics [98]. This free web tool allows you to calculate properties, such as lipophilicity, water solubility, pharmacokinetics, and drug similarity, based on the structure or SMILE of the molecule.

#### **4. Conclusions**

A total of four QSAR models based on naphthoquinone derivative structures were developed and validated, three of them for antiprotozoal activity against *T. cruzi, L. amazonensis*, and *L. infantum*, and one for toxicity prediction. All QSAR models were built using in vitro inhibitory concentrations (IC50) which were previously reported. The anti-*T. cruzi* model was developed based on 153 molecules, the anti-*L. amazonensis* used 60 molecules, the anti-*L. infantum* model used 90 structures, and the toxicity model was built with 76. According to the anti-*T. cruzi* QSAR model, the anti-chagasic activity is favored by the refractivity, electronegativity, and polarizability of the molecules. These characteristics possibly give them a better binding capacity and inhibition of essential macromolecules in the metabolism of the parasite. Additionally, it was found that the presence of aromatic fragments in these structures increases their antiparasitic activity. Of the proposed molecules, structures 61 and 62, belonging to the derivatives fused with phenazine, and 24, 25, and 54, which are phenylamino derivatives, presented a better activity against *T. cruzi* predicted by the QSAR model compared to that of the reference drug. The QSAR anti-*L. amazonensis* predicts that molecules with less symmetry and with the presence of secondary and tertiary carbons will be more active. Thus, phenazines 62, 60, 59, 61, 58, and 57 showed a predicted potential activity against *L. amazonensis*, as well as derivatives 56, 27, and 26 of the phenylamine family. A total of 22 structures showed better predicted activity than Miltefosine, and 37 demonstrated better results than Glucantime. According to the anti-*L. infantum* QSAR model, the activity is a function of LogP, which is an essential characteristic for ligands to be considered drugs, since it gives them the ability to permeate biological membranes. Similarly, the activity increases with PSA, which is related to the penetration of the blood–brain barrier and its intestinal absorption. The derivatives with the best predicted anti-*L. infantum* activity were 6, 15, 7, 14, and 16, which belong to the phenylamino family. Additionally, it is highlighted that 23 molecules were more active than the reference drugs (Miltefosine and Glucantime).

According to the results of molecular docking, all the evaluated compounds present very favorable interaction energies with the selected receptors, which are essential in the metabolism of both *T. cruzi* and *Leishmania*. In the *Tc*TR protein, the derivatives fused with triazole stand out. The best of these was structure 64, which, unlike its analogs, features two hydrogen bonds between the triazole ring and the Gly 459 residue. Similarly, the triazole derivatives presented the best interaction energies for the *Tc*LαD protein. The best of them was structure 66. This derivative presents three interactions of hydrogen bonds between the triazole ring and the Tyr 103 residue, which is a fundamental residue in the active site of said receptor. The triazole derivatives 65, 66, and 67 presented the best interaction affinities against the catalytic sites of all leishmania proteins evaluated (Trypanothione reductase, Arginase, and Tyrosine Aminotransferase). These types of derivatives have a wide variety of biological activities, and according to our results, they are considered potent inhibitors of macromolecular targets. Ligands belonging to this type of molecules are currently used as active principles of antiparasitic drugs because they generate cell deformation and alterations in the mitochondria, thus interfering with the biosynthesis of the parasite's cell membrane, causing cell death.

The results of ADME showed that the 68 evaluated compounds met the requirements to be administered orally. ADME calculations included lipophilicity, solubility, pharmacokinetics, and drug-likeness.

**Supplementary Materials:** The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/ph15060687/s1, Figure S1: Cavities in the active site of TcTR (a) and TcLαD (b) proteins; Figure S2: Cavities in the active site of LTR (a). LaA (b). and LiTA (c) proteins; Table S1: Antiparasitic QSAR models for chagas and leishmaniasis protozoa based on naphthoquinone derivatives obtained by the replacement method; Table S2: Toxicity QSAR models based on naphthoquinone derivatives obtained by the replacement method; Table S3: Intercorrelation matrix of the QSAR model descriptors for anti-chagas activity; Table S4: Numerical values of the QSAR anti-chagas model descriptors; Table S5: Experimental and predicted values of anti-chagas activity. Residuals and influence values h (hˆlim = 0.196) are reported. ˆ validation group and \* test group; Table S6: Values predicted by the QSAR model of anti-chagas activity of the 68 naphthoquinone derivatives (prediction group); Table S7: Intercorrelation matrix of the QSAR model descriptors for anti-leishmanial activity; Table S8: Numerical values of the QSAR anti-L. amazonensis model descriptors; Table S9: Numerical values of the QSAR anti-L. infantum model descriptors; Table S10: Experimental and predicted values of anti-L. amazonensis activity. Residuals and influence values h (hˆlim = 0.3125) are reported. ˆ validation group and \* test group; Table S11: Experimental and predicted values of anti-L. infantum activity. Residuals and influence values h (hˆlim = 0.24) are reported. ˆ validation group and \* test group; Table S12: Values predicted by the QSAR models of anti-leishmanial activity of the 68 naphthoquinone derivatives (prediction group); Table S13: Intercorrelation matrix of the QSAR model descriptors for toxicity; Table S12: Numerical values of the QSAR toxicity model descriptors Table S14. Numerical values of the QSAR toxicity model descriptors; Table S15: Experimental and predicted values of toxicity. Residuals and influence values h (hˆlim = 0.3125) are reported. ˆ validation group and \* test group; Table S16: Values predicted by the QSAR model of toxicity of the 68 naphthoquinone derivatives (prediction group); Table S17: Results of the coupling of naphthoquinone derivatives in the active site of the TcTR and TcLαD proteins. Table S18: Results of the docking of naphthoquinone derivatives in the active site of the LTR protein; Table S19: Results of the docking of naphthoquinone derivatives in the active site of the LaA protein; Table S20: Results of the docking of naphthoquinone derivatives in the active site of the LiTA protein; Table S21: ADME parameters.

**Author Contributions:** Conceptualization, J.A.G.C., D.L.N.G. and A.C.-C.; methodology, L.S.P.C., K.A.A.S., J.A.G.C. and D.L.N.G.; software, L.S.P.C., K.A.A.S., J.A.G.C., D.L.N.G. and P.R.D.; validation, L.S.P.C., K.A.A.S., D.L.N.G. and P.R.D.; formal analysis, L.S.P.C., K.A.A.S., J.A.G.C., D.L.N.G. and P.R.D.; investigation, L.S.P.C., K.A.A.S. and D.L.N.G.; resources, J.A.G.C.; data curation, L.S.P.C., K.A.A.S. and D.L.N.G.; writing—original draft preparation, L.S.P.C., K.A.A.S. and J.A.G.C.; writing review and editing, L.S.P.C., K.A.A.S., J.A.G.C., D.L.N.G., P.R.D., A.C.-C. and W.E.R.N.; visualization, L.S.P.C., K.A.A.S. and J.A.G.C.; supervision, J.A.G.C.; project administration, J.A.G.C.; funding acquisition, J.A.G.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Universidad Pedagógica y Tecnológica de Colombia (UPTC) through projects SGI2033, SGI2949, SGI3038, and SGI3188 of the Dirección de Investigaciones (DIN).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data is contained within the article and Supplementary Material.

**Acknowledgments:** The authors greatly acknowledge the financial support provided by the Universidad Pedagógica y Tecnológica de Colombia (UPTC) and Ministerio de Ciencia, Tecnología e Innovación de Colombia (Minciencias). K.A.A.S. especially thanks the Ministerio de Ciencia, Tecnología e Innovación de Colombia (Minciencias) for awarding a Young Researcher Scholarship through call 891 of 2020. P.R.D. gives thanks to Ministerio de Ciencia, Tecnología e Innovacion Productiva de Argentina for the electronic library facilities. P.R.D. is a member of the scientific researcher career of CONICET.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A**

**Figure A1.** Molecules with reported biological activity (IC<sup>50</sup> ) used for the development of the QSAR anti-*T. cruzi*.

**Figure A2.** Molecules with reported biological activity used for the development of the anti-L. amazonensis QSAR model.

**(329)** R1=NH<sup>2</sup> , R2=CH<sup>3</sup> , **(330)** R1=NH<sup>2</sup> , R2=-CH2OH, **(331)** R1=NH<sup>2</sup> , R2=-CH2Br, **(332)** R1=NH<sup>2</sup> , R2=-C4H8O<sup>3</sup> , **(333)** R1=NH<sup>2</sup> , R2=-C6H12O<sup>2</sup> **, (334)** R1=NH<sup>2</sup> ,R2=-C7H14O<sup>2</sup> , **(335)** R1=NH<sup>2</sup> , R2=-C9H10O<sup>2</sup> , **(336)** R1=NH<sup>2</sup> , R2=-C9H9FO<sup>2</sup> , **(337)** R1=NH<sup>2</sup> , R2=C10H12O<sup>3</sup> , **(338)** R1=Cl, R2=CH<sup>3</sup> , **(339)**R1=Cl, R2=CH3OH, **(340)** R1=Cl, R2= CH3Br, **(341)**R1=Cl, R2=C4H8O<sup>3</sup> , **(342)**R1=Cl, R2= C6H12O<sup>2</sup> **, (343)**R1=Cl, R2=C7H14O<sup>2</sup> , **(344)**R1=Cl, R2=-C9H10O<sup>2</sup> , **(345)**R1=Cl, R2=-C9H9FO<sup>2</sup> , **(346)**R1=Cl, R2=-C9H9ClO<sup>2</sup> **, (347)** R1=Cl, R2=-C10H12O<sup>3</sup>

N

**(351)**

N Cl

**Figure A4.** Molecules with reported toxicity in mouse fibroblasts (L929) used for the development of the QSAR model of toxicity.

### **References**

