Article

Structural Investigation for Optimization of Anthranilic Acid Derivatives as Partial FXR Agonists by in Silico Approaches

1
College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, China
2
College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2016, 17(4), 536; https://doi.org/10.3390/ijms17040536
Submission received: 9 March 2016 / Revised: 29 March 2016 / Accepted: 5 April 2016 / Published: 8 April 2016

Abstract

In this paper, a three-level in silico approach was applied to investigate important structural and physicochemical aspects of a series of anthranilic acid derivatives (AAD) newly identified as potent partial farnesoid X receptor (FXR) agonists. Initially, both two- and three-dimensional quantitative structure-activity relationship (2D- and 3D-QSAR) studies were performed on these AAD using stepwise multiple linear regression and comparative molecular field analysis. The obtained 2D-QSAR model gave a high predictive ability (R2train = 0.935, R2test = 0.902, Q2LOO = 0.899). It also revealed that the number of rotatable single bonds (b_rotN), relative negative partial charge (RPC), Oprea's lead-like descriptor (opr_leadlike), a subdivided van der Waals surface area (SlogP_VSA2) and the water accessible surface area (ASA) were important features in defining activity. Additionally, the derived 3D-QSAR model also showed high predictive ability (R2train = 0.944, R2test = 0.892, Q2LOO = 0.802). Meanwhile, the contour maps derived from the 3D-QSAR model revealed the significant structural features (steric and electronic effects) required for improving FXR agonist activity. Finally, nine newly designed AAD with higher predicted pEC50 values than the known template compound were docked into the FXR active site. The favorable binding patterns of these molecules also suggested that they can be robust and potent partial FXR agonists, in agreement with the QSAR results. Overall, the derived models may help to identify and design novel AAD with better FXR agonist activity.

1. Introduction

Farnesoid X receptor (FXR) is a nuclear receptor expressed in the liver, gall bladder, intestine, kidney, and adrenal glands. It plays important physiological roles in metabolic pathways involved in bile acid, triglyceride, and glucose homeostasis. FXR has therefore become an attractive target for treating a wide range of metabolic diseases, including diabetes, cholestasis, liver fibrosis, and inflammatory bowel diseases [1,2,3]. Accordingly, a number of synthetic steroidal and nonsteroidal FXR agonists have been developed. 6-Ethyl-chenodeoxycholic acid (6-ECDCA) and GW4064 are the most important and widely used steroidal and nonsteroidal FXR agonists, respectively [4]. Both are full FXR agonists, with EC50 values of 85 nM and 0.9 μM, respectively, in a reporter gene assay [5]. In addition, a recent study indicated that GW4064 is active on several off-targets [6]. Because metabolic diseases require stable long-term therapy, well-tolerated FXR agonists with low toxicity that can be administered over long periods are needed. Moreover, full activation of a ligand-activated transcription factor may cause many side effects in long-term treatment [7]. Therefore, new potent partial FXR agonists aimed at providing stable long-term therapy for metabolic diseases have attracted increasing attention.
Recently, a novel series of partial FXR agonists based on the anthranilic acid skeleton was reported by Merk et al. [8,9]. The continued interest in developing more potent partial FXR agonists prompted us to explore the relationship between the structures of AAD and their FXR agonist activity. Here, quantitative structure-activity relationship (QSAR) methods were used to guide lead optimization and to study the mechanism of action of partial FXR agonists. In QSAR, the bioactivity of compounds is predicted by a mathematical model relating physicochemical properties to bioactivity. Such a model can be built with general linear or nonlinear algorithms, or with newer methods such as the spectral-structure activity relationship algorithm [10,11]. QSAR has become very useful and is widely applied for predicting compound properties [12,13], including physical properties, biological activity, and toxicity.
To the best of our knowledge, no QSAR study of AAD as FXR agonists has yet been reported. In this paper, we investigated the structural and physicochemical features required for improving biological activity and sought highly predictive 2D- and 3D-QSAR models to assist in the design of new potent partial FXR agonists. Firstly, stepwise multiple linear regression (MLR) was applied to develop predictive 2D-QSAR models and to uncover the physicochemical features that govern the FXR activity of AAD. Subsequently, a 3D-QSAR study using comparative molecular field analysis (CoMFA) was performed to gain further insight into the relationship between chemical structure and biological activity. In the form of contour maps, the CoMFA model identifies regions in space where the interaction fields may influence biological activity, providing a graphical visualization of crucial steric and electrostatic features in 3D Cartesian space [14]. Finally, nine newly designed AAD with high predicted bioactivity were studied by molecular docking to examine their interactions with the FXR active site. Molecular docking aims to predict the binding conformation of ligands in the target binding site. The success of a docking program depends on two components: the search algorithm and the scoring function. A variety of conformational search strategies have been reported, including systematic search, stochastic search, genetic algorithms, and simplified molecular input line entry system (SMILES)-based conformation generation [15]. Most scoring functions are physics-based molecular mechanics force fields that estimate the energy of the pose within the binding site.

2. Results and Discussion

2.1. Two-Dimensional Quantitative Structure Activity Relationship (2D-QSAR) Models

2.1.1. Multiple Linear Regression Modeling

After stepwise multiple linear regression (SW-MLR) was performed, the best linear model was generated with five molecular descriptors. The obtained MLR model was given as follows:
pEC50 = (0.016 ± 0.002)ASA + (14.001 ± 4.041)RPC − (0.049 ± 0.0175)SlogP_VSA2 + (0.362 ± 0.158)b_rotN + (0.318 ± 0.276)opr_leadlike − (9.717 ± 2.221)
Ntrain = 31, R2train = 0.935, Ftrain = 72.353 > F0.005(5,25) = 4.43 (the cut off value of F distribution), RMSEtrain = 0.219, Q2LOO = 0.899, RMSELOO = 0.299, Ntest = 10, R2test = 0.902, RMSEtest = 0.534.
Here, Ntrain and Ntest are the numbers of compounds in the training set and the test set, respectively; R2train and R2test are the squared correlation coefficients of the training set and the test set, respectively; Q2LOO is the leave-one-out (LOO) cross-validated squared correlation coefficient; F is the F-test value; and RMSE is the root mean square error. The selected variables, their chemical meanings and their standardized coefficients are shown in Table 1. A variance inflation factor (VIF) (VIF = 1/(1 − Rj2), where Rj2 is the squared multiple correlation coefficient obtained by regressing descriptor j on the remaining molecular descriptors) was calculated to determine whether multicollinearity existed among the descriptors in the model. A VIF between 1.0 and 5.0 indicates that the equation is acceptable [16]. As shown in Table 1, the VIF values of all descriptors were smaller than 4, indicating that the generated model possessed statistical significance and good stability. Table 2 shows the correlation matrix of the selected descriptors. The linear correlation coefficient for each pair of descriptors was smaller than 0.85, suggesting that the selected descriptors were independent, which meets an important criterion for model selection [17]. The predicted results of the MLR model are given in Table 3 and shown in Figure 1A. As summarized in Table 4, the MLR model was statistically significant and had good predictive ability. The R2train value of this model shows that it can explain 93.5% of the variance in activity. The Q2LOO value of 0.899 was much larger than 0.5, indicating that the developed model had very good stability and predictive ability. In addition, the value of R2test for the external prediction was 0.902, showing good prediction and generalization ability.
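The VIF check described above can be illustrated with a minimal sketch. It assumes the simplest case of two descriptors, where Rj2 reduces to the squared pairwise correlation; the descriptor columns `d1` and `d2` are hypothetical and are not taken from Table 1 or Table 2.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif(r2_j):
    """VIF = 1 / (1 - Rj^2), with Rj^2 from regressing descriptor j on the others."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("Rj^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)

# Hypothetical descriptor columns, for illustration only.
d1 = [1.0, 2.0, 3.0, 4.0, 5.0]
d2 = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(d1, d2)
# With only two descriptors, Rj^2 is the squared pairwise correlation.
print("r =", round(r, 3), "VIF =", round(vif(r ** 2), 3))
```

Under this definition, the upper acceptance limit of VIF = 5.0 corresponds to Rj2 = 0.8.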
Finally, to confirm the robustness of the model, the Y-randomization test was performed in this study. The dependent variable vector is randomly shuffled and a new model is constructed. If the new model gives significantly lower values for both R2train and Q2LOO statistics compared to the original model, the original generated model is not considered as resulting from a chance correlation [18]. The results of ten Y-randomization tests are summarized in Table 5. As can be seen, all new R2train and Q2LOO values were much lower than those of the original model. Thereby, the good results for the MLR model were not due to a chance correlation or structural dependency of the training set.
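The Y-randomization procedure can be sketched as follows. This is a simplified illustration that assumes a single descriptor and an ordinary least-squares line rather than the five-descriptor MLR model; the data, the function names and the random seed are all hypothetical.

```python
import random

def r_squared(x, y):
    """R^2 of a one-descriptor least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

def y_randomization(x, y, n_trials=10, seed=0):
    """Shuffle the response vector n_trials times and refit each time."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        y_shuffled = list(y)
        rng.shuffle(y_shuffled)
        scores.append(r_squared(x, y_shuffled))
    return scores

# Hypothetical descriptor values and pEC50-like responses.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [5.1, 5.4, 5.9, 6.2, 6.4, 6.9, 7.1, 7.6]

original = r_squared(x, y)
shuffled = y_randomization(x, y)
print("original R2:", round(original, 3))
print("largest R2 after shuffling:", round(max(shuffled), 3))
```

If the shuffled models score well below the original, the original fit is unlikely to be a chance correlation. The same comparison would be repeated for Q2LOO.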

2.1.2. Model Applicability Domain Analysis for the MLR Model

To evaluate the generalization ability of the generated model, the applicability domain (AD) was defined by a Williams plot. In the Williams plot, standardized residuals are plotted against leverage values to detect both structurally influential chemicals (X outliers) and response outliers (Y outliers) [19]. The leverage value h is defined as:
$$h_i = x_i^{T}\,(X^{T}X)^{-1}\,x_i \quad (i = 1, 2, \ldots, n)$$
where x_i is the descriptor row vector of compound i, X is the matrix of descriptor values of the training set and n is the number of training compounds. The superscript "T" denotes the transpose of the matrix or vector [19,20]. A compound whose leverage value h is higher than the threshold value h* (calculated as 3(m + 1)/n, where m is the number of model parameters and n is the number of training compounds) is considered an X outlier. In addition, a value of ±3.0 standard deviation units is widely used as the cut-off for defining Y outliers.
In this study, the Williams plot for the MLR model is shown in Figure 2. No Y outliers were found in the investigated data set. Five molecules (compounds 18, 20, 26, 32 and 37) had a leverage value higher than the warning leverage limit (0.581), but their predicted values were very satisfactory, with standardized residuals within ±1.0 standard deviation units. Hence, these molecules did not degrade the fitting performance of the model. On the contrary, compounds with high leverage that are well fitted by the model help to stabilize it and are not considered X outliers, which further supports the reliability of the predictions and the good generalization ability of the generated model [19].
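The leverage calculation and the warning limit can be reproduced with a small sketch. It assumes that X carries a leading column of ones for the intercept, which the threshold 3(m + 1)/n suggests but the text does not state; the three-row matrix is hypothetical. With m = 5 descriptors and n = 31 training compounds, the threshold evaluates to about 0.581, the warning limit quoted above.

```python
def solve(a, b):
    """Solve a @ z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def leverages(X):
    """h_i = x_i^T (X^T X)^-1 x_i for every row x_i of X."""
    p = len(X[0])
    xtx = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
    result = []
    for x in X:
        z = solve(xtx, x)  # z = (X^T X)^-1 x_i
        result.append(sum(a * b for a, b in zip(x, z)))
    return result

def warning_leverage(m, n):
    """Threshold h* = 3(m + 1)/n for m model descriptors and n training compounds."""
    return 3.0 * (m + 1) / n

# Hypothetical design matrix: intercept column plus one descriptor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
h = leverages(X)
h_star = warning_leverage(5, 31)
print([round(v, 3) for v in h], round(h_star, 3))
```

The leverages sum to the number of columns of X, which gives a quick sanity check on any implementation.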
As the above results show, the MLR model was predictive, reliable and robust. It can be used to predict the FXR agonist activity of new AAD.
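As an illustration of how the reported MLR equation would be applied to a new compound, the sketch below evaluates it with the published coefficients. The descriptor values in `example` are hypothetical and do not correspond to any compound in Table 3. Because the coefficients are rounded (for instance, 0.016 for ASA), the output should be read as approximate.

```python
# Coefficients and intercept as reported for the MLR model.
COEFFICIENTS = {
    "ASA": 0.016,
    "RPC": 14.001,
    "SlogP_VSA2": -0.049,
    "b_rotN": 0.362,
    "opr_leadlike": 0.318,
}
INTERCEPT = -9.717

def predict_pec50(descriptors):
    """Evaluate the linear model for one compound given its five descriptor values."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * descriptors[name] for name in COEFFICIENTS
    )

# Hypothetical descriptor values, for illustration only.
example = {
    "ASA": 800.0,
    "RPC": 0.15,
    "SlogP_VSA2": 20.0,
    "b_rotN": 8,
    "opr_leadlike": 1,
}
print(round(predict_pec50(example), 3))
```

Such predictions are only meaningful for compounds inside the applicability domain defined by the Williams plot.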

2.1.3. Interpretation of the Descriptors

The MLR model comprised five descriptors (b_rotN, RPC, opr_leadlike, SlogP_VSA2 and ASA), which capture physicochemical features of AAD that govern FXR agonist activity. The relative importance of the descriptors was judged from their standardized regression coefficients shown in Table 1 [19], giving the order ASA > SlogP_VSA2 > b_rotN > RPC > opr_leadlike. ASA is the water accessible surface area calculated using a radius of 1.4 Å for the water molecule. This showed that the water accessible surface area of FXR agonists might influence their agonist activity. Its positive coefficient indicated that highly polar groups tend to increase the agonist activity. This can be seen by comparing compounds 9 (having 3-cyanophenyl with pEC50 = 6.638) and 10 (having 3-methoxyphenyl with pEC50 = 6.420), or 1 (having 3-carboxyphenyl with pEC50 = 6.553) and 8 (having acetylphenyl with pEC50 = 6.319), in Table 3. SlogP_VSA2 is a subdivided surface area descriptor based on the sum of the approximate accessible van der Waals surface area, calculated for each atom whose contribution to the log of the octanol/water partition coefficient lies in the range (−0.2, 0). Its negative coefficient indicated that high hydrophobicity tended to decrease the agonist activity. Indeed, the bioactivity of molecules with aliphatic chains at region A in Table 3, such as 34 (with pEC50 of 5.602) or 35, 36, 37 and 38 (with pEC50 of 5.066–5.357), was lower than that of molecules without aliphatic chains, such as 1 (with pEC50 of 6.553) or 39 and 40 (with pEC50 of 5.824–6.000). b_rotN is the number of rotatable single bonds. Its positive coefficient indicated that a larger b_rotN was favorable for FXR agonist activity. For instance, the bioactivity of compounds 15 and 16, or 35, 36 and 38, varied in the order 16 (having b_rotN of 10) > 15 (having b_rotN of 9), or 38 (having b_rotN of 11) > 36 (having b_rotN of 10) > 35 (having b_rotN of 9). RPC is a relative negative partial charge descriptor that depends on the partial charge of each atom of a chemical structure. The positive sign of this descriptor indicated that a higher relative negative partial charge of the molecule was favorable for the agonist activity. This can be seen by comparing molecules 13 (having –CONH2 headgroup with pEC50 = 7.131) and 1 (having –COOH headgroup with pEC50 = 6.553). opr_leadlike belongs to the atom count and bond count descriptors and refers to the number of violations of Oprea's lead-like test. The positive contribution of this descriptor indicated that a high value of opr_leadlike was beneficial to the bioactivity.
Therefore, high polar groups such as the acidic headgroup together with high values of RPC and b_rotN are favorable for FXR agonist activity. Further, the aliphatic chain has a negative effect on it.

2.2. 3D-QSAR Models

2.2.1. CoMFA Analysis

To graphically visualize the key structural features that contribute to enhanced FXR agonist activity, CoMFA models were derived. The results of the CoMFA studies are listed in Table 4. The optimum number of components and the filtering value for the CoMFA models were four and six, respectively, which were chosen by selecting the highest Q2LOO value. The generated CoMFA model gave a Q2LOO value of 0.802 (>0.5) with four components (RMSELOO = 0.383). The non-cross-validated PLS analysis with the four components resulted in R2train of 0.944, F value of 109.711, RMSEtrain of 0.203 and R2test of 0.892. The contributions of the steric and electrostatic fields calculated by the CoMFA model were 49.7% and 50.3% of the variance, respectively. The high R2train, Q2LOO and F values, along with the low RMSEtrain, indicated the satisfactory predictive ability of the derived model (Table 4). The pEC50 values predicted by the CoMFA model are listed in Table 3. Figure 1B shows the correlation between experimental and predicted pEC50 values for the CoMFA model.

2.2.2. CoMFA Contour Maps

The steric and electrostatic contour maps derived by the CoMFA model based on the reference molecule (compound 30) are shown in Figure 3. The steric interactions are represented by green and yellow contours, while electrostatic interactions are represented by red and blue contours. In the green regions of the steric contour plot, bulky substituents enhance biological activity, while in the yellow regions they are likely to decrease the activity. Blue contours represent regions where positive charge increases activity, whereas red contours represent regions where negative charge enhances activity [17]. The three regions A, B, and C of compound 30 are depicted in Figure 4.
As shown in Figure 3A, there are two large yellow contours near regions A and C, one medium yellow contour near region B and one small yellow contour near region A, indicating that the bioactivity of molecules would be influenced by the introduction of bulky groups near these regions. According to Table 3, this can be explained by a comparison between molecules 21 (having –CH3 group at position 4 of region A with pEC50 = 7.347) and 32 (having –OCH3 at position 4 of region A with pEC50 = 7.328). The small yellow contour near position 6 of region A also suggested that the agonist activity would be decreased by the introduction of bulky groups here, such as compounds (18, 20, 19, 1) where the use of bulky groups (–OCH3 > –Cl > –F > –H) resulted in lower pEC50 values (5.328 < 5.959 < 6.319 < 6.553). A medium yellow contour near R2 at region B indicated that bulky groups here would cause lower activity. This can be observed by the comparison of molecules 27 (substituted by –CH3 with pEC50 value of 7.367) and 32 (substituted by –OCH3 with pEC50 value of 7.060), where the volume of –CH3 was smaller than –OCH3. This can also be observed by a comparison of compounds 25 (substituted by –Cl with pEC50 value of 7.328) and 29 (substituted by –Br with pEC50 value of 7.319). Another large yellow contour near region C showed that bulky groups at positions 3 and 5 of region C would lead to lower activity. For instance, the agonist activity of compounds 35 or 40 (substituted by naphthalen-2-yl) and 34 or 1 (substituted by 4-tert-butylphenyl) was varied in the order: 35 < 34 or 40 < 1. One small green contour near positions 2 and 3 of region A indicated that FXR agonist activity would be enhanced by introduction of bulky groups here. This can be observed by comparing molecules 17 (having –CH3 group at position 2 of region A) and 1 (having –H group at position 2 of region A), where using a bulky group influenced the outcome of pEC50 values (7.377 > 6.553). 
This can also be observed by comparing the bioactivity of molecules 16 and 15, where the bulkier group (–CH2CH2COOH > –CH2COOH) at position 3 of region A led to a higher pEC50 value (7.194 > 6.377). Another two small green contours near the R3 substituent groups at region C showed the favorable effect of bulky groups here in increasing the biological activity of compounds. For instance, this can be seen by comparing the activity of compounds 1 (substituted by –C(CH3)3 with pEC50 value of 6.553) and 2 (substituted by –CF3 with pEC50 value of 5.161) or compounds 34 (substituted by –C(CH3)3 with pEC50 value of 5.602) and 33 (substituted by –CH2CH3 group with pEC50 value of 5.237).
As can be seen from Figure 3B, there was one medium blue contour near position 2 and R1 of region A, which showed the favorable effect of electron-donating groups in increasing the biological activity of compounds. For instance, this can be seen by comparing molecules 17 (having –CH3 group at position 2 of region A with pEC50 value of 7.377) and 1 (having –H group at position 2 of region A with pEC50 value of 6.553). It can also be seen by comparing the activity of compounds 13, 1 and 8, where electron-donating substituents at R1 (–NH2 > –OH > –CH3) resulted in higher pEC50 values (7.131 > 6.553 > 6.319).

2.3. Design of New Partial FXR Agonists

2.3.1. Chemical Structure Design

Based on the contour maps generated by the 3D-QSAR model, the structural requirements for substituents at regions A, B and C were summarized. Bulky groups with lower electronegativity at position 2 and at the R1 substituent of region A, together with bulky groups at R3 of region C, were considered to enhance the FXR agonist activity. In contrast, bulky groups at positions 4 and 6 of region A, at R2 of region B, and at positions 3 and 5 of region C would decrease the agonist activity. On this basis, new compounds were designed as potent FXR agonists and are listed in Table 6. The CoMFA model was used to evaluate the effect of each substituent on the predicted activity. The newly designed compounds showed that bulky groups with lower electronegativity at R1 of region A had positive effects. This can be observed by comparing compound N3 (having –N(CH3)2 at R1 of region A with predicted pEC50 value of 8.322) with template compound T30 (having –OH at R1 of region A with predicted and actual pEC50 values of 8.175 and 8.097, respectively). The next step was to adjust the functional group at R2 of region B, where steric effects were observed. The addition of a less bulky group (such as –CH3) at R2 of region B (Table 6) led to better predicted agonist activity (compound N2 with predicted pEC50 value of 8.323). Then, to investigate steric effects at R3 of region C, different bulky groups were tried. This can be observed by comparing compounds N1, N2 and N4 (having –C(CF3)3, –C(CH3)3 and –CI3 at R3 of region C with predicted pEC50 values of 8.350, 8.323 and 8.304, respectively). Finally, to observe the effect of adding groups with lower electronegativity at position 2 of region A, donor groups were investigated. Using a donor group substituent (–H < –CH3 < –OH) at position 2 of region A led to an increase in the predicted pEC50 values: N1 (having –H with pEC50 = 8.350), N7 (having –CH3 with pEC50 = 8.374) and N9 (having –OH with pEC50 = 8.388). Among the designed compounds, N9 presented the highest predicted activity, with a pEC50 value of 8.388. To understand the origin of this increase in activity, compounds T30 and N3 can be compared.

2.3.2. Molecular Docking Study

The newly designed molecules were promising in terms of their chemical structures, physicochemical properties and predicted biological activity. Hence, molecular docking as implemented in the Molecular Operating Environment (MOE2008.10, Chemical Computing Group, Montreal, QC, Canada) was applied to study their binding modes and important interactions.
Prior to docking, the crystal structure of FXR complexed with a benzimidazole-based partial agonist ligand was downloaded from the Protein Data Bank (PDB: 3OLF). The protein was protonated using the AMBER99 force field. A set of possible conformations of the nine newly designed molecules was prepared with the conformation generation function of MOE. Molecular docking was then carried out with the following parameters: the binding site was defined by the ligand atom mode; triangle matcher was used as the placement method; two rescoring methods were computed, with London dG selected as rescoring 1 and affinity selected as rescoring 2; and force field was used for refinement [21].
A critical factor that determines the effectiveness of a docking program is its ability to reproduce ligand poses in the receptor as close as possible to those found in X-ray structures [22]. In this docking study, the root-mean-square deviation (RMSD) between the complexed ligand and the redocked ligand was 0.5749 Å, suggesting that the docking protocol was suitable and reliable. Docking results are listed in Table 6. The newly designed compounds had higher docking scores for FXR than the known template compound T30, in agreement with the 2D- and 3D-QSAR results. The best docked orientations of the compounds are shown in Figure 5, showing that the newly designed compounds can be docked well into the ligand binding site of FXR. The best docked conformation of the most active compound N9, shown in Figure 6A, revealed that the perfluoroalkyl-substituted group at region C enabled strong hydrophobic interactions with Met369, Leu291, Trp458, Met454, Ile361, Leu455 and Phe333 in the active site of FXR, and that the compound formed two H-bonds with Arg335 and Tyr373. Comparative molecular docking of compound N9 and the complexed ligand, shown in Figure 6, indicated that the former had a better binding score than the latter, suggesting that hydrophobic interactions between groups at region C and these amino acids played a dominant role in the FXR agonist activity. Thus, the hydrophobic interaction of groups at region C seems to stabilize the compound within the binding site, contributing to greater activity.
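The redocking check rests on a heavy-atom RMSD between two poses in the same receptor frame. A minimal sketch is given below; it assumes that both poses list the same atoms in the same order and that no superposition is applied, and the coordinates are hypothetical rather than taken from PDB 3OLF.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses given as lists of (x, y, z)."""
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and contain the same atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(pose_a))

# Hypothetical coordinates in angstroms: a crystal pose and a redocked pose
# displaced by 1 angstrom along x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
redocked = [(x + 1.0, y, z) for x, y, z in crystal]

print(rmsd(crystal, crystal), rmsd(crystal, redocked))
```

A redocking RMSD below about 2 Å is commonly taken as a successful pose reproduction, so the reported 0.5749 Å lies well within that convention.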

3. Experimental Section

3.1. Data Set

The structures and biological activities of 41 AAD as FXR agonists used for the QSAR analyses were taken from Merk et al. [8,9] and are listed in Table 3. The agonist activity (EC50) value was converted to the logarithmic-scale pEC50 value (pEC50 = −log10 EC50), which was taken as the dependent variable for the QSAR study. To establish a reliable model, the data set was randomly divided into two subsets: a training set of 31 compounds (approximately 75% of the data) that represented a wide range of structures, and a test set of 10 compounds (approximately 25%) that followed the distribution of the activity values of the training set [14]. The training set was used to build the models, while the test set, marked by an asterisk in Table 3, was used to evaluate the predictive ability of the final training set model.
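The EC50-to-pEC50 conversion can be sketched as below. The sketch assumes that EC50 is expressed in mol/L before taking the logarithm, which is the usual convention but is not stated explicitly in the text; the function names are illustrative.

```python
import math

def pec50_from_molar(ec50_molar):
    """pEC50 = -log10(EC50), with EC50 in mol/L."""
    if ec50_molar <= 0:
        raise ValueError("EC50 must be positive")
    return -math.log10(ec50_molar)

def pec50_from_nanomolar(ec50_nm):
    """Convenience wrapper for EC50 values reported in nM."""
    return pec50_from_molar(ec50_nm * 1e-9)

# Example values chosen for illustration.
print(round(pec50_from_molar(1e-6), 3))      # 1 uM
print(round(pec50_from_nanomolar(8.0), 3))   # 8 nM
```

On this scale, a tenfold gain in potency raises pEC50 by exactly one unit, which is why the regression is fitted to pEC50 rather than to EC50 itself.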

3.2. 2D-QSAR Studies

3.2.1. Descriptors Calculation

All 2D structures of the molecules in Table 3 were sketched, and their 3D structures were subjected to energy minimization using the Merck molecular force field (MMFF) method with a convergence criterion of 0.01 kcal/mol and partial atomic charges. The final geometry optimization of each energy-minimized structure was carried out by stochastic conformational search. Then, only the lowest energy conformer of each structure was used to calculate 327 descriptors with the QuaSAR module of MOE [23]. The calculated descriptors fall into three classes: 2D descriptors, which use the atom and connectivity information of the molecules; internal 3D (i3D) descriptors, which use 3D coordinate information about each molecule; and external 3D (x3D) descriptors, which use 3D coordinate information with an absolute frame of reference. All the above processes were performed using the MOE2008.10 package.

3.2.2. Stepwise Multiple Linear Regression (SW-MLR)

In this study, stepwise selection combined with MLR (SW-MLR) was employed to select the most relevant descriptors for model construction. Stepwise regression combines forward and backward selection. It retains statistically meaningful variables that appreciably reduce the residual sum of squares, as checked by the Fisher test [24]. Different MLR models are therefore derived during this procedure. The best MLR equation is chosen using statistical parameters such as the squared correlation coefficient (R2), the root mean square error (RMSE), and the Fisher statistic [25]. The best MLR model should have a high R2 and Fisher statistic, and a low RMSE.
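The Fisher statistic used to compare candidate MLR equations can be computed directly from R2, the number of compounds and the number of descriptors. The sketch below uses the standard overall F-test formula for a regression with an intercept, which is assumed (not confirmed by the text) to be the form used here.

```python
def f_statistic(r2, n_compounds, n_descriptors):
    """Overall F value of a linear regression with an intercept.

    F = (R^2 / p) / ((1 - R^2) / (n - p - 1))
    """
    df_model = n_descriptors
    df_resid = n_compounds - n_descriptors - 1
    if df_resid <= 0:
        raise ValueError("need more compounds than fitted parameters")
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

# Reported training statistics of the five-descriptor MLR model.
f_value = f_statistic(0.935, 31, 5)
print(round(f_value, 2))
```

With the rounded R2train of 0.935, this gives roughly 71.9, close to the reported Ftrain of 72.353. The small gap is plausibly due to rounding of R2, and both values are far above the critical value F0.005(5,25) = 4.43.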

3.3. 3D-QSAR Studies

3.3.1. Molecular Alignment

The 3D-QSAR model was constructed by CoMFA embedded in SYBYL 6.9. Because the prediction accuracy of CoMFA is highly dependent on the structural alignment of the molecules with a reference compound, the selection of the template molecule plays an important role in performing CoMFA. Generally, the lowest energy conformer of the most active compound is selected as a template molecule for superimposition of all other compounds [26]. Therefore, compound 30 (as shown in Figure 4) was identified as a reference molecule in view of its highest activity, and all of the remaining compounds were aligned on it to derive CoMFA models. The structures of the aligned molecules are demonstrated in Figure 7.

3.3.2. CoMFA Modeling

After the molecules were aligned within a lattice that extended 4 Å beyond the aligned molecules in all directions, an sp3-hybridized carbon probe atom with a charge of +1.0 and a van der Waals radius of 1.52 Å was used to generate the steric and electrostatic fields. The steric and electrostatic energies were truncated at the default cut-off value of 30 kcal/mol. A partial least-squares (PLS) method, an extension of multiple regression analysis, was applied to calculate the minimal set of grid points and then linearly correlate the CoMFA fields with the pEC50 values to generate the CoMFA model [27].

3.4. Model Validation

The predictive ability and reliability of the 2D- and 3D-QSAR models were evaluated by internal and external validation. Leave-one-out (LOO) cross-validation is often considered the best way to internally validate the quality of derived models [28]. LOO builds a series of models, each omitting one compound from the training set and predicting it from the remaining compounds, so that all available information is used. Generally, a model is considered acceptable when the LOO cross-validated correlation coefficient (Q2LOO) exceeds a threshold of 0.5 [29]. In addition, external validation is essential for evaluating the generalization performance of the proposed model [25,30]. Statistical parameters such as the root mean square error (RMSEtest) and the squared correlation coefficient (R2test) of the external test set were calculated to assess the performance of the model [31].
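The LOO procedure can be sketched for a one-descriptor least-squares line. This is a simplification of the multi-descriptor and PLS models used in the study. It assumes Q2LOO = 1 − PRESS/SS, with SS taken about the mean of the full training set, and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def r2_fit(x, y):
    """Non-cross-validated R^2 of the fitted line."""
    a, b = fit_line(x, y)
    my = sum(y) / len(y)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - my) ** 2 for yi in y)
    return 1.0 - rss / tss

def q2_loo(x, y):
    """Leave-one-out Q^2: refit without compound i, then predict compound i."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    tss = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / tss

# Hypothetical descriptor values and pEC50-like responses.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [5.1, 5.4, 5.9, 6.2, 6.4, 6.9, 7.1, 7.6]
print(round(r2_fit(x, y), 3), round(q2_loo(x, y), 3))
```

For ordinary least squares, Q2LOO computed this way can never exceed the fitted R2, so a large gap between the two is a warning sign of overfitting.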
All algorithms were written in MATLAB 8.0 and run on a computer [Intel(R) Pentium(R), 2.00 GB RAM].

4. Conclusions

In this paper, a three-level in silico approach was applied to investigate important structural and physicochemical aspects of highly potent partial FXR agonists. Initially, 2D-QSAR (SW-MLR) and 3D-QSAR (CoMFA) studies were performed on forty-one AAD. Satisfactory results were obtained with both methods. The best 2D-QSAR model derived by SW-MLR can explain 93.5% of the variance in activity with a low RMSE of 0.219. In addition, the 2D-QSAR study demonstrated that b_rotN, RPC, opr_leadlike, SlogP_VSA2 and ASA of the molecules were highly correlated with the FXR agonist activity. Meanwhile, the best 3D-QSAR model also presented high predictive ability (R2train = 0.944, RMSEtrain = 0.203, Q2LOO = 0.802, R2test = 0.892), with a slightly better training-set fit than the 2D-QSAR model. The contour maps derived from the 3D-QSAR model suggested the significant structural features (steric and electronic effects) required for improving biological activity. Specifically, bulky groups with lower electronegativity at position 2 and at the R1 substituent of region A, together with bulky groups at R3 of region C, were considered to enhance the FXR agonist activity, whereas bulky groups at positions 4 and 6 of region A, at R2 of region B, and at positions 3 and 5 of region C would decrease it. Therefore, the obtained 2D- and 3D-QSAR models could provide valuable guidance for the future design of new potent partial FXR agonists with an anthranilic acid skeleton in the drug discovery process. Finally, nine newly designed AAD with predicted pEC50 values higher than that of the known template compound were docked to the ligand binding domain of FXR. The molecular binding patterns and docking scores of these nine molecules also suggested that they can be robust and potent partial FXR agonists, in agreement with the QSAR results. The docking also revealed that the hydrophobic interaction of groups at region C with Met369, Leu291, Trp458, Met454, Ile361, Leu455 and Phe333 seemed to stabilize the compound within the binding site, contributing to greater activity. To the best of our knowledge, this work constitutes the first in silico study of AAD as partial FXR agonists.

Acknowledgments

The authors thank Ji Zhiliang for his technical help. This work was supported by the National Natural Science Foundation program (81503497), the Fujian Provincial Natural Science Fund subject (2015J01340), the Key Discipline Special Program of Fujian University of Traditional Chinese Medicine (X2014012), the Fujian Provincial Traditional Chinese Medicine Science and Technology Program (wzrk201303) and the Fujian Education Department Class A Science and Technology Fund (JA14155).

Author Contributions

Meimei Chen contributed to the analysis of the study and manuscript writing; Xuemei Yang, Xinmei Lai, Jie Kang, Huijuan Gan and Yuxing Gao helped perform the analysis.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Düfer, M.; Hörth, K.; Wagner, R.; Schittenhelm, B.; Prowald, S.; Wagner, T.F.J.; Oberwinkler, J.; Lukowski, R.; Gonzalez, F.J.; Krippeit-Drews, P.; et al. Bile acids acutely stimulate insulin secretion of mouse β-cells via farnesoid X receptor activation and KATP channel inhibition. Diabetes 2012, 6, 1479–1489.
  2. Nijmeijer, R.M.; Gadaleta, R.M.; van Mil, S.W.C.; van Bodegraven, A.A.; Crusius, J.B.; Dijkstra, G.; Hommes, D.W.; de Jong, D.J.; Stokkers, P.C.F.; Verspaget, H.W.; et al. Farnesoid X receptor (FXR) activation and FXR genetic variation in inflammatory bowel disease. PLoS ONE 2011, 8, e23745.
  3. Hollman, D.A.A.; Milona, A.; van Erpecum, K.J.; van Mil, S.W.C. Anti-inflammatory and metabolic actions of FXR: Insights into molecular mechanisms. Biochim. Biophys. Acta 2012, 11, 1443–1452.
  4. Akwabi-Ameyaw, A.; Bass, J.Y.; Caldwell, R.D.; Caravella, J.A.; Chen, L. FXR agonist activity of conformationally constrained analogs of GW4064. Bioorg. Med. Chem. Lett. 2009, 19, 4733–4739.
  5. Merk, D.; Steinhilber, D.; Schubert-Zsilavecz, M. Characterizing ligands for farnesoid X receptor available in vitro test systems for farnesoid X receptor modulator development. Expert Opin. Drug Discov. 2014, 1, 27–37.
  6. Singh, N.; Yadav, M.; Singh, A.K.; Kumar, H.; Dwivedi, S.K. Synthetic FXR agonist GW4064 is a modulator of multiple G protein-coupled receptors. Mol. Endocrinol. 2014, 5, 659–673.
  7. Flatt, B.; Martin, R.; Wang, T.L.; Mahaney, P.; Murphy, B. Discovery of XL335 (WAY-362450), a highly potent, selective, and orally active agonist of the farnesoid X receptor (FXR). J. Med. Chem. 2009, 4, 904–907.
  8. Merk, D.; Gabler, M.; Gomez, R.C.; Flesch, D.; Hanke, T.; Kaiser, A. Anthranilic acid derivatives as novel ligands for farnesoid X receptor (FXR). Bioorg. Med. Chem. 2014, 22, 2447–2460.
  9. Merk, D.; Lamers, C.; Ahmad, K.; Gomez, R.C.; Schneider, G. Extending the structure-activity relationship of anthranilic acid derivatives as farnesoid X receptor modulators: Development of a highly potent partial farnesoid X receptor agonist. J. Med. Chem. 2014, 57, 8035–8055.
  10. Putz, M.V.; Lacrămă, A.M. Introducing spectral structure activity relationship (S-SAR) analysis. Application to ecotoxicology. Int. J. Mol. Sci. 2007, 8, 363–391.
  11. Putz, M.V.; Putz, A.M.; Lazea, M.; Ienciu, L.; Chiriac, A. Quantum-SAR extension of the spectral-SAR algorithm. Application to polyphenolic anticancer bioactivity. Int. J. Mol. Sci. 2009, 10, 1193–1214.
  12. Davis, G.D.J.; Vasanthi, A.H.R. QSAR based docking studies of marine algal anticancer compounds as inhibitors of protein kinase B (PKBb). Eur. J. Pharm. Sci. 2015, 76, 110–118.
  13. Noguera, G.J.; Fabian, L.E.; Lombardo, E.; Finkielsztein, L. QSAR study and conformational analysis of 4-arylthiazolylhydrazones derived from 1-indanones with anti-Trypanosoma cruzi activity. Eur. J. Pharm. Sci. 2015, 78, 190–197.
  14. Roy, K.; Paul, S. Docking and 3D-QSAR studies of acetohydroxy acid synthase inhibitor sulfonylurea derivatives. J. Mol. Model. 2010, 16, 951–964.
  15. Putz, M.V.; Dudas, N.A.; Isvoran, A. Double variational binding—(SMILES) conformational analysis by docking mechanisms for anti-HIV pyrimidine ligands. Int. J. Mol. Sci. 2015, 16, 19553–19601.
  16. Goodarzi, M.; Jensen, R.; Heyden, Y.V. QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions. J. Chromatogr. B 2012, 910, 84–94.
  17. Jalali-Heravi, M.; Asadollahi-Baboli, M.; Shahbazikhah, P. QSAR study of heparanase inhibitors activity using artificial neural networks and Levenberge-Marquardt algorithm. Eur. J. Med. Chem. 2008, 43, 548–556.
  18. Zhao, L.; Xiang, Y.; Song, J.; Zhang, Z. A novel two-step QSAR modeling work flow to predict selectivity and activity of HDAC inhibitors. Bioorg. Med. Chem. Lett. 2013, 23, 929–933.
  19. Wei, Y.; Xi, L.; Yao, X.; Li, J.Z.; Wu, X. Quantitative structure-activity relationship analysis of a series of human renal organic anion transporter inhibitors. Arch. Pharm. Chem. Life Sci. 2012, 345, 759–766.
  20. Masand, V.H.; Mahajan, D.T.; Alafeefy, A.M.; Bukhari, S.N.A.; Elsayed, N.N. Optimization of antiproliferative activity of substituted phenyl 4-(2-oxoimidazolidin-1-yl) benzenesulfonates: QSAR and CoMFA analyses. Eur. J. Pharm. Sci. 2015, 77, 230–237.
  21. Khedr, M.A.; Shehata, T.M.; Mohamed, M.E. Repositioning of 2,4-dichlorophenoxy acetic acid as a potential anti-inflammatory agent: In silico and pharmaceutical formulation study. Eur. J. Pharm. Sci. 2014, 65, 130–138.
  22. Makhuri, F.R.; Ghasemi, J.B. Computer-aided scaffold hopping to identify a novel series of casein kinase 1 δ (CK1d) inhibitors for amyotrophic lateral sclerosis. Eur. J. Pharm. Sci. 2015, 78, 151–162.
  23. Chen, M.M.; Lai, X.M.; Yang, X.M. A QSAR classification study on inhibitory activities of 2-arylbenzoxazoles against cholesteryl ester transfer protein. Med. Chem. Res. 2014, 23, 1878–1886.
  24. Saghaie, L.; Shahlaei, M.; Fassihi, A.; Madadkar-Sobhani, A.; Gholivand, M.B.; Pourhossein, A. QSAR analysis for some diaryl-substituted pyrazoles as CCR2 Inhibitors by GA-stepwise MLR. Chem. Biol. Drug Des. 2011, 77, 75–85.
  25. Vrontaki, E.; Melagraki, G.; Mavromoustakos, T.; Afantitis, A. Exploiting ChEMBL database to identify indole analogues as HCV replication inhibitors. Methods 2015, 71, 4–13.
  26. Fortin, S.; Wei, L.; Moreau, E.; Lacroix, J.; Cote, M.F.; Petitclerc, E.; Kotra, L.P.; Gaudreault, R.C. Substituted phenyl 4-(2-oxoimidazolidin-1-yl)benzenesulfonamides as antimitotics. Antiproliferative, antiangiogenic and antitumoral activity, and quantitative structure–activity relationships. Eur. J. Med. Chem. 2011, 46, 5327–5342.
  27. Wang, Y.; Wu, M.; Ai, C.; Wang, Y. Insight into the structural determinants of imidazole scaffold-based derivatives as TNF-α release inhibitors by in silico explorations. Int. J. Mol. Sci. 2015, 16, 20118–20138.
  28. Zhang, J.; Han, B.; Wei, X.; Tan, C.; Chen, Y.; Jiang, Y. A two-step target binding and selectivity support vector machines approach for virtual screening of dopamine receptor subtype-selective ligands. PLoS ONE 2012, 7, e39076.
  29. Wegner, J.K.; Frohlich, H.; Zell, A. Feature selection for descriptor based classification models. Theory and GA-SEC algorithm. J. Chem. Inf. Comput. Sci. 2004, 44, 921–930.
  30. Golbraikh, A.; Tropsha, A. Beware of q2. J. Mol. Graph. Model. 2002, 20, 269–276.
  31. Shahlaei, M.; Fassihi, A. QSAR analysis of some 1-(3,3-diphenylpropyl)-piperidinyl amides and ureas as CCR5 inhibitors using genetic algorithm-least square support vector machine. Med. Chem. Res. 2013, 22, 4384–4400.
Figure 1. Plots of experimental against predicted pEC50 values by (A) multiple linear regression (MLR) and (B) CoMFA models.
Figure 2. The Williams plot for the MLR model.
Figure 3. Contour maps of the CoMFA model: (A) steric field based on compound 30; (B) electrostatic field based on compound 30. Color values specify the CoMFA field levels that enclose volumes within which increase or decrease in bulk or positive charge favor higher dependent values.
Figure 4. Structure of template compound (compound 30). The three regions A, B and C are depicted.
Figure 5. The best docked conformations and poses of newly designed compounds in the ligand binding domain of FXR.
Figure 6. The 2D representation of docking of compounds N9 (A) and complexed ligand (B) into the FXR active site.
Figure 7. Alignment of training and test set compounds on compound 30. Baby blue, red, blue, green, gray and yellow signify hydrogen atom, oxygen atom, nitrogen atom, fluorine atom, carbon atom and sulfur atom, respectively.
Table 1. Selected descriptors of multiple linear regression.
| Descriptor | Chemical Meaning | Coefficient | VIF | Standardized Coefficient |
|---|---|---|---|---|
| b_rotN | Number of rotatable single bonds | 0.362 | 2.888 | 0.408 |
| RPC | Relative negative partial charges | 14.001 | 1.171 | 0.393 |
| opr_leadlike | One if and only if the number of violations of Oprea's lead-like test is <2, otherwise zero | 0.318 | 1.271 | 0.136 |
| SlogP_VSA2 | The subdivided surface area descriptor, which is based on the sum of the approximate accessible van der Waals surface area | −0.049 | 3.728 | −0.567 |
| ASA | Water accessible surface area calculated using a radius of 1.4 Å for the water molecule | 0.016 | 1.539 | 0.848 |
| Constant | - | −9.717 | | |
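The coefficients in Table 1 define a linear equation for the predicted pEC50. A minimal Python sketch of that equation follows; the function name and the dictionary layout are hypothetical. Because the tabulated coefficients are rounded to three decimals (which matters most for ASA, whose values are large), predictions from this sketch may deviate from the 2D-Pred column reported for the SW-MLR model.

```python
# Coefficients and intercept as listed in Table 1 (rounded values).
COEFFICIENTS = {
    "b_rotN": 0.362,
    "RPC": 14.001,
    "opr_leadlike": 0.318,
    "SlogP_VSA2": -0.049,
    "ASA": 0.016,
}
INTERCEPT = -9.717

def predict_pec50(descriptors):
    """Evaluate the linear model: intercept + sum(coefficient * descriptor)."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * descriptors[name] for name in COEFFICIENTS
    )
```

A call takes one value per descriptor, for example `predict_pec50({"b_rotN": ..., "RPC": ..., "opr_leadlike": ..., "SlogP_VSA2": ..., "ASA": ...})`, with the descriptor values computed by the same software and settings used to build the model.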
Table 2. The correlation matrix of descriptors.
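| Descriptor | b_rotN | RPC | opr_leadlike | SlogP_VSA2 | ASA |
|---|---|---|---|---|---|
| b_rotN | 1.000 | 0.359 | 0.137 | 0.715 | −0.032 |
| RPC | 0.359 | 1.000 | −0.010 | 0.215 | −0.289 |
| opr_leadlike | 0.137 | −0.010 | 1.000 | 0.298 | −0.461 |
| SlogP_VSA2 | 0.715 | 0.215 | 0.298 | 1.000 | −0.366 |
| ASA | −0.032 | −0.289 | −0.461 | −0.366 | 1.000 |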
Table 3. Molecular structures and corresponding experimental and predicted pEC50 values of the AAD as partial FXR agonists.
| No. | R1 | R2 | R3 | EC50 (µM) | pEC50 | 2D-Pred (SW-MLR) | 3D-Pred (CoMFA) |
|---|---|---|---|---|---|---|---|
| 1 | 3-carboxyphenyl | H | 4-tert-butylphenyl | 0.28 | 6.553 | 6.784 | 6.704 |
| 2 * | 3-carboxyphenyl | H | 4-(trifluoromethyl)phenyl | 6.9 | 5.161 | 5.653 | 5.171 |
| 3 | 3-carboxyphenyl | H | 4-bromophenyl | 3.7 | 5.432 | 5.451 | 5.485 |
| 4 | 3-carboxyphenyl | H | benzo[d][1,3]dioxol-5-yl | 10 | 5.000 | 5.336 | 5.133 |
| 5 * | 3-carboxyphenyl | H | 2,3-dihydrobenzo[b][1,4]dioxin-6-yl | 4.9 | 5.310 | 5.714 | 5.133 |
| 6 | 3-carboxyphenyl | H | 3-fluoro-4-(trifluoromethyl)phenyl | 5 | 5.301 | 5.232 | 5.358 |
| 7 * | 3-carboxyphenyl | H | styryl | 5.2 | 5.284 | 4.850 | 5.639 |
| 8 | 3-acetylphenyl | H | 4-tert-butylphenyl | 0.48 | 6.319 | 6.592 | 6.775 |
| 9 | 3-cyanophenyl | H | 4-tert-butylphenyl | 0.23 | 6.638 | 6.619 | 6.506 |
| 10 | 3-methoxyphenyl | H | 4-tert-butylphenyl | 0.38 | 6.420 | 6.661 | 6.327 |
| 11 | 3-(methylthio)phenyl | H | 4-tert-butylphenyl | 0.2 | 6.699 | 6.616 | 6.655 |
| 12 | 3-(1H-tetrazol-5-yl)phenyl | H | 4-tert-butylphenyl | 2.9 | 5.538 | 5.854 | 5.695 |
| 13 | 3-carbamoylphenyl | H | 4-tert-butylphenyl | 0.074 | 7.131 | 7.203 | 7.099 |
| 14 * | 3,4-dimethoxyphenyl | H | 4-tert-butylphenyl | 0.071 | 7.149 | 7.771 | 6.606 |
| 15 | 3-(carboxymethyl)phenyl | H | 4-tert-butylphenyl | 0.42 | 6.377 | 6.493 | 6.414 |
| 16 | 3-(2-carboxyethyl)phenyl | H | 4-tert-butylphenyl | 0.064 | 7.194 | 7.143 | 6.999 |
| 17 | 2-methyl-3-carboxyphenyl | H | 4-tert-butylphenyl | 0.042 | 7.377 | 7.115 | 7.266 |
| 18 | 2-methoxy-5-carboxyphenyl | H | 4-tert-butylphenyl | 4.7 | 5.328 | 5.307 | 5.444 |
| 19 | 2-fluoro-5-carboxyphenyl | H | 4-tert-butylphenyl | 0.48 | 6.319 | 6.148 | 6.361 |
| 20 * | 2-chloro-5-carboxyphenyl | H | 4-tert-butylphenyl | 1.1 | 5.959 | 6.569 | 5.759 |
| 21 | 3-carboxy-4-methylphenyl | H | 4-tert-butylphenyl | 0.045 | 7.347 | 7.209 | 6.934 |
| 22 * | 3-carboxy-4-methoxyphenyl | H | 4-tert-butylphenyl | 0.047 | 7.328 | 7.650 | 7.563 |
| 23 | 3-carboxy-4-chlorophenyl | H | 4-tert-butylphenyl | 0.28 | 6.553 | 6.845 | 6.931 |
| 24 | 3-carboxy-4-bromophenyl | H | 4-tert-butylphenyl | 0.15 | 6.824 | 6.937 | 6.889 |
| 25 | 3-carboxyphenyl | chloro | 4-tert-butylphenyl | 0.047 | 7.328 | 6.862 | 6.868 |
| 26 * | 4-carboxymethylphenyl | H | naphthalen-2-yl | 3.1 | 5.509 | 5.492 | 5.082 |
| 27 | 3-carboxy-4-methylphenyl | methyl | 4-tert-butylphenyl | 0.043 | 7.367 | 7.048 | 7.510 |
| 28 | 3-carboxyphenyl | methyl | 4-tert-butylphenyl | 0.061 | 7.215 | 7.334 | 6.942 |
| 29 | 3-carboxyphenyl | bromo | 4-tert-butylphenyl | 0.048 | 7.319 | 7.160 | 7.241 |
| 30 | 3-carboxyphenyl | methoxy | 4-tert-butylphenyl | 0.008 | 8.097 | 7.888 | 8.175 |
| 31 | 3-carboxy-4-methylphenyl | chloro | 4-tert-butylphenyl | 0.11 | 6.959 | 7.178 | 7.027 |
| 32 * | 3-carboxy-4-methylphenyl | methoxy | 4-tert-butylphenyl | 0.087 | 7.060 | 7.861 | 6.858 |
| 33 | 3-carboxypropyl | H | 4-ethylphenyl | 5.8 | 5.237 | 5.058 | 5.253 |
| 34 | 3-carboxypropyl | H | 4-tert-butylphenyl | 2.5 | 5.602 | 5.748 | 5.885 |
| 35 | 3-carboxypropyl | H | naphthalen-2-yl | 8.6 | 5.066 | 4.598 | 4.817 |
| 36 | 4-carboxybutyl | H | naphthalen-2-yl | 8.3 | 5.081 | 5.086 | 4.795 |
| 37 * | 3-methoxy-3-oxopropyl | H | naphthalen-2-yl | 7.1 | 5.149 | 5.377 | 4.857 |
| 38 | 5-carboxypentyl | H | naphthalen-2-yl | 4.4 | 5.357 | 5.657 | 5.416 |
| 39 | 4-carboxyphenyl | H | naphthalen-2-yl | 1.0 | 6.000 | 5.882 | 5.902 |
| 40 | 3-carboxyphenyl | H | naphthalen-2-yl | 1.5 | 5.824 | 5.750 | 5.990 |
| 41 * | 4-carboxybenzyl | H | naphthalen-2-yl | 1.3 | 5.886 | 6.722 | 5.405 |

* denotes the test set compounds.
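The EC50 and pEC50 columns of Table 3 are related by the usual negative logarithm of the molar concentration. The short Python sketch below illustrates the conversion, assuming EC50 is given in µM as in the table; the function name is hypothetical.

```python
import math

def pec50_from_ec50_um(ec50_um):
    """pEC50 = -log10(EC50 in mol/L) = 6 - log10(EC50 in µM)."""
    return 6.0 - math.log10(ec50_um)

# Spot checks against Table 3 (values rounded to three decimals).
print(round(pec50_from_ec50_um(0.008), 3))  # compound 30 -> 8.097
print(round(pec50_from_ec50_um(0.28), 3))   # compound 1  -> 6.553
```

Both spot checks agree with the tabulated pEC50 values, which supports the reading that the activities were converted from µM to molar units before taking the logarithm.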
Table 4. Statistical parameters obtained using the MLR and CoMFA models.
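| Model | R2train | RMSEtrain | F | Q2LOO | RMSELOO | R2test | RMSEtest |
|---|---|---|---|---|---|---|---|
| MLR | 0.935 | 0.219 | 72.353 | 0.899 | 0.299 | 0.902 | 0.534 |
| CoMFA-1 | 0.944 | 0.203 | 109.711 | 0.802 | 0.383 | 0.892 | 0.330 |

R2train, RMSEtrain, F, Q2LOO and RMSELOO refer to the training set; R2test and RMSEtest refer to the test set.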
Table 5. R2train and Q2LOO values after Several Y-randomization tests.
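| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| R2train | 0.150 | 0.145 | 0.188 | 0.08 | 0.123 | 0.179 | 0.144 | 0.186 | 0.141 | 0.131 |
| Q2LOO | 0.013 | 0.058 | 0.016 | 0.039 | 0.014 | 0.012 | 0.105 | 0.000 | 0.005 | 0.014 |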
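A Y-randomization test of the kind summarized in Table 5 refits the model after shuffling the activities, so that any remaining fit reflects chance correlation only. The Python sketch below illustrates the idea under stated assumptions: ordinary least squares stands in for the authors' regression, the function names are hypothetical, and ten trials are used to mirror the table.

```python
import numpy as np

def r2_train(X, y):
    """Training-set R2 of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

def y_randomization(X, y, n_trials=10, seed=0):
    """R2 values obtained after randomly permuting the activity vector."""
    rng = np.random.default_rng(seed)
    return [r2_train(X, rng.permutation(y)) for _ in range(n_trials)]
```

If the R2 values returned by `y_randomization` are all far below the R2 of the model fitted to the unshuffled activities, as is the case for the values in Table 5 relative to R2train = 0.935, the original model is unlikely to be a product of chance correlation.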
Table 6. Chemical Structures of Newly Designed partial FXR agonists based on 3D-QSAR Models.
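| Name | R1 | Substituent at Position 2 | R2 | R3 | Predicted pEC50 (SW-MLR) | Predicted pEC50 (CoMFA) | Docking Score |
|---|---|---|---|---|---|---|---|
| T30 | OH | H | OCH3 | C(CH3)3 | 7.888 | 8.175 | −10.176 |
| N1 | N(CH3)2 | H | CH3 | C(CF3)3 | 9.032 | 8.350 | −14.053 |
| N2 | N(CH3)2 | H | CH3 | C(CH3)3 | 8.274 | 8.323 | −11.081 |
| N3 | N(CH3)2 | H | OCH3 | C(CH3)3 | 8.626 | 8.322 | −10.716 |
| N4 | N(CH3)2 | H | CH3 | CI3 | 8.744 | 8.304 | −12.038 |
| N5 | N(CH3)2 | OH | CH3 | C(CH3)3 | 8.260 | 8.360 | −11.602 |
| N6 | N(CH3)2 | OH | CH3 | CI3 | 8.828 | 8.357 | −12.073 |
| N7 | N(CH3)2 | CH3 | CH3 | C(CF3)3 | 9.123 | 8.374 | −14.193 |
| N8 | N(CH3)2 | NH2 | CH3 | C(CF3)3 | 8.816 | 8.378 | −14.335 |
| N9 | N(CH3)2 | OH | CH3 | C(CF3)3 | 9.024 | 8.388 | −14.347 |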
R1, R2, R3 represent substituent groups, respectively.

Share and Cite

MDPI and ACS Style

Chen, M.; Yang, X.; Lai, X.; Kang, J.; Gan, H.; Gao, Y. Structural Investigation for Optimization of Anthranilic Acid Derivatives as Partial FXR Agonists by in Silico Approaches. Int. J. Mol. Sci. 2016, 17, 536. https://doi.org/10.3390/ijms17040536
