Article

QSAR Study of Antimicrobial 3-Hydroxypyridine-4-one and 3-Hydroxypyran-4-one Derivatives Using Different Chemometric Tools

Department of Medicinal Chemistry, Faculty of Pharmacy, Isfahan University of Medical Sciences, 81746-73461, Isfahan, Iran
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2008, 9(12), 2407-2423; https://doi.org/10.3390/ijms9122407
Submission received: 23 September 2008 / Revised: 18 October 2008 / Accepted: 24 November 2008 / Published: 2 December 2008
(This article belongs to the Section Physical Chemistry, Theoretical and Computational Chemistry)

Abstract

A series of 3-hydroxypyridine-4-one and 3-hydroxypyran-4-one derivatives was subjected to quantitative structure-antimicrobial activity relationship (QSAR) analysis. A collection of chemometric methods, including factor analysis-based multiple linear regression (FA-MLR), principal component regression (PCR) and partial least squares combined with a genetic algorithm for variable selection (GA-PLS), was employed to relate structural parameters to antimicrobial activity. The results revealed the significant role of topological parameters in the antimicrobial activity of the studied compounds against S. aureus and C. albicans. The most significant QSAR model, obtained by GA-PLS, could explain and predict 96% and 91% of the variance in the pMIC data for compounds tested against S. aureus, and 91% and 87% of the variance for compounds tested against C. albicans, respectively.

1. Introduction

Quantitative structure-activity relationship (QSAR) studies, one of the most important areas in chemometrics, give information that is useful for molecular design and medicinal chemistry [1–5]. QSAR models are mathematical equations that relate chemical structures to biological activities. These models can also provide deeper insight into the mechanism of biological activity. The first step of a typical QSAR study is to find a set of molecular descriptors with the greatest impact on the biological activity of interest [6–9]. A wide range of descriptors has been used in QSAR modeling. These descriptors have been classified into different categories, including constitutional, geometrical, topological, quantum chemical and so on. There are several variable selection methods, including multiple linear regression (MLR), genetic algorithm (GA), partial least squares (PLS), principal component or factor analysis (PCA/FA), and so on [7–9]. MLR yields models that are simpler and easier to interpret than PCR and PLS, because the latter methods perform regression on latent variables that do not have a direct physical meaning. Because of the collinearity problem in MLR analysis, collinear descriptors may have to be removed before an MLR model is developed. MLR equations can describe structure-activity relationships well, but some information is discarded in MLR analysis. On the other hand, factor analysis-based methods such as PLS regression can handle collinear descriptors, and therefore better predictive models are obtained by the PLS method [10].
It is almost 120 years since physicians revealed that the coincidence of blood and bacteria in a wound may cause a life-threatening infection. It has also been shown that blood or hemoglobin enhances the lethality of intraperitoneal or subcutaneous inocula of bacteria such as Escherichia coli. The effective component of hemoglobin is iron, and various soluble iron compounds exert an equivalent effect [11]. Administration of iron compounds to the host can increase the virulence of Escherichia coli, Listeria monocytogenes, Salmonella typhimurium and other pathogens [12]. In fact, iron is an essential element required for the growth and virulence of virtually all microbial pathogens [13, 14]. The availability of iron is critically important in host-parasite interactions [15]. Vertebrate hosts withhold iron from microbial invaders as a major defence mechanism against infection [13, 15]. This is achieved by sequestration of iron with iron-binding proteins, the most abundant of which are haemoproteins [16]. Some natural antibiotics, called siderophores, are low-molecular-weight chelating agents that form stable complexes with iron [17, 18]. There are many reports of the antimicrobial activity of chelating agents with different chemical structures [19–22]. Kojic acid (5-hydroxy-2-hydroxymethyl-pyran-4-one) and its 3-hydroxypyranone derivatives are examples of these compounds [19]. The bidentate chelating ligand 3-hydroxypyranone, which has a catechol-like function, forms stable complexes with several metal ions such as Fe3+. In vitro antibacterial and antifungal activities of 3-hydroxypyridinones, bioisosteric derivatives of 3-hydroxypyranones with metal-chelating ability, have been described. They have an inhibitory effect on the growth of Escherichia coli, Listeria innocua and Staphylococcus aureus [22]. More recently, antibacterial and antifungal activities of carboxamide derivatives of 3-hydroxypyranones, 5-hydroxypyranones and 5-hydroxypyridinones have been reported [23, 24].
Few reports of antimicrobial studies of 3-hydroxypyridine-4-one and 3-hydroxypyran-4-one derivatives are available [19, 21–25], and in those the compounds were not the subject of QSAR studies. Preliminary QSAR models for a series of such derivatives have been investigated by Fassihi et al. [25]. In that preliminary study, the antimicrobial activity against C. albicans, S. aureus and P. aeruginosa was the subject of MLR analysis. The MLR models revealed the best relationship between antimicrobial activity and structural properties for S. aureus and C. albicans. In the present paper, more than 600 topological, geometrical, constitutional, functional group, electrostatic, quantum and chemical descriptors were calculated, and different methods were applied to develop QSAR equations for the antimicrobial activity of the studied compounds against S. aureus and C. albicans. These methods were: (i) genetic algorithm-partial least squares (GA-PLS), (ii) MLR with factor analysis as the data pre-processing step for variable selection (FA-MLR) and (iii) principal component regression analysis (PCRA). The correlation coefficient (r), standard error of regression (SE), r2cv (Q2) and RMScv (STD(r)) were employed to judge the validity of the regression equations.
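The validation statistics named above can be illustrated with a short sketch. The formulas used here are the conventional ones and are an assumption, since the text does not spell them out: R2 = 1 − SSE/SStot, SE = sqrt(SSE/(n − k − 1)), Q2 = 1 − PRESS/SStot and RMScv = sqrt(PRESS/n). The function name and arguments are hypothetical.

```python
from math import sqrt

def regression_stats(y_obs, y_fit, y_cv, n_params):
    """Return R2, SE, Q2 and RMScv for one regression model.

    y_obs    : observed activities (e.g. pMIC)
    y_fit    : values fitted by the model built on all compounds
    y_cv     : leave-one-out cross-validated predictions
    n_params : number of predictors, intercept excluded (assumed convention)
    """
    n = len(y_obs)
    mean = sum(y_obs) / n
    ss_tot = sum((y - mean) ** 2 for y in y_obs)              # total sum of squares
    sse = sum((y - f) ** 2 for y, f in zip(y_obs, y_fit))     # fitting error
    press = sum((y - p) ** 2 for y, p in zip(y_obs, y_cv))    # cross-validation error
    return {
        "R2": 1 - sse / ss_tot,
        "SE": sqrt(sse / (n - n_params - 1)),
        "Q2": 1 - press / ss_tot,
        "RMScv": sqrt(press / n),
    }
```

Because PRESS is computed from predictions for compounds left out of the fit, Q2 is normally lower than R2, and a large gap between the two is a common sign of over-fitting.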

2. Experimental Section

2.1. Software

The two-dimensional structures of the molecules were drawn using the Hyperchem 7.0 software. The final geometries were obtained with the semi-empirical AM1 method in the Hyperchem program. The molecular structures were optimized using the Polak-Ribiere algorithm until the root mean square gradient was 0.01 kcal mol−1. The resulting geometry was transferred into the Dragon program package, which was developed by the Milano Chemometrics and QSAR Group [26]. The z-matrix of the structures was provided by the software and transferred to the Gaussian 98 program. Complete geometry optimization was performed taking the most extended conformation as the starting geometry. Semi-empirical molecular orbital calculation (AM1) of the structures was performed using the Gaussian 98 program [27]. MATLAB software (version 7.1, The MathWorks Inc.) was used for the PLS regression method.

2.2. Data set and descriptor generation

The biological data used in this study are the antimicrobial activities (in terms of −log MIC) of a set of 3-hydroxypyridine-4-one and 3-hydroxypyran-4-one derivatives [23, 24, 25]. The structural features of these compounds are listed in Table 1; the activity values were used as the dependent variable in the subsequent QSAR analysis. A large number of molecular descriptors were calculated using Hyperchem, the Dragon package and Gaussian 98. Some chemical parameters, including molecular volume (V), molecular surface area (SA), hydrophobicity (LogP), hydration energy (HE) and molecular polarizability (MP), were calculated using the Hyperchem software. The Dragon software calculated different functional group, topological, geometrical and constitutional descriptors for each molecule.
Gaussian 98 was employed for the calculation of different quantum chemical descriptors, including dipole moment (DM), local charges, and HOMO and LUMO energies. Hardness (η), softness (S), electronegativity (χ) and electrophilicity (ω) were calculated according to the method proposed by Thanikaivelan et al. [28].
Constitutional, topological, geometrical, functional group, quantum and physicochemical indices were used in this study; a brief description of some of them is given in Table 2.
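As an illustration of how the global reactivity descriptors follow from the frontier orbital energies, the sketch below assumes the usual Koopmans-type finite-difference definitions, χ = −(E_HOMO + E_LUMO)/2 and η = (E_LUMO − E_HOMO)/2, together with S = 1/η and ω = χ²/2η. The function name and the example energies are hypothetical and are not taken from the data set.

```python
def reactivity_indices(e_homo, e_lumo):
    """Global reactivity descriptors from HOMO and LUMO energies (same unit, e.g. eV).

    Assumed definitions (Koopmans-type approximation):
      electronegativity  chi   = -(E_HOMO + E_LUMO) / 2
      hardness           eta   =  (E_LUMO - E_HOMO) / 2
      softness           S     =  1 / eta
      electrophilicity   omega =  chi**2 / (2 * eta)
    """
    chi = -0.5 * (e_homo + e_lumo)
    eta = 0.5 * (e_lumo - e_homo)
    return {
        "chi": chi,
        "eta": eta,
        "S": 1.0 / eta,
        "omega": chi ** 2 / (2.0 * eta),
    }

# Hypothetical orbital energies in eV, for illustration only.
example = reactivity_indices(e_homo=-9.0, e_lumo=-1.0)
```

With these definitions the hardness is positive whenever the LUMO lies above the HOMO, which is a quick sanity check on the sign conventions.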

2.3. Data screening and model building

The calculated descriptors were collected in a data matrix in which the rows and columns corresponded to the molecules and descriptors, respectively. Genetic algorithm-partial least squares (GA-PLS), MLR with factor analysis as the data pre-processing step for variable selection (FA-MLR) and principal component regression analysis (PCRA) were used to derive the QSAR equations. For GA-PLS, feature selection was performed with a genetic algorithm (GA). Genetic algorithms are efficient methods for function minimization; in the descriptor selection context, the quantity being optimized is the prediction error of the model built upon a set of features [29].
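The idea of descriptor selection by a genetic algorithm can be sketched as follows. This is a generic, minimal GA (truncation selection, one-point crossover, bit-flip mutation, elitism) and not the implementation used in this work; the function `ga_select`, its parameters and the toy fitness function are all hypothetical. In a real application the fitness would be the cross-validated prediction error of a model built on the descriptors switched on in the mask.

```python
import random

def ga_select(n_features, fitness, pop_size=30, generations=50,
              p_mut=0.05, seed=0):
    """Minimise fitness(mask) over binary masks of length n_features."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        ranked = sorted(pop, key=fitness)
        parents = ranked[: pop_size // 2]        # truncation selection
        children = [ranked[0][:]]                # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation switches descriptors on or off at random
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = min(pop + [best], key=fitness)
    return best, fitness(best)

# Toy fitness for illustration: distance to an arbitrary "ideal" mask.
target = [1, 0, 1, 0, 0, 1, 0, 0]

def toy_error(mask):
    return sum(m != t for m, t in zip(mask, target))

mask, err = ga_select(len(target), toy_error)
```

Because the best mask is carried over unchanged in every generation, the returned error can never be worse than that of the best member of the initial population.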
In this study, genetic algorithm-partial least squares (GA-PLS) was employed to model the structure-antimicrobial activity relationships better [30, 31]. Partial least squares (PLS) linear regression is a recent technique that generalizes and combines features from principal component analysis and multiple regression. PLS is suitable for overcoming the problems in MLR related to multicollinear or over-abundant descriptors [10].
Application of the PLS method thus allows the construction of larger QSAR equations while still avoiding over-fitting and eliminating most variables. This method is normally used in combination with cross-validation to obtain the optimum number of components [32, 33]. The PLS regression method used was the NIPALS-based algorithm available in the chemometrics toolbox of MATLAB (version 7.1, The MathWorks Inc.). A leave-one-out cross-validation procedure was used to obtain the optimum number of factors based on the Haaland and Thomas F-ratio criterion [34].
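The combination of PLS and leave-one-out cross-validation can be sketched in plain Python. The code below is a minimal single-response PLS (PLS1) with NIPALS-style deflation, written for illustration only; it is not the MATLAB toolbox routine, and all function names and the toy data are hypothetical. For simplicity it selects the number of latent variables with the lowest PRESS, which is a simplification of the Haaland and Thomas F-ratio criterion.

```python
from math import sqrt

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pls1_fit(X, y, n_comp):
    """Fit a PLS1 model with up to n_comp latent variables."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # centred X
    f = [v - y_mean for v in y]                                 # centred y
    comps = []
    for _ in range(n_comp):
        w = [_dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = sqrt(_dot(w, w))
        if norm < 1e-12:                  # nothing left to model
            break
        w = [v / norm for v in w]         # weight vector
        t = [_dot(E[i], w) for i in range(n)]                   # scores
        tt = _dot(t, t)
        load = [_dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = _dot(f, t) / tt               # inner regression coefficient
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps of the fit."""
    x_mean, y_mean, comps = model
    e = [xi - m for xi, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = _dot(e, w)
        y_hat += q * t
        e = [ei - t * lj for ei, lj in zip(e, load)]
    return y_hat

def loo_press(X, y, n_comp):
    """Leave-one-out PRESS for a given number of latent variables."""
    press = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        press += (y[i] - pls1_predict(model, X[i])) ** 2
    return press

# Toy data, for illustration only: y depends exactly on two descriptors.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8], [6, 4]]
y = [1 + 2 * a - b for a, b in X]
press_by_comp = {a: loo_press(X, y, a) for a in (1, 2)}
best = min(press_by_comp, key=press_by_comp.get)
```

In this toy case two latent variables reproduce the exact linear relation, so the PRESS for two components is numerically zero and the minimum-PRESS rule selects two.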
In our previous study, classical multiple regression was used to develop the QSAR relationship [25]. Here, FA-MLR was also performed on the dataset. Factor analysis (FA) was used to reduce the number of variables and to detect structure in the relationships between them. This data-processing step is applied to identify the important predictor variables and to avoid collinearities among them [35]. Principal component regression analysis (PCRA) was also tried on the dataset along with FA-MLR. With PCRA, collinearities among X variables are not a disturbing factor, and the number of variables included in the analysis may exceed the number of observations [36]. In this method, factor scores, as obtained from FA, are used as the predictor variables [35]. In PCRA, all descriptors are assumed to be important, whereas the aim of factor analysis is to identify the relevant descriptors.

3. Results and Discussion

3.1. GA-PLS

In PLS analysis, the descriptor data matrix is decomposed into orthogonal matrices with an inner relationship between the dependent and independent variables. Therefore, unlike in MLR analysis, the multicollinearity problem in the descriptors is avoided. Because a minimal number of latent variables is used for modeling, PLS copes with noisy data better than MLR. A genetic algorithm was used to find the most suitable set of descriptors for PLS modeling. To do so, many different GA-PLS runs were conducted using different initial populations. The data set (compounds tested against S. aureus, n = 31) was divided into two groups: a calibration set (n = 25) and a prediction set (n = 6). With the 25 calibration samples, the leave-one-out cross-validation procedure was used to find the optimum number of latent variables for each PLS model. The GA-PLS model with the best fitness contained 17 indices, 5 of which were also obtained by MLR. The PLS estimates of the coefficients for these descriptors are given in Figure 1.
As can be seen, a combination of quantum, topological, geometrical, constitutional and functional group descriptors was selected by GA-PLS to account for the antimicrobial activity of the studied compounds. The majority of these descriptors are topological indices. The resulting GA-PLS model possessed very high statistical quality (R2 = 0.96 and Q2 = 0.91). The pMIC values predicted by the PLS model (obtained from cross-validation or from the external prediction set), along with the corresponding relative errors of prediction (REP), are shown in Table 3. The small relative errors confirm the accuracy of the proposed GA-PLS model for modeling the antimicrobial activity of the studied compounds.
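The relative error of prediction is not defined explicitly in the text. The tabulated values in Table 3 are reproduced, to within rounding, by REP = 100 × (predicted − experimental) / predicted, and the sketch below uses that inferred definition as an assumption; the function name is hypothetical.

```python
def relative_error_of_prediction(experimental, predicted):
    """REP in percent, using the definition inferred from Table 3 (assumption)."""
    return 100.0 * (predicted - experimental) / predicted

# Two rows of Table 3 (compounds 5 and 12) as a check on the inferred formula.
rep_5 = relative_error_of_prediction(4.19, 3.7498)
rep_12 = relative_error_of_prediction(3.59, 3.7254)
```

A negative REP therefore means that the model under-predicts the experimental pMIC, and a positive REP that it over-predicts it.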
The data set (compounds tested against C. albicans, n = 28) was again divided into two groups: a calibration set (n = 23) and a prediction set (n = 5). With the 23 calibration samples, the leave-one-out cross-validation procedure was used to find the optimum number of latent variables for each PLS model. Here, the best GA-PLS model contained 15 indices, five of which were also obtained by MLR. The PLS estimates of the coefficients for these descriptors are given in Figure 2. As can be seen, a combination of quantum, topological, geometrical and functional group descriptors was selected by GA-PLS to account for the antimicrobial activity of the compounds. The majority of these descriptors are again topological indices. The resulting GA-PLS model possessed very high statistical quality (R2 = 0.91 and Q2 = 0.87). The pMIC values predicted by the PLS model, along with the corresponding REPs, are shown in Table 4. The small relative errors confirm the accuracy of the proposed GA-PLS model for modeling the antimicrobial activity of the studied compounds.

3.2. FA-MLR and PCRA

Table 5 shows the five factor loadings of the variables (after VARIMAX rotation) for the compounds tested against S. aureus. As can be seen, about 79% of the variance in the original data matrix could be explained by the selected four factors.
Based on the procedure explained in the experimental section, the following three-parameter equation was derived:
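$$\mathrm{pMIC} = 4.786\,(\pm 0.484) + 0.196\,(\pm 0.063)\,\mathrm{DMy} + 0.1666\,(\pm 0.063)\,\mathrm{nCONHR} - 0.130\,(\pm 0.058)\,\mathrm{PJI3} \tag{1}$$
$$R^2 = 0.73,\quad \mathrm{S.E.} = 0.31,\quad F = 11.41,\quad Q^2 = 0.68,\quad \mathrm{RMS_{cv}} = 0.34,\quad N = 31$$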
Equation 1 could explain 73% of the variance and predict 68% of the variance in pMIC data. This equation describes the effect of geometrical (PJI3), functional group (nCONHR) and quantum (DMy) indices on antimicrobial activity.
When factor scores were used as the predictor parameters in a multiple regression equation using forward selection method (PCRA), the following equation was obtained:
$$\mathrm{pMIC} = 3.756\,(\pm 0.036) + 0.4000\,(\pm 0.036)\,f_3 \tag{2}$$
$$R^2 = 0.81,\quad \mathrm{S.E.} = 0.19,\quad F = 35.05,\quad Q^2 = 0.79,\quad \mathrm{RMS_{cv}} = 0.20,\quad N = 31$$
Equation 2 also shows high equation statistics (81% explained variance and 79% predicted variance in the pMIC data). Since factor scores are used instead of selected descriptors, and each factor score contains information from different descriptors, loss of information is avoided and the quality of the PCRA equation is better than that of the equation derived from FA-MLR.
As can be seen from Table 5, for each factor the loading values of some descriptors are much higher than those of the others. These high loadings indicate which descriptors contribute most of the information carried by that factor. It should be noted that all factors contain information from all descriptors, but the contributions of a descriptor to the different factors are not equal. For example, factors 1 and 2 have higher loadings for topological, constitutional and functional group indices, whereas information about quantum and functional group descriptors is mainly incorporated in factors 3 and 4. Therefore, the significance of the original variables for modeling the activity can be obtained from the factor scores used in Equation 2. Factor score 1 indicates the importance of Mv (constitutional), HNar and IDDE (topological) and nCaH (functional group). Factor score 2 indicates the importance of RBN and Me (constitutional descriptors). Factor scores 3 and 4 signify the importance of DMy and nCONHR (quantum and functional group descriptors, respectively).
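Reading the dominant descriptors off a rotated loading matrix can be sketched as below. The loading values are invented for illustration and are not those of Table 5; the 0.7 cut-off and the function name are likewise assumptions.

```python
def dominant_descriptors(loadings, threshold=0.7):
    """Map each factor (1-based) to the descriptors with |loading| >= threshold."""
    n_factors = len(next(iter(loadings.values())))
    groups = {k: [] for k in range(1, n_factors + 1)}
    for name, row in loadings.items():
        for k, value in enumerate(row, start=1):
            if abs(value) >= threshold:
                groups[k].append(name)
    return groups

# Hypothetical rotated loadings (rows: descriptors, columns: factors 1 to 3).
toy_loadings = {
    "Mv":   [0.91, 0.10, 0.05],
    "HNar": [0.85, -0.20, 0.12],
    "RBN":  [0.15, 0.88, -0.07],
    "DMy":  [0.02, 0.11, -0.93],
}
groups = dominant_descriptors(toy_loadings)
```

The absolute value is used because a large negative loading marks a descriptor as strongly, though inversely, associated with the factor.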
Table 6 shows the five factor loadings of the variables (after VARIMAX rotation) for the compounds tested against C. albicans. As can be seen, about 80% of the variance in the original data matrix could be explained by the selected five factors.
Based on the procedure explained in the experimental section, the following four-parameter equation was derived:
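$$\mathrm{pMIC} = 5.980\,(\pm 0.695) + 0.182\,(\pm 0.022)\,\mathrm{piID} - 0.167\,(\pm 0.024)\,\mathrm{nCp} - 0.085\,(\pm 0.023)\,\mathrm{ASP} - 0.058\,(\pm 0.023)\,\mathrm{PW3} \tag{3}$$
$$R^2 = 0.81,\quad \mathrm{S.E.} = 0.17,\quad F = 34.76,\quad Q^2 = 0.79,\quad \mathrm{RMS_{cv}} = 0.18,\quad N = 28$$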
Equation 3 could explain and predict 85% and 81% of the variance in pMIC data, respectively. This equation describes the effect of topological (piID and PW3), functional group (nCp) and geometrical (ASP) indices on the antimicrobial activity.
When factor scores were used as the predictor parameters in a multiple regression equation using forward selection method (PCRA), the following equation was obtained:
$$\mathrm{pMIC} = 3.806\,(\pm 0.023) + 0.237\,(\pm 0.024)\,f_1 - 0.114\,(\pm 0.024)\,f_3 + 0.081\,(\pm 0.024)\,f_2 - 0.065\,(\pm 0.024)\,f_4 \tag{4}$$
$$R^2 = 0.83,\quad \mathrm{S.E.} = 0.12,\quad F = 38.05,\quad Q^2 = 0.81,\quad \mathrm{RMS_{cv}} = 0.12,\quad N = 28$$
Equation 4 also shows high equation statistics (88% explained variance and 83% predicted variance in the pMIC data). It should be noted that the variables (factor scores) used in Equation 4 are perfectly orthogonal to each other. Since factor scores are used instead of selected descriptors, and each factor score contains information from different descriptors, loss of information is avoided and the quality of the PCRA equation is better than that of the equation derived from FA-MLR.
As can be seen from Table 6, for each factor the loading values of some descriptors are much higher than those of the others. Factors 1 and 2 have higher loadings for topological, quantum and functional group indices, whereas information about geometrical, quantum and topological descriptors is mainly incorporated in factors 3, 4 and 5. Therefore, the significance of the original variables for modeling the activity can be obtained from the factor scores used in Equation 4. Factor score 1 indicates the importance of PW5, piID and electronegativity (topological and quantum descriptors). Factor score 2 indicates the importance of HOMO, nCp and nNR2 (quantum and functional group descriptors). Factor score 3 signifies the importance of ASP and L/Bw (geometrical descriptors), and factor scores 4 and 5 signify the importance of quantum and topological descriptors (DMz and PW3).
Comparison between the results obtained by GA-PLS and the other employed regression methods indicates higher accuracy of this method in describing antimicrobial activity of the studied compounds.
The difference in accuracy between the regression methods used in this study is visualized in Figures 3 and 4 by plotting the predicted activity (by cross-validation) against the experimental values. All linear models show scattering of the data around a straight line with slope and intercept close to one and zero, respectively. As can be seen, the plot obtained by GA-PLS shows the lowest scattering, whereas those obtained by FA-MLR and PCR analysis have lower accuracy. It should be mentioned that the model provided by the GA-PLS method is better than that provided by MLR analysis in our previous study [25]. In fact, MLR analysis could explain and predict 55% and 35% of the variance in the pMIC data for compounds tested against S. aureus, and 82% and 73% of the variance for compounds tested against C. albicans.

4. Conclusions

Quantitative relationships between molecular structure and inhibitory activity of a series of 3-hydroxypyridine-4-one and 3-hydroxypyran-4-one derivatives were established by a collection of chemometric methods including GA-PLS, FA-MLR and PCRA. The results revealed the significant role of topological parameters in the antimicrobial activity of the studied compounds against S. aureus and C. albicans. A comparison between the different statistical methods employed indicated that GA-PLS gave superior results: it could explain and predict 96% and 91% of the variance in the pMIC data for compounds tested against S. aureus, and 91% and 87% of the variance for compounds tested against C. albicans. The plot of the data obtained by GA-PLS showed the lowest scattering, and topological descriptors had the greatest impact.

Acknowledgments

This work was supported by Isfahan Pharmaceutical Sciences Research Center.

References and notes

  1. Schmidi, H. Multivariate Prediction for QSAR. Chemom. Intell. Lab. Sys 1997, 37, 125–134.
  2. Hansch, C; Kurup, A; Garg, R; Gao, H. Chem-bioinformatics and QSAR. A Review of QSAR Lacking Positive Hydrophobic Terms. Chem. Rev 2001, 101, 619–672.
  3. Wold, S; Trygg, J; Berglund, A; Antii, H. Some Recent Developments in the PLS Modeling. Chemom. Intell. Lab. Syst 2001, 58, 131–150.
  4. Hemmateenejad, B; Miri, R; Akhond, M; Shamsipur, M. QSAR Study of the Calcium Channel Antagonist Activity of some Recently Synthesized Dihydropyridine Derivatives. An Application of Genetic Algorithm for Variable Selection in MLR and PLS Methods. Chemom. Intell. Lab. Syst 2002, 64, 91–99.
  5. Hemmateenejad, B; Miri, R; Akhond, M; Shamsipur, M. Quantitative Structure Activity Relationship Study of Recently Synthesized 1,4- Dihydropyridine Calcium Channel Antagonists. Application of Hansch Analysis Methods. Arch. Pharm. Pharm. Med. Chem 2002, 10, 472–480.
  6. Horvath, D; Mao, B. Neighborhood Behavior. Fuzzy Molecular Descriptors and Their Influence on the Relationship between Structural Similarity and Property Similarity. QSAR. Comb. Sci 2003, 22, 498–509.
  7. Putta, S; Eksterowicz, J; Lemmen, C; Stanton, R. A Novel Subshape Molecular Descriptor. J. Chem. Inf. Comput. Sci 2003, 43, 1623–1635.
  8. Gupta, S; Singh, M; Madan, AK. Superpendentic Index: A Novel Topological Descriptor for Predicting Biological Activity. J. Chem. Inf. Comput. Sci 1999, 39, 272–277.
  9. Consonni, V; Todeschini, R; Pavan, M. Structure/Response Correlations and Similarity/Diversity Analysis by GETAWAY Descriptors. 2. Application of the Novel 3D Molecular Descriptors to QSAR/QSPR Studies. J. Chem. Inf. Comput. Sci 2002, 42, 693–705.
  10. Deeb, O; Hemmateenejad, B; Jaber, A; Garduno-Juarez, R; Miri, R. Effects of the Electronic and Physicochemical Parameters on the Carcinogenecis Activity of Some Sulfa Drug Using QSAR Analysis Based on Genetic-MLR & Genetic-PLS. Chemosphere 2007, 67, 2122–2130.
  11. Eaton, JW; Brandt, P; Mahoney, JR; Lee, JT, Jr. Haptoglobin: A Natural Bacteriostat. Science 1982, 215, 691–693.
  12. Jones, RL; Peterson, CM; Grady, RW; Kumbaraci, T; Cerami, A. Effects of Iron Chelators and Iron Overload on Salmonella Infection. Nature 1977, 267, 63–65.
  13. Weinberg, ED. Cellular Iron Metabolism in Health and Diseased. Drug Metab. Rev 1990, 22, 531–579.
  14. Weinberg, ED. Iron and Infection. Microbiol. Rev 1978, 42, 45–66.
  15. Weinberg, ED. Iron Withholding: A Defense Against Infection and Neoplasia. Physiol. Rev 1984, 64, 65–102.
  16. Skaar, EP; Gaspar, AH; Schneewind, O. Bacillus anthracis IsdG, a Heme-Degrading Monooxygenase. J. Bacteriol 2006, 188, 1071–1080.
  17. Neilands, JB. Siderophores: Structure and Function of Microbial Iron Transport Compounds. J. Biol. Chem 1995, 270, 26723–26726.
  18. Sebat, JL; Paszczynski, AJ; Cortese, MS; Crawford, RL. Antimicrobial Properties of Pyridine-2,6-Dithiocarboxylic Acid, a Metal Chelator Produced by Pseudomonas spp. Appl. Envir. Microbiol 2001, 67, 3934–3942.
  19. Weinberg, GA. Iron Chelators as Therapeutic Agents against Pneumocystis carinii. Antimicrob. Agents Chemother 1994, 38, 997–1003.
  20. van Asbeck, BS; Marcelis, JH; Marx, JJM; Struyvenberg, A; van Kats, JH; Verhoef, J. Inhibition of Bacterial Multiplication by the Iron Chelator Deferoxamine: Potentiating Effect of Ascorbic Acid. Eur. J. Clin. Microbiol. Infect. Dis 1983, 2, 426–431.
  21. Erol, DD; Yulug, N. Synthesis and Antimicrobial Investigation of Thiazolinoalkyl-4(1H)-pyridones. Eur. J. Med. Chem 1994, 29, 893–897.
  22. Min-Hua, F; van der Does, L; Bantjes, A. Iron (III)-Chelating Resins. 3. Synthesis, Iron (III)-Chelating Properties, and in vitro Antibacterial Activity of Compounds Containing 3-hydroxy-2-methyl-4(1H)-pyridinone Ligands. J. Med. Chem 1993, 36, 2822–2827.
  23. Aytemir, MD; Erol, DD; Hider, RC; Ozalp, M. Synthesis and Evaluation of Antimicrobial Activity of New 3-Hydroxy-6-methyl-4-oxo-4H-pyran-2- carboxamide Derivatives. Turk. J. Chem 2003, 27, 757–764.
  24. Aytemir, MD; Hider, RC; Erol, DD; Ozalp, M; Ekizoglu, M. Synthesis of New Antimicrobial Agents; Amide Derivatives of Pyranones and Pyridinones. Turk. J. Chem 2003, 27, 445–452.
  25. Fassihi, A; Abedi, D; Saghaie, L; Sabet, R; Fazeli, H; Bostaki, Gh; Deilami, O; Sadinpour, H. Synthesis, Antimicrobial Evaluation and QSAR Study of Some 3-hydroxypyridine-4- one and 3-hydroxypyran-4-one Derivatives. In Eur. J. Med. Chem; 2008; DOI:10.1016/j.emech.2008.10.022.
  26. Todeschini, R. Milano Chemometrics and QSPR Group, http://michem.disat.unimib.it/, accessed 9 September, 2008.
  27. Frisch, MJ; Trucks, MJ; Schlegel, HB; Scuseria, GE; Robb, MA; Cheeseman, JR; Zakrzewski, VG; Montgomery, JA; Stratmann, JR; Burant, JC; et al. Gaussian 98, Revision A.7; Gaussian, Inc: Pittsburgh PA, 1998.
  28. Roy, K. QSAR of Adenosine Receptor Antagonists II: Exploring Physicochemical Requirements for Selective Binding of 2-arylpyrazolo [3,4-c] quinoline Derivatives with Adenosine A1 and A3 Receptor Subtypes. QSAR. Comb. Sci 2003, 22, 614–621.
  29. Siedlecki, W; Sklansky, J. On Automatic Feature Selection. Int. J. Pattern Recog. Artif. Intell 1988, 2, 197–220.
  30. Leardi, R. Application of Genetic Algorithm-PLS for Feature Selection in Spectral Data Sets. J. Chemomtr 2000, 14, 643–655.
  31. Leardi, R; Gonzalez, AL. Genetic Algorithm Applied to Feature Selection in PLS Regression: How and When to Use Them. Chemom. Intell. Lab. Syst 1998, 41, 195–207.
  32. Fassihi, A; Sabet, R. QSAR Study of p56lck Protein Tyrosine Kinase Inhibitory Activity of Flavonoid Derivatives Using MLR and GA-PLS. Int. J. Mol. Sci 2008, 9, 1876–1892.
  33. Leardi, R. Genetic Algorithms in Chemometrics and Chemistry: A Review. J. Chemometrics 2001, 15, 559–569.
  34. Hemmateenejad, B. Optimal QSAR Analysis of the Carcinogenic Activity of Drugs by Correlation Ranking and Genetic Algorithm-Based. J. Chemometrics 2004, 18, 475–485.
  35. Franke, R; Gruska, A. Chemometrics Methods in Molecular Design. In Methods and Principles in Medicinal Chemistry; van Waterbeemd, H, Ed.; VCH: Weinheim, Germany, 1995; Volume 2, pp. 113–119.
  36. Kubinyi, H. The Quantitative Analysis of Structure-Activity Relationships. In Burger’s Medicinal Chemistry and Drug Discovery, 5th Ed.; Wolff, ME, Ed.; Wiley: New York, USA, 1995; Volume 1, pp. 506–509.
Figure 1. PLS regression coefficients for the variables used in GA-PLS model (against S. aureus).
Figure 2. PLS regression coefficients for the variables used in GA-PLS model (against C. albicans).
Figure 3. Plots of the cross-validated predicted activity against the experimental activity for the QSAR models obtained by different chemometrics methods (against S. aureus).
Figure 4. Plots of the cross-validated predicted activity against the experimental activity for the QSAR models obtained by different chemometrics methods (against C. albicans).
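Table 1. Chemical structure of the compounds used in QSAR analysis.
| Compound | X | R2 | R3 | R5 | R6 |
|---|---|---|---|---|---|
| 1 | NH | CH3 | OH | CH2-Ra | H |
| 2 | NH | C2H5 | OH | CH2-Ra | H |
| 3 | NH | CH3 | OH | CH2-N(CH3)2 | H |
| 4 | NH | C2H5 | OH | CH2-N(CH3)2 | H |
| 5 | NH | CH3 | OH | CH2-N(C2H5)2 | H |
| 6 | NH | C2H5 | OH | CH2-N(C2H5)2 | H |
| 7 | N-Ph | CH3 | OH | H | H |
| 8 | N-m-OH-Ph | CH3 | OH | H | H |
| 9 | N-C3H7 | CH3 | OH | H | H |
| 10 | N-C4H9 | CH3 | OH | H | H |
| 11 | O | CH2Cl | H | OH | H |
| 12 | O | CH3 | H | OH | H |
| 13 | O | CH2OH | OH | H | CH3 |
| 14 | O | CH2OH | OCH2Ph | H | CH3 |
| 15 | O | CHO | OCH2Ph | H | CH3 |
| 16 | O | COOH | OCH2Ph | H | CH3 |
| 17 | O | CONHRb | OCH2Ph | H | CH3 |
| 18 | O | CONHRc | OCH2Ph | H | CH3 |
| 19 | O | CONHRd | OCH2Ph | H | CH3 |
| 20 | O | CONHRb | OH | H | CH3 |
| 21 | O | CONHRc | OH | H | CH3 |
| 22 | O | CONHRd | OH | H | CH3 |
| 23 | O | CH2OH | H | OCH2Ph | H |
| 24 | O | COOH | H | OCH2Ph | H |
| 25 | O | CONHPh | H | OCH2Ph | H |
| 26 | N-CH3 | CONHPh | H | OCH2Ph | H |
| 27 | N-CH3 | CONHPh | H | OH | H |
| 28 | O | CONH-Re | H | OCH2Ph | H |
| 29 | N-CH3 | CONH-Re | H | OCH2Ph | H |
| 30 | N-CH3 | CONH-Re | H | OH | H |
| 31 | O | CH2OH | H | OH | H |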
Table 2. Brief description of some descriptors used in this study.
Table 2. Brief description of some descriptors used in this study.
| Descriptor Type | Molecular Description |
|---|---|
| Constitutional | Mean atomic van der Waals volume (Mv) (scaled on carbon atom), no. of heteroatoms, no. of multiple bonds (nBM), no. of rings, no. of circuits, no. of H-bond donors, no. of H-bond acceptors, no. of nitrogen atoms (nN), chemical composition, sum of Kier-Hall electrotopological states (Ss), mean atomic polarizability (Mp), number of rotatable bonds (RBN), mean atomic Sanderson electronegativity (Me), etc. |
| Topological | Narumi harmonic topological index (HNar), total structure connectivity index (Xt), information content index (IC), mean information content on the distance degree equality (IDDE), total walk count, path/walk-Randic shape indices (PW3, PW4, PW5), Zagreb indices, Schultz indices, Balaban J index (such as MSD), Wiener indices, information content index (neighborhood symmetry of 2-order) (IC2), ratio of multiple path count to path counts (PCR), Lovasz-Pelikan index (leading eigenvalue) (LP1), total information content index (neighborhood symmetry of 1-order) (TIC1), reciprocal hyper-detour index (Rww), average connectivity index chi-5 (X5A), piID (conventional bond-order ID number), etc. |
| Geometrical | 3D Petitjean shape index (PJI3), asphericity (ASP), gravitational index, Balaban index, Wiener index, length-to-breadth ratio by WHIM (L/Bw), etc. |
| Quantum | Highest occupied molecular orbital energy (HOMO), lowest unoccupied molecular orbital energy (LUMO), most positive charge (MPC), sum of squares of positive charges (SSPC), sum of squares of negative charges (SSNC), sum of positive charges (SUMPC), sum of negative charges (SUMNC), sum of absolute charges (SAC), standard deviation (Std), total dipole moment (DMt), molecular dipole moment in the X-direction (DMX), molecular dipole moment in the Y-direction (DMY), molecular dipole moment in the Z-direction (DMZ), electronegativity (χ = −0.5 (HOMO + LUMO)), electrophilicity (ω = χ²/2η), hardness (η = 0.5 (LUMO − HOMO)), softness (S = 1/η). |
| Functional group | Number of total secondary C(sp3) (nCs), number of total tertiary carbons (nCt), number of H-bond acceptor atoms (nHAcc), number of secondary amides (aliphatic) (nCONHR), number of unsubstituted aromatic C (nCaH), number of ethers (aromatic) (nRORPh), number of ketones (aliphatic) (nCO), number of tertiary amines (aliphatic) (nNR2), number of phenols (nOHPh), number of total primary C(sp3) (nCp), etc. |
| Chemical | LogP (octanol-water partition coefficient), hydration energy (HE), polarizability (Pol), molar refractivity (MR), molecular volume (V), molecular surface area (SA). |
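The global reactivity descriptors in the quantum row of Table 2 are all derived from the two frontier orbital energies. The following is a minimal sketch, assuming the conventional Koopmans-type definitions χ = −(E_HOMO + E_LUMO)/2 and η = (E_LUMO − E_HOMO)/2, with S = 1/η and ω = χ²/2η. The function name, the dataclass and the example orbital energies are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ReactivityIndices:
    electronegativity: float  # chi
    hardness: float           # eta
    softness: float           # S = 1 / eta
    electrophilicity: float   # omega = chi^2 / (2 * eta)


def reactivity_indices(e_homo: float, e_lumo: float) -> ReactivityIndices:
    """Global reactivity indices from frontier orbital energies.

    Both energies must be in the same unit (for example eV).
    """
    chi = -0.5 * (e_homo + e_lumo)
    eta = 0.5 * (e_lumo - e_homo)
    if eta <= 0:
        raise ValueError("LUMO energy must lie above HOMO energy")
    return ReactivityIndices(
        electronegativity=chi,
        hardness=eta,
        softness=1.0 / eta,
        electrophilicity=chi ** 2 / (2.0 * eta),
    )


# Hypothetical orbital energies in eV, chosen only for illustration.
idx = reactivity_indices(e_homo=-9.0, e_lumo=-1.0)
print(idx)
```

With these definitions the hardness is positive whenever the LUMO lies above the HOMO, so the softness is positive as well.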
Table 3. Experimental and predicted activity of compounds against Staphylococcus aureus.
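| Compound | Experimental pMIC a | Predicted pMIC | REP b (%) |
|---|---|---|---|
| 1 | 3.29 | 3.3205 | 0.9173 |
| 2 | 3.29 | 3.3007 | 0.3242 |
| 3 | 3.29 | 3.2266 | −1.9664 |
| 4* | 3.29 | 3.3976 | 3.1675 |
| 5 | 4.19 | 3.7498 | −11.740 |
| 6 | 3.29 | 3.3205 | 0.9173 |
| 7 | 3.89 | 3.8255 | −1.6850 |
| 8 | 3.29 | 3.2698 | −0.6172 |
| 9 | 3.29 | 3.2886 | −0.0440 |
| 10* | 3.89 | 3.9283 | 0.9738 |
| 11 | 3.59 | 3.6207 | 0.8470 |
| 12 | 3.59 | 3.7254 | 3.6340 |
| 13 | 3.59 | 3.5063 | −2.3883 |
| 14 | 3.59 | 3.6212 | 0.8627 |
| 15* | 4.19 | 4.1563 | −0.8119 |
| 16 | 3.59 | 3.5611 | −0.8123 |
| 17 | 3.59 | 3.6177 | 0.7647 |
| 18 | 3.59 | 3.5548 | −0.9915 |
| 19* | 3.89 | 3.8950 | 0.1293 |
| 20 | 4.19 | 4.0995 | −2.2079 |
| 21 | 3.59 | 3.7117 | 3.2787 |
| 22 | 5.10 | 5.0840 | −0.3141 |
| 23 | 3.59 | 3.5533 | −1.0318 |
| 24* | 3.59 | 3.7223 | 3.5534 |
| 25 | 3.89 | 3.9222 | 0.8214 |
| 26 | 3.89 | 3.9779 | 2.2092 |
| 27 | 4.80 | 4.8022 | 0.0453 |
| 28 | 3.89 | 3.8591 | −0.8011 |
| 29 | 3.59 | 3.4907 | −2.8470 |
| 30* | 4.49 | 4.5105 | 0.4549 |
| 31 | 3.59 | 3.4728 | −3.3746 |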
a pMIC = −log (MIC).
b REP = relative error of prediction.
* Compounds used as prediction set.
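The two quantities defined in the footnotes of Table 3 can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the logarithm is taken as base 10, and REP is computed relative to the predicted value, a choice that reproduces the tabulated percentages to within rounding (compound 5 is used as a check). The function names and the example MIC value are hypothetical.

```python
import math


def pmic(mic: float) -> float:
    """pMIC = -log10(MIC); MIC must be a positive concentration."""
    if mic <= 0:
        raise ValueError("MIC must be positive")
    return -math.log10(mic)


def rep_percent(experimental: float, predicted: float) -> float:
    """Relative error of prediction in percent.

    Assumption: the error is normalised by the predicted value,
    which matches the REP column of Table 3 to within rounding.
    """
    return 100.0 * (predicted - experimental) / predicted


# Hypothetical MIC value, for illustration only.
example_pmic = pmic(1e-4)

# Compound 5 of Table 3: experimental 4.19, predicted 3.7498.
rep_5 = rep_percent(4.19, 3.7498)
print(round(rep_5, 2))  # -11.74
```

A negative REP therefore means that the model underestimates the experimental activity.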
Table 4. Experimental and predicted activity of compounds against Candida albicans.
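| Compound | Experimental pMIC | Predicted pMIC | REP (%) |
|---|---|---|---|
| 2 | 3.29 | 3.4139 | 3.6304 |
| 4* | 3.29 | 3.3893 | 2.9303 |
| 5 | 3.89 | 3.8920 | 0.0514 |
| 6 | 3.29 | 3.3591 | 2.0577 |
| 7 | 3.29 | 3.3835 | 2.7631 |
| 8 | 3.59 | 3.6477 | 1.5813 |
| 9 | 3.29 | 3.3208 | 0.9272 |
| 10* | 3.59 | 3.6196 | 0.8175 |
| 11 | 3.89 | 3.9567 | 1.6857 |
| 12 | 3.89 | 3.7481 | −3.7870 |
| 13 | 3.89 | 3.9092 | 0.4922 |
| 14 | 3.89 | 3.7076 | −4.9191 |
| 15 | 3.89 | 3.8892 | −0.0203 |
| 16 | 3.89 | 3.8422 | −1.2433 |
| 17* | 4.49 | 4.3961 | −2.1360 |
| 18 | 4.49 | 4.4476 | −0.9524 |
| 19 | 3.89 | 3.7076 | −4.9191 |
| 20 | 3.89 | 3.8014 | −2.3296 |
| 21 | 3.89 | 3.9525 | 1.5813 |
| 23 | 3.89 | 3.7450 | −3.8727 |
| 24* | 3.89 | 3.9056 | 0.3994 |
| 25 | 3.89 | 3.9969 | 2.6755 |
| 26 | 3.89 | 3.8489 | −1.0691 |
| 27 | 3.89 | 3.7573 | −3.5304 |
| 28 | 3.89 | 3.9503 | 1.5262 |
| 29* | 3.89 | 3.9964 | 2.6619 |
| 30 | 3.89 | 3.8978 | 0.2006 |
| 31 | 3.89 | 3.8732 | −0.4333 |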
*Compounds used as prediction set
Table 5. Numerical values of factor loading numbers 1–4 for some descriptors after VARIMAX rotation (against S. aureus).
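| Descriptor | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Communality |
|---|---|---|---|---|---|
| MPC | 0.588 | −0.105 | 0.587 | −0.313 | 0.799 |
| DMy | 0.195 | −0.054 | 0.762 | 0.071 | 0.627 |
| HOMO | 0.059 | 0.637 | −0.013 | 0.620 | 0.794 |
| Electronegativity | −0.643 | −0.206 | −0.199 | −0.496 | 0.741 |
| Mv | 0.751 | −0.413 | 0.362 | −0.259 | 0.934 |
| Me | 0.001 | −0.781 | 0.097 | −0.298 | 0.708 |
| RBN | 0.087 | 0.902 | 0.068 | 0.003 | 0.826 |
| HNar | 0.866 | 0.051 | 0.217 | −0.252 | 0.863 |
| Xt | −0.645 | −0.505 | −0.307 | 0.081 | 0.772 |
| IDDE | 0.746 | 0.359 | 0.215 | 0.324 | 0.837 |
| LP1 | 0.667 | 0.460 | 0.368 | 0.292 | 0.877 |
| TIC1 | 0.714 | 0.413 | 0.175 | 0.127 | 0.726 |
| PJI3 | 0.375 | 0.611 | −0.315 | −0.276 | 0.689 |
| nCs | −0.559 | 0.578 | −0.411 | 0.199 | 0.855 |
| nCaH | 0.894 | −0.140 | −0.143 | −0.079 | 0.845 |
| nCONHR | 0.261 | 0.220 | 0.695 | −0.906 | 0.765 |
| nCO | −0.082 | 0.081 | −0.214 | 0.853 | 0.787 |
| pMIC S. aureus | 0.041 | −0.116 | 0.898 | −0.051 | 0.824 |
| % variance | 29.87 | 20.10 | 17.15 | 12.12 | 79.24 |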
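For an orthogonal rotation such as VARIMAX, the communality of a variable is the sum of its squared loadings on the retained factors. The short sketch below checks this relationship against two rows of Table 5 (MPC and DMy); the function name is hypothetical, and the loadings are copied from the table.

```python
from typing import Sequence


def communality(loadings: Sequence[float]) -> float:
    """Sum of squared loadings of one variable over the retained factors."""
    return sum(value * value for value in loadings)


# Loadings on factors 1-4 taken from Table 5.
mpc = communality([0.588, -0.105, 0.587, -0.313])
dmy = communality([0.195, -0.054, 0.762, 0.071])

print(round(mpc, 3), round(dmy, 3))  # 0.799 0.627
```

Both values agree with the last column of the table, which gives the share of each variable's variance that the four factors account for.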
Table 6. Numerical values of factor loading numbers 1–5 for some descriptors after VARIMAX rotation (against C. albicans).
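| Descriptor | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Communality |
|---|---|---|---|---|---|---|
| Std | −0.491 | −0.431 | −0.459 | −0.107 | 0.095 | 0.657 |
| DMz | −0.007 | 0.102 | −0.209 | 0.860 | 0.322 | 0.898 |
| HOMO | 0.240 | 0.811 | −0.156 | −0.349 | 0.014 | 0.861 |
| Electronegativity | −0.706 | −0.389 | 0.142 | 0.323 | −0.310 | 0.871 |
| X5A | −0.627 | −0.664 | −0.134 | −0.102 | 0.129 | 0.879 |
| PW3 | −0.166 | 0.594 | −0.377 | −0.158 | 0.893 | 0.584 |
| PW5 | 0.913 | −0.079 | 0.055 | 0.135 | −0.132 | 0.879 |
| IC2 | 0.579 | 0.272 | −0.164 | 0.210 | 0.584 | 0.820 |
| piID | 0.750 | −0.070 | −0.333 | −0.190 | −0.208 | 0.758 |
| ASP | −0.075 | 0.087 | 0.866 | −0.198 | 0.322 | 0.905 |
| L/Bw | 0.064 | 0.117 | 0.926 | −0.023 | 0.164 | 0.902 |
| nCp | −0.206 | 0.754 | −0.224 | −0.097 | −0.325 | 0.777 |
| nNR2 | −0.366 | 0.722 | 0.148 | 0.287 | −0.234 | 0.814 |
| nOHPh | −0.191 | −0.415 | −0.165 | −0.447 | 0.356 | 0.562 |
| nRORPh | 0.571 | −0.522 | 0.379 | 0.002 | −0.341 | 0.858 |
| pMIC C. albicans | 0.628 | −0.627 | −0.277 | −0.107 | 0.602 | 0.872 |
| % variance | 22.58 | 20.58 | 14.71 | 14.02 | 8.71 | 80.60 |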
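A common way to read a rotated loading matrix is to assign each variable to the factor on which it has the largest absolute loading. The sketch below applies that rule to a few rows copied from Table 6. The helper name is hypothetical, and the rule is offered as one possible reading aid, not as the selection procedure used in the study.

```python
from typing import Dict, List, Tuple

# Loadings on factors 1-5 for a subset of descriptors from Table 6.
loadings: Dict[str, List[float]] = {
    "DMz": [-0.007, 0.102, -0.209, 0.860, 0.322],
    "HOMO": [0.240, 0.811, -0.156, -0.349, 0.014],
    "PW5": [0.913, -0.079, 0.055, 0.135, -0.132],
    "ASP": [-0.075, 0.087, 0.866, -0.198, 0.322],
    "L/Bw": [0.064, 0.117, 0.926, -0.023, 0.164],
}


def dominant_factor(row: List[float]) -> Tuple[int, float]:
    """Return (1-based factor number, loading) of the largest absolute loading."""
    index = max(range(len(row)), key=lambda i: abs(row[i]))
    return index + 1, row[index]


assignment = {name: dominant_factor(row) for name, row in loadings.items()}
for name, (factor, value) in assignment.items():
    print(f"{name}: factor {factor} (loading {value:+.3f})")
```

In this subset ASP and L/Bw both load most heavily on factor 3, which illustrates how correlated descriptors group onto a shared factor after rotation.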

Sabet, R.; Fassihi, A. QSAR Study of Antimicrobial 3-Hydroxypyridine-4-one and 3-Hydroxypyran-4-one Derivatives Using Different Chemometric Tools. Int. J. Mol. Sci. 2008, 9, 2407-2423. https://doi.org/10.3390/ijms9122407
