QSAR and Chemoinformatics in Molecular Modeling and Drug Design

A special issue of International Journal of Molecular Sciences (ISSN 1422-0067). This special issue belongs to the section "Molecular Informatics".

Deadline for manuscript submissions: closed (31 July 2020) | Viewed by 39959

Special Issue Information

Dear Colleagues,

Chemoinformatics is a multidisciplinary area of research, primarily engaged with the collection, deposition, retrieval and analysis of information in order to address chemistry-related problems. The analysis of chemistry-related data can take many forms, one of the most important being quantitative structure-activity relationship (QSAR) modeling. QSAR can be broadly defined as finding correlations between molecular activities (defined in the broadest possible sense) and a set of structure-based descriptors by means of mathematical models. Starting with early studies by Hansch and co-workers, the field has rapidly evolved through significant advances in all its aspects, including data curation, descriptor calculation, regression algorithms, and evaluation metrics. Over the years, QSAR models have been widely and successfully used in many research areas, including chemistry, biology, toxicology, and materials science, both to analyze the factors affecting molecular properties and to design new compounds.

The purpose of this Special Issue is to provide a state-of-the-art picture of current chemoinformatics methodologies, with an emphasis on QSAR, and to describe how these are used in molecular modeling and drug design. Thus, we welcome original research articles, review articles, and communications covering one or more of the following topics:

(1) Development, implementation, and application of chemoinformatics databases.

(2) Development and application of new chemoinformatics tools.

(3) Development and application of new molecular descriptors.

(4) Construction and visualization of, and navigation through, chemical space.

(5) Development of new QSAR algorithms and workflows.

(6) Application of chemoinformatics and QSAR methodologies in molecular modeling and drug design.

We hope that this Issue will serve as an entry point for newcomers into the exciting world of chemoinformatics/QSAR as well as a valuable reference for more experienced practitioners of the field.

Prof. Dr. Hanoch Senderowitz
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Molecular Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. There is an Article Processing Charge (APC) for publication in this open access journal. For details about the APC please see here. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Chemoinformatics
  • Machine learning
  • Data mining
  • Quantitative structure-activity relationship (QSAR)
  • Quantitative structure-property relationship (QSPR)
  • Computer-aided drug design (CADD)
  • Molecular descriptors
  • Databases
  • Chemical space
  • Data visualization

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (10 papers)

Editorial

9 pages, 717 KiB  
Editorial
The Literature of Chemoinformatics: 1978–2018
by Peter Willett
Int. J. Mol. Sci. 2020, 21(15), 5576; https://doi.org/10.3390/ijms21155576 - 4 Aug 2020
Cited by 10 | Viewed by 5019
Abstract
This article presents a study of the literature of chemoinformatics, updating and building upon an analogous bibliometric investigation that was published in 2008. Data on outputs in the field, and citations to those outputs, were obtained by means of topic searches of the Web of Science Core Collection. The searches demonstrate that chemoinformatics is by now a well-defined sub-discipline of chemistry, and one that forms an essential part of the chemical educational curriculum. There are three core journals for the subject: the Journal of Chemical Information and Modeling, the Journal of Cheminformatics, and Molecular Informatics. Having established itself, chemoinformatics is now starting to export knowledge to disciplines outside of chemistry.
(This article belongs to the Special Issue QSAR and Chemoinformatics in Molecular Modeling and Drug Design)

Research

20 pages, 554 KiB  
Article
Evaluation of QSAR Equations for Virtual Screening
by Jacob Spiegel and Hanoch Senderowitz
Int. J. Mol. Sci. 2020, 21(21), 7828; https://doi.org/10.3390/ijms21217828 - 22 Oct 2020
Cited by 12 | Viewed by 2990
Abstract
Quantitative Structure Activity Relationship (QSAR) models can inform on the correlation between activities and structure-based molecular descriptors. This information is important for understanding the factors that govern molecular properties and for designing new compounds with favorable properties. Due to the large number of calculable descriptors and, consequently, the much larger number of descriptor combinations, the derivation of QSAR models can be treated as an optimization problem. For continuous responses, the metrics typically optimized in this process are related to model performance on the training set, for example, R² and Q²_CV. Similar metrics, calculated on an external set of data (e.g., Q²_F1/F2/F3), are used to evaluate the performance of the final models. A common theme of these metrics is that they are “context-ignorant”. In this work we propose that QSAR models should be evaluated based on their intended usage. More specifically, we argue that QSAR models developed for Virtual Screening (VS) should be derived and evaluated using a virtual screening-aware metric, e.g., an enrichment-based metric. To demonstrate this point, we have developed 21 Multiple Linear Regression (MLR) models for seven targets (three models per target), evaluated them first on validation sets and subsequently tested their performance on two additional test sets constructed to mimic small-scale virtual screening campaigns. As expected, we found no correlation between model performance evaluated by “classical” metrics, e.g., R² and Q²_F1/F2/F3, and the number of active compounds picked by the models from within a pool of random compounds. In particular, in some cases models with favorable R² and/or Q²_F1/F2/F3 values were unable to pick a single active compound from within the pool, whereas in other cases models with poor R² and/or Q²_F1/F2/F3 values performed well in the context of virtual screening.
We also found no significant correlation between the number of active compounds correctly identified by the models in the training, validation and test sets. Next, we have developed a new algorithm for the derivation of MLR models by optimizing an enrichment-based metric and tested its performance on the same datasets. We found that the best models derived in this manner showed, in most cases, much more consistent results across the training, validation and test sets and outperformed the corresponding MLR models in most virtual screening tests. Finally, we demonstrated that when tested as binary classifiers, models derived for the same targets by the new algorithm outperformed Random Forest (RF) and Support Vector Machine (SVM)-based models across training/validation/test sets, in most cases. We attribute the better performance of the Enrichment Optimizer Algorithm (EOA) models in VS to better handling of inactive random compounds. Optimizing an enrichment-based metric is therefore a promising strategy for the derivation of QSAR models for classification and virtual screening.
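
As an illustration of what an enrichment-based metric measures, the sketch below computes a generic enrichment factor for a ranked compound list. The abstract does not say which enrichment metric the authors optimize, so the function name `enrichment_factor`, its `top_fraction` parameter, and the formula used here are assumptions chosen for illustration; this is not the Enrichment Optimizer Algorithm itself.

```python
def enrichment_factor(scores, labels, top_fraction=0.1):
    """Generic enrichment factor (an illustrative assumption, not the paper's metric).

    scores: predicted activities, higher means "more likely active".
    labels: 1 for known actives, 0 for inactives/decoys.
    Returns (hit rate in the top-ranked fraction) / (hit rate in the whole pool).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equally long")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    n = len(scores)
    n_top = max(1, round(n * top_fraction))  # always inspect at least one compound

    # Rank compounds from best to worst predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    total_actives = sum(labels)
    if total_actives == 0:
        return 0.0  # no actives in the pool, so nothing can be enriched

    return (hits_top / n_top) / (total_actives / n)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    print(enrichment_factor(scores, labels, top_fraction=0.2))
```

A model can have a modest R² and still rank the few actives near the top of such a list, which is the kind of mismatch between classical and screening-aware metrics that the abstract reports.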

22 pages, 8025 KiB  
Article
Discovery of Novel Hsp90 C-Terminal Inhibitors Using 3D-Pharmacophores Derived from Molecular Dynamics Simulations
by Tihomir Tomašič, Martina Durcik, Bradley M. Keegan, Darja Gramec Skledar, Živa Zajec, Brian S. J. Blagg and Sharon D. Bryant
Int. J. Mol. Sci. 2020, 21(18), 6898; https://doi.org/10.3390/ijms21186898 - 20 Sep 2020
Cited by 25 | Viewed by 5004
Abstract
Hsp90 C-terminal domain (CTD) inhibitors are promising novel agents for cancer treatment, as they do not induce the heat shock response associated with Hsp90 N-terminal inhibitors. One challenge associated with CTD inhibitors is the lack of a co-crystallized complex, which requires the use of a predicted allosteric apo pocket and limits structure-based (SB) design approaches. To address this, a unique approach that enables the derivation and analysis of interactions between ligands and proteins from molecular dynamics (MD) trajectories was used to derive pharmacophore models for virtual screening (VS) and identify suitable binding sites for SB design. Furthermore, ligand-based (LB) pharmacophores were developed using a set of CTD inhibitors to compare VS performance with the MD-derived models. Virtual hits identified by VS with both SB and LB models were tested for antiproliferative activity. Compounds 9 and 11 displayed antiproliferative activities in MCF-7 and Hep G2 cancer cell lines. Compound 11 inhibited Hsp90-dependent refolding of denatured luciferase and induced the degradation of Hsp90 clients without the concomitant induction of Hsp70 levels. Furthermore, compound 11 offers a unique scaffold that is promising for the further synthetic optimization and development of molecules needed for the evaluation of the Hsp90 CTD as a target for the development of anticancer drugs.

23 pages, 2314 KiB  
Article
Consensus-Based Pharmacophore Mapping for New Set of N-(disubstituted-phenyl)-3-hydroxyl-naphthalene-2-carboxamides
by Andrzej Bak, Jiri Kos, Hana Michnova, Tomas Gonec, Sarka Pospisilova, Violetta Kozik, Alois Cizek, Adam Smolinski and Josef Jampilek
Int. J. Mol. Sci. 2020, 21(18), 6583; https://doi.org/10.3390/ijms21186583 - 9 Sep 2020
Cited by 14 | Viewed by 2669
Abstract
A series of twenty-two novel N-(disubstituted-phenyl)-3-hydroxynaphthalene-2-carboxamide derivatives was synthesized and characterized as potential antimicrobial agents. N-[3,5-bis(trifluoromethyl)phenyl]- and N-[2-chloro-5-(trifluoromethyl)phenyl]-3-hydroxynaphthalene-2-carboxamide showed submicromolar (MICs 0.16–0.68 µM) activity against methicillin-resistant Staphylococcus aureus isolates. N-[3,5-bis(trifluoromethyl)phenyl]- and N-[4-bromo-3-(trifluoromethyl)phenyl]-3-hydroxynaphthalene-2-carboxamide revealed activity against M. tuberculosis (both MICs 10 µM) comparable with that of rifampicin. Synergistic activity was observed for the combinations of ciprofloxacin with N-[4-bromo-3-(trifluoromethyl)phenyl]- and N-(4-bromo-3-fluorophenyl)-3-hydroxynaphthalene-2-carboxamides against the MRSA SA 630 isolate. The similarity-related property space assessment for the congeneric series of structurally related carboxamide derivatives was performed using principal component analysis. Interestingly, a different distribution of mono-halogenated carboxamide derivatives with the –CF3 substituent is accompanied by an increased activity profile. A symmetric matrix of Tanimoto coefficients indicated the structural dissimilarities of dichloro- and dimethoxy-substituted isomers from the remaining ones. Moreover, the quantitative sampling of the similarity-related activity landscape provided a subtle picture of favorable and disallowed structural modifications that are valid for determining activity cliffs. Finally, the advanced method of neural network quantitative SAR was engaged to illustrate the key 3D steric/electronic/lipophilic features of the ligand-site composition by the systematic probing of the functional group.
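
For readers unfamiliar with the symmetric Tanimoto matrix mentioned above, the following minimal sketch shows how such a matrix can be built. It assumes each compound is represented as a set of "on" fingerprint bit indices; the fingerprint type used by the authors is not stated in the abstract, and the helper names `tanimoto` and `tanimoto_matrix` are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    # Convention assumed here: two empty fingerprints count as identical.
    return len(a & b) / union if union else 1.0


def tanimoto_matrix(fingerprints):
    """Symmetric pairwise similarity matrix with ones on the diagonal."""
    n = len(fingerprints)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Compute each pair once and mirror it across the diagonal.
            similarity = tanimoto(fingerprints[i], fingerprints[j])
            matrix[i][j] = matrix[j][i] = similarity
    return matrix


if __name__ == "__main__":
    # Toy bit sets standing in for real molecular fingerprints.
    toy_fps = [{1, 2, 3}, {2, 3, 4}, {7, 8, 9}]
    for row in tanimoto_matrix(toy_fps):
        print(row)
```

Rows with uniformly low off-diagonal values flag compounds that are structurally dissimilar from the rest of the series.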

11 pages, 5369 KiB  
Article
Deep Learning Modeling of Androgen Receptor Responses to Prostate Cancer Therapies
by Oliver Snow, Nada Lallous, Martin Ester and Artem Cherkasov
Int. J. Mol. Sci. 2020, 21(16), 5847; https://doi.org/10.3390/ijms21165847 - 14 Aug 2020
Cited by 13 | Viewed by 3052
Abstract
Gain-of-function mutations in human androgen receptor (AR) are among the major causes of drug resistance in prostate cancer (PCa). Identifying mutations that cause a resistant phenotype is of critical importance for guiding treatment protocols, as well as for designing drugs that do not elicit adverse responses. However, experimental characterization of these mutations is time-consuming and costly; thus, predictive models are needed to anticipate resistant mutations and to guide the drug discovery process. In this work, we leverage experimental data collected on 68 AR mutants, either observed in the clinic or described in the literature, to train a deep neural network (DNN) that predicts the response of these mutants to currently used and experimental anti-androgens and testosterone. We demonstrate that the use of this DNN, with general 2D descriptors, provides a more accurate prediction of the biological outcome (inhibition, activation, no-response, mixed-response) in AR mutant-drug pairs compared to other machine learning approaches. Finally, the developed approach was used to make predictions of AR mutant response to the latest AR inhibitor darolutamide, which were then validated by in vitro experiments.

20 pages, 1860 KiB  
Article
Comprehensive Analysis of Applicability Domains of QSPR Models for Chemical Reactions
by Assima Rakhimbekova, Timur I. Madzhidov, Ramil I. Nugmanov, Timur R. Gimadiev, Igor I. Baskin and Alexandre Varnek
Int. J. Mol. Sci. 2020, 21(15), 5542; https://doi.org/10.3390/ijms21155542 - 3 Aug 2020
Cited by 38 | Viewed by 4190
Abstract
Nowadays, the problem of defining a model’s applicability domain (AD) is an active research topic in chemoinformatics. Although many AD definitions for models predicting properties of molecules (Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models) have been described in the literature, none for chemical reactions (Quantitative Reaction-Property Relationships (QRPR)) has been reported to date. The point is that a chemical reaction is a much more complex object than an individual molecule, and its yield and its thermodynamic and kinetic characteristics depend not only on the structures of reactants and products but also on experimental conditions. The QRPR models’ performance largely depends on the way that the chemical transformation is encoded. In this study, various AD definition methods extensively used in QSAR/QSPR studies of individual molecules, as well as several novel approaches suggested in this work for reactions, were benchmarked on several reaction datasets. Their abilities to exclude wrong reaction types, increase coverage, improve model performance and detect Y-outliers were tested. As a result, several “best” AD definitions for the QRPR models predicting reaction characteristics have been identified and tested on a previously published external dataset with a clear AD definition problem.

12 pages, 3597 KiB  
Article
X-ray Structure-Based Chemoinformatic Analysis Identifies Promiscuous Ligands Binding to Proteins from Different Classes with Varying Shapes
by Christian Feldmann and Jürgen Bajorath
Int. J. Mol. Sci. 2020, 21(11), 3782; https://doi.org/10.3390/ijms21113782 - 27 May 2020
Cited by 6 | Viewed by 2317
Abstract
(1) Background: Compounds with multitarget activity are of interest in basic research to explore molecular foundations of promiscuous binding and in drug discovery as agents eliciting polypharmacological effects. Our study has aimed to systematically identify compounds that form complexes with proteins from distinct classes and compare their bioactive conformations and molecular properties. (2) Methods: A large-scale computational investigation was carried out that combined the analysis of complex X-ray structures, ligand binding modes, compound activity data, and various molecular properties. (3) Results: A total of 515 ligands with multitarget activity were identified that included 70 organic compounds binding to proteins from different classes. These multiclass ligands (MCLs) were often flexible and surprisingly hydrophilic. Moreover, they displayed a wide spectrum of binding modes. In different target structure environments, binding shapes of MCLs were often similar, but also distinct. (4) Conclusions: Combined structural and activity data analysis identified compounds with activity against proteins with distinct structures and functions. MCLs were found to have greatly varying shape similarity when binding to different protein classes. Hence, there were no apparent canonical binding shapes indicating multitarget activity. Rather, conformational versatility characterized MCL binding.

15 pages, 3374 KiB  
Article
Similarity-Based Methods and Machine Learning Approaches for Target Prediction in Early Drug Discovery: Performance and Scope
by Neann Mathai and Johannes Kirchmair
Int. J. Mol. Sci. 2020, 21(10), 3585; https://doi.org/10.3390/ijms21103585 - 19 May 2020
Cited by 29 | Viewed by 5377
Abstract
Computational methods for predicting the macromolecular targets of drugs and drug-like compounds have evolved as a key technology in drug discovery. However, the established validation protocols leave several key questions regarding the performance and scope of methods unaddressed. For example, prediction success rates are commonly reported as averages over all compounds of a test set and do not consider the structural relationship between the individual test compounds and the training instances. In order to obtain a better understanding of the value of ligand-based methods for target prediction, we benchmarked a similarity-based method and a random forest-based machine learning approach (both employing 2D molecular fingerprints) under three testing scenarios: a standard testing scenario with external data, a standard time-split scenario, and a scenario that is designed to most closely resemble real-world conditions. In addition, we deconvoluted the results based on the distances of the individual test molecules from the training data. We found that, surprisingly, the similarity-based approach generally outperformed the machine learning approach in all testing scenarios, even in cases where queries were structurally clearly distinct from the instances in the training (or reference) data, and despite a much higher coverage of the known target space.
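
A similarity-based target prediction of the general kind benchmarked here can be sketched as a nearest-neighbour lookup: each candidate target is scored by the highest similarity between the query and any of its known ligands. The abstract does not give the authors' exact scoring scheme or fingerprints, so the set-of-bits representation, the maximum-Tanimoto score, and the names `tanimoto` and `rank_targets` below are illustrative assumptions only.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def rank_targets(query_fp, reference):
    """Rank targets by the most similar known ligand (nearest neighbour).

    reference: dict mapping a target name to a list of ligand fingerprints.
    Returns (target, score) pairs sorted from most to least similar.
    """
    scores = {
        target: max((tanimoto(query_fp, fp) for fp in ligands), default=0.0)
        for target, ligands in reference.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical reference data: two targets with toy ligand fingerprints.
    reference = {
        "target_A": [{1, 2, 3}, {7, 8}],
        "target_B": [{4, 5, 6}],
    }
    print(rank_targets({1, 2, 4}, reference))
```

Because the score is also the query's similarity to its nearest reference ligand, the same number can be used to bin results by distance from the reference data, in the spirit of the deconvolution described in the abstract.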

24 pages, 3672 KiB  
Article
In Silico Prediction of Intestinal Permeability by Hierarchical Support Vector Regression
by Ming-Han Lee, Giang Huong Ta, Ching-Feng Weng and Max K. Leong
Int. J. Mol. Sci. 2020, 21(10), 3582; https://doi.org/10.3390/ijms21103582 - 19 May 2020
Cited by 13 | Viewed by 3468
Abstract
The vast majority of marketed drugs are orally administered. As such, drug absorption is one of the important drug metabolism and pharmacokinetics parameters that should be assessed in the process of drug discovery and development. A nonlinear quantitative structure–activity relationship (QSAR) model was constructed in this investigation using the novel machine learning-based hierarchical support vector regression (HSVR) scheme to render the extremely complicated relationships between descriptors and intestinal permeability that can take place through various passive diffusion and carrier-mediated active transport routes. The predictions by HSVR were found to be in good agreement with the observed values for the molecules in the training set (n = 53, r² = 0.93, q²_CV = 0.84, RMSE = 0.17, s = 0.08), test set (n = 13, q² = 0.75–0.89, RMSE = 0.26, s = 0.14), and even outlier set (n = 8, q² = 0.78–0.92, RMSE = 0.19, s = 0.09). The built HSVR model consistently met the most stringent criteria when subjected to various statistical assessments. A mock test also assured the predictivity of HSVR. Consequently, this HSVR model can be adopted to facilitate drug discovery and development.

Review

22 pages, 2252 KiB  
Review
Benchmarking Data Sets from PubChem BioAssay Data: Current Scenario and Room for Improvement
by Viet-Khoa Tran-Nguyen and Didier Rognan
Int. J. Mol. Sci. 2020, 21(12), 4380; https://doi.org/10.3390/ijms21124380 - 19 Jun 2020
Cited by 8 | Viewed by 4321
Abstract
Developing realistic data sets for evaluating virtual screening methods is a task that has been tackled by the cheminformatics community for many years. Numerous artificially constructed data collections were developed, such as DUD, DUD-E, or DEKOIS. However, they all suffer from multiple drawbacks, one of which is the absence of experimental results confirming the inactivity of presumably inactive molecules, leading to possible false negatives in the ligand sets. In light of this problem, the PubChem BioAssay database, an open-access repository providing the bioactivity information of compounds that were already tested on a biological target, is now a recommended source for data set construction. Nevertheless, there exist several issues with the use of such data that need to be properly addressed. In this article, an overview of benchmarking data collections built upon experimental PubChem BioAssay input is provided, along with a thorough discussion of noteworthy issues that one must consider during the design of new ligand sets from this database. The points raised in this review are expected to guide future developments in this regard, in hopes of offering better evaluation tools for novel in silico screening procedures.