Review

Challenges, Applications, and Recent Advances of Protein-Ligand Docking in Structure-Based Drug Design

1
Informatics Institute, University of Missouri, Columbia, MO 65211, USA
2
Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211, USA
3
Department of Physics & Astronomy, University of Missouri, Columbia, MO 65211, USA
4
Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
*
Author to whom correspondence should be addressed.
Molecules 2014, 19(7), 10150-10176; https://doi.org/10.3390/molecules190710150
Submission received: 23 April 2014 / Revised: 13 June 2014 / Accepted: 2 July 2014 / Published: 11 July 2014
(This article belongs to the Special Issue In-Silico Drug Design and In-Silico Screening)

Abstract

The docking methods used in structure-based virtual database screening offer the ability to quickly and cheaply estimate the affinity and binding mode of a ligand for the protein receptor of interest, such as a drug target. These methods can be used to enrich a database of compounds, so that more compounds that are subsequently experimentally tested are found to be pharmaceutically interesting. In addition, like all virtual screening methods used for drug design, structure-based virtual screening can focus on curated libraries of synthesizable compounds, helping to reduce the expense of subsequent experimental verification. In this review, we introduce the protein-ligand docking methods used for structure-based drug design and other biological applications. We discuss the fundamental challenges facing these methods and some of the current methodological topics of interest. We also discuss the main approaches for applying protein-ligand docking methods. We end with a discussion of the challenging aspects of evaluating or benchmarking the accuracy of docking methods for their improvement, and discuss future directions.

1. Introduction

In the not-so-distant past, the effects of drugs on disease were known only by empirical observation. A century of subsequent research has revealed many intricacies in the working of cellular receptors and other drug targets, and likewise the methodology of finding small molecules that bind to specific targets has become increasingly complex. This development has been marked by the realization that the interacting surfaces of cellular receptors are chemically active and often flexible, and that these properties tend to be critical to the biological effects of the small molecules, or ligands, that bind to these receptors. Within this climate of complexity, the field of rational drug design emerged to play an important role in the search for new medications [1].
An important early step in rational drug design is the identification of a biological target of interest [2]. This target of interest may be as simple as the ligand receptor of an enzyme whose over-activity is associated with disease; a compound that attenuates the enzyme’s action by competitive inhibition would be pharmaceutically interesting. However, there are many other types of drug targets and a variety of chemicals that bind to them (see Table 1 for a list of the most frequently-targeted gene families) [3].
Table 1. The percent distribution of the gene families targeted by FDA-approved drugs as of 2005. These statistics were compiled by Overington et al. from FDA data in 2005 [3].
Portion of Drugs    Family of Drug Target
26.8%               Rhodopsin-like GPCRs
13.0%               Nuclear receptors
7.9%                Ligand-gated ion channels
5.5%                Voltage-gated ion channels
4.1%                Penicillin-binding protein
3.0%                Myeloperoxidase-like
2.7%                Sodium:neurotransmitter symporter family
2.3%                Type II DNA topoisomerase
≈35%                (other)
Rational drug design aims to use knowledge of the biological target of interest to optimize the process of finding new medications. It may be divided into two broad categories: de novo drug design, in which a novel compound is designed from scratch, and virtual database screening, in which computational methods are used to search through libraries of small molecules, in order to find those that are predicted to be the most likely to bind to a drug target of interest [1]. De novo drug design has the advantage of versatility; only the imagination and the need to synthesize the compound in question limit its conceptual possibilities. However, this advantage can also be a disadvantage. New compounds can prove difficult or expensive to synthesize, constraining the number of new compounds that may be subsequently analyzed by experiment. In addition, predicting the interactions of entirely novel compounds is inherently difficult. The other category, virtual database screening, helps mitigate the synthesis problem by focusing on large databases of synthesizable compounds.
In virtual database screening, computational techniques are used to search databases of compounds for small molecules predicted to bind to a drug target [4]. Such predictions are not meant to replace experimental affinity determination, but virtual screening methods can complement the experimental methods by producing an enriched subset of a large chemical database; the enriched subset is one in which the proportion of compounds that actually bind to the drug target of interest is increased, compared to the proportion within the whole database [5]. Thus, compounds from the subset that pass the initial virtual screening are found to be pharmaceutically interesting at a higher rate and at a lower cost.
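The notion of an enriched subset can be made concrete with an enrichment factor: the proportion of binders in the top-ranked subset divided by the proportion in the whole database. The following is a minimal sketch, assuming a ranking sorted from best to worst score; the function name and the toy ranking are hypothetical and are not taken from any study cited here.

```python
def enrichment_factor(ranked_is_active, top_fraction):
    """Active rate in the top-ranked subset divided by the active rate overall."""
    n_total = len(ranked_is_active)
    n_top = max(1, int(round(n_total * top_fraction)))
    actives_total = sum(ranked_is_active)
    if actives_total == 0:
        raise ValueError("database contains no known actives")
    actives_top = sum(ranked_is_active[:n_top])
    return (actives_top / n_top) / (actives_total / n_total)

# Toy ranking, best score first; True marks a compound known to bind.
ranking = [True, True, False, True, False, False, False, False, False, False]
ef_top30 = enrichment_factor(ranking, 0.30)
```

A value above 1 indicates that the screened subset contains binders at a higher rate than the full database; a value of 1 corresponds to random selection.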
In principle, the methods used in virtual screening may be applied to any conceivable compounds, but in practice one usually focuses on curated libraries of purchasable or synthesizable compounds, or close analogues of such compounds. Some examples include Accelrys Available Chemicals Directory (Accelrys, Inc., San Diego, CA, USA), eMolecules Database (eMolecules, Inc., La Jolla, CA, USA), and the free ZINC database [6].
There are two general types of virtual screening: ligand-based virtual screening and structure-based virtual screening. In ligand-based virtual screening, properties of a set of ligands known to bind to the drug target of interest are used to build a model for the common features believed to be important for a ligand’s biological effects. This model can then be used to find new ligands that share these common features [7]. In structure-based virtual screening, the ligands are modeled as physical entities and scoring functions are used to predict the affinity of the ligand for the binding site of interest [4]. The present review will focus primarily on structure-based methods, but will occasionally refer to ligand-based methods, given the complementary role they often play in the drug design process.
Structure-based virtual screening typically employs docking software that is designed to explore the possible binding modes of a ligand within a binding site of interest and scoring functions that are used to estimate the affinity of the ligand for the binding site of interest [8,9,10,11]. These sampling and scoring methods will be discussed in more detail in the next section. The scoring of ligands likely to bind to a protein target of interest may also make use of quantitative structure–activity relationship (QSAR) models, which relate features in the ligand alone or features of the protein-ligand interaction to the biological activities of those ligands [12,13]. Protein-ligand docking methods require a structural representation of the binding site, which may come from X-ray crystal structures, NMR experiments, or homology models [14]. The structure of the small molecules may similarly come from crystal structures, but for large-scale database screening, it is often necessary to model the possible conformations de novo. Protein-ligand docking is not usually applied to the whole surface of a protein to predict a ligand’s binding site. Instead, separate methods or biological information about the protein target are used to determine the primary binding site, and the subsequent docking is restricted to this site of interest. Nevertheless, there are a few combined methods in which the binding site search and docking procedure are performed simultaneously (also known as blind docking) [15,16].
There is a great variety of software packages available for performing protein-ligand docking. Some popular ones include DOCK [17,18,19,20,21,22], AutoDock [23], LUDI [24], FlexX [25,26], GOLD [27], Glide [28,29], and AutoDock Vina [30], in addition to MDock [31,32,33,34,35], developed in our laboratory. An exhaustive review of literature-cited protein-ligand docking software packages was presented in [36]. Depending on factors such as the scoring function and sampling exhaustiveness, the docking software used to perform structure-based virtual database screening may vary greatly in speed, but is often slower than the ligand-based methods. While the ligand-based methods tend to be quite fast, they have the disadvantage that they require a set of ligands known to bind to the target. The ability of ligand-based methods to find new active compounds is greatly dependent on the diversity or exhaustiveness of the set of ligands used to build the model [37].
In summary, we will review the application, methodologies, and evaluation of the protein-ligand docking approaches. In Section 2, we will introduce the basics of computational protein-ligand docking, discussing the fundamental challenges faced by these methods, and follow with some recent challenges that have been intensely researched in the last two years. In Section 3, we will sample the main approaches used in the biological application of protein-ligand docking methods. In Section 4, we discuss the benchmarks and evaluations used to compare the success of various protein-ligand docking methods. Finally, we end with some discussion and remarks about the future direction of the field.

2. Challenges in Protein-Ligand Docking

As mentioned above, protein-ligand docking software attempts to sample the possible ways a ligand can be positioned in a protein receptor of interest, and typically provides an estimate of the binding affinity and binding mode of a ligand for the protein receptor [8,9,10]. Docking involves an intrinsic trade-off between the speed of the docking algorithm and its accuracy. In an attempt to achieve higher accuracy, one can employ more advanced scoring functions or more exhaustive sampling of the possible binding modes and flexibility, but these modifications usually add to the computational cost. This trade-off is very evident in large-scale virtual database screening, in which the number of compounds involved tends to place practical limits on the available computational time per compound. Despite all these challenges, protein-ligand docking methods have enjoyed considerable success in applications [38]. In this section, we will first introduce the basic methodologies of docking in the context of the two fundamental challenges: sampling and scoring. We will then discuss more recent methodological work.

2.1. Scoring Methods

An essential component of docking methods is the scoring function. In protein-ligand docking, the scoring function typically assesses the overall favorability of a protein-ligand complex and is meant to be comparable to the free energy of binding of the protein and ligand [39,40]. There are other attributes that one may want to score for practical reasons, such as toxicity and properties related to absorption, distribution, metabolism, and excretion [41]. In addition, even within the sampling algorithm itself it may be advantageous to use more than one scoring function; for example, one can use a quick, simple scoring function to discard the worst binding modes before assessing the rest more thoroughly [20,42]. However, accurately predicting binding free energy with a general scoring function, while a very ambitious goal, would revolutionize the utility of the docking methods in drug design and other applications [43].
Computing the binding affinities of protein-ligand complexes cannot yet be done very accurately by a general scoring function. The calculation is especially challenging due to the combinatorial explosion of possible conformational states of the flexible protein and ligand, and of the surrounding water molecules and ions. In addition, the binding process involves a balance between many different physical interactions: a flexible ligand may gain favorable interactions upon binding, while simultaneously suffering a substantial entropic penalty as a result of binding. For example, charged polar groups may gain favorable electrostatic interactions when binding, while simultaneously losing favorable interactions with the solvent. Even when polar groups are solvent-exposed, which is energetically favorable for them, this favorability is partially tempered by the loss of entropic freedom of nearby water molecules. The contribution of each of these interactions can be substantial, yet they tend to cancel each other out; therefore the total binding free energy involves a delicate balance, and inaccuracies in the computation of any one type of interaction can lead to substantial inaccuracies in the computation of total binding free energies [39].
Here we discuss the three main types of scoring functions used for docking. Firstly, there are the force-field-based approaches, which attempt to exhaustively model the many types of interactions involved in protein-ligand binding using physics-based functional forms and parameters that are derived from experiments or quantum mechanical simulations. Secondly, there are empirical approaches, in which regression or machine learning methods are used to associate the desired prediction, typically the binding affinity of the complexes, with general features of those complexes such as the number of hydrogen-bonding pairs. Finally, there are statistical potentials, in which energy-like terms are assigned to structural features of protein-ligand interactions based on the frequency with which those features occur in a training set of protein-ligand complexes.

2.1.1. Force-Field-Based Potentials

Force-field-based potentials can be used in protein-ligand docking as well as molecular dynamics (MD) simulations. They generally include a number of terms representing the various kinds of physical interactions that dominate protein-ligand binding. There are many popular force-field-based potentials in use for various applications, but most of them are quite similar in functional form. The main differences between them are which terms are included in the functional form and which specific values are used for the parameters in those terms. These parameters can be derived from experiment or fitted based on quantum mechanical simulations [44]. Consequently, the entities that are referred to as force fields in docking and MD simulations are typically sets of parameters for use with the functional forms described below. One popular functional form of the force-field-based potentials is the one associated with the AMBER molecular dynamics software package [45].
The AMBER force fields take the following functional form [46].
$$E_{\mathrm{AMBER}} = E_{\mathrm{angle}} + E_{\mathrm{bond}} + E_{\mathrm{dihedral}} + E_{\mathrm{non\text{-}bonded}} \tag{1}$$
$E_{\mathrm{angle}}$ and $E_{\mathrm{bond}}$ are harmonic approximations of the bond-angle bending and bond-stretching (strain) energies, respectively, and $E_{\mathrm{dihedral}}$ is an energy term associated with the dihedral angles of linearly-bonded sets of four atoms (especially the backbone dihedral angles of proteins). The term $E_{\mathrm{non\text{-}bonded}}$ aggregates the non-bonded interactions: a Lennard-Jones 6-12 potential, which approximates the van der Waals attraction and Pauli repulsion [47], and an electrostatic potential term.
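To illustrate the non-bonded term, the sketch below sums a Lennard-Jones 6-12 term and a Coulomb term over ligand-protein atom pairs. It is a simplified illustration rather than the AMBER implementation: the function names, the single shared well depth and minimum-energy distance, and the rounded Coulomb conversion factor are all assumptions made for brevity.

```python
import math

# Approximate Coulomb conversion factor in kcal*Angstrom/(mol*e^2); the exact
# value differs slightly between software packages.
COULOMB_CONSTANT = 332.06

def lennard_jones(r, well_depth, r_min):
    """Lennard-Jones 6-12 energy; the minimum is -well_depth at r = r_min."""
    ratio6 = (r_min / r) ** 6
    return well_depth * (ratio6 ** 2 - 2.0 * ratio6)

def coulomb(r, q_i, q_j, dielectric=1.0):
    """Electrostatic energy between two point charges given in units of e."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)

def nonbonded_energy(ligand_atoms, protein_atoms, well_depth=0.1, r_min=3.8):
    """Sum pairwise terms; each atom is a tuple (x, y, z, charge).

    Using one well_depth and r_min for every pair is a simplification;
    real force fields look these parameters up per atom-type pair.
    """
    total = 0.0
    for (*pos_l, q_l) in ligand_atoms:
        for (*pos_p, q_p) in protein_atoms:
            r = math.dist(pos_l, pos_p)
            total += lennard_jones(r, well_depth, r_min) + coulomb(r, q_l, q_p)
    return total
```

The steep repulsive wall of the 12th-power term is what makes close contacts so costly, which is why many docking programs filter clashing poses before computing a full score.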
The ff94 force field, which uses the AMBER functional form, has been very popular for simulating proteins [44], as have several subsequent versions such as the AMBER 99SB force field, which differs from ff94 in the parameters associated with the backbone torsion angles [44]. The general AMBER force field (GAFF) offers parameters suitable for simulating small organic molecules such as drugs [45].
The CHARMM force fields are similar to the AMBER force fields, but include some additional terms.
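$$E_{\mathrm{CHARMM}} = E_{\mathrm{angle}} + E_{\mathrm{bond}} + E_{\mathrm{dihedral}} + E_{\mathrm{non\text{-}bonded}} + E_{\mathrm{improper}} + E_{\mathrm{UB}} \tag{2}$$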
The terms $E_{\mathrm{bond}}$, $E_{\mathrm{angle}}$, $E_{\mathrm{dihedral}}$, and $E_{\mathrm{non\text{-}bonded}}$ have functional forms like those in the AMBER force fields, but the parameter values may differ. The Urey-Bradley term, $E_{\mathrm{UB}}$ [48], is based on the distance between the outer atoms when three atoms are linearly-bonded to each other. The $E_{\mathrm{improper}}$ term provides an energy penalty for improper dihedral angles and helps to control the interconversion of stereocenters. The parameters in $E_{\mathrm{UB}}$ and $E_{\mathrm{improper}}$ can be optimized based on vibrational spectra [48]. The CHARMM22 force field is one of the popular ones that use the functional form defined in Equation (2) and is suitable for modeling proteins [48]. The more-recent CGenFF is suitable as a general force field for small molecules [49].
In addition to functional forms and parameterization, the force-field-based approaches are also distinguished by the method of simulating the solvent. The most obvious approach is to model explicitly all of the water molecules in the vicinity of a ligand-receptor complex and their interactions with the protein and ligand. There are a variety of explicit water models, which are distinguished by the number of sites used to represent the charge distribution of each water molecule: TIP3P uses three sites to represent the charge of the oxygen and two hydrogens, TIP4P splits the oxygen into two sites to better represent the charge distribution, and so on [50]. Due to the large number of degrees of freedom of the water molecules, simulating them explicitly is very computationally expensive.
To simulate some of the effects of the solvent while reducing the computational expense, implicit solvent approximations were introduced. These approximations generally start with the assumption that the solvent can be treated as a continuous dielectric medium with a charge distribution and a resulting electrostatic potential that obeys the Poisson-Boltzmann (PB) equation [51,52,53,54]. The PB equation can be used directly, or alternatively, further simplifications are possible. The most common of these is the generalized Born (GB) model of solvation, in which the protein and ligand atoms are modeled as spheres with a different dielectric constant than the solvent [39,55,56,57,58,59,60,61]. The PB and GB models provide adequate approximations of the electrostatic effects of the solvent. In order to also include an approximation of favorable hydrophobic-hydrophobic interactions, the solvent-accessible (SA) surface area method may be used in combination with the PB and GB models [55]. In this method, the free energy of solvation is assumed to be proportional to the solvent-accessible surface area of the atoms, where the contribution of each atom depends on its type. The resulting models, PB/SA and GB/SA respectively, provide high-speed approximations for the major energetic effects of the solvent [62,63,64,65,66,67].
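As a hedged illustration of the surface-area term described above, the contribution can be written as a sum over atoms. The symbols $\sigma_{t(i)}$ (a coefficient that depends on the type $t(i)$ of atom $i$) and $A_i$ (the solvent-accessible surface area of atom $i$) are notation assumed here and do not appear in the original text:

```latex
\Delta G_{\mathrm{SA}} \approx \sum_{i} \sigma_{t(i)} \, A_i
```

Buried atoms have $A_i = 0$ and contribute nothing, so the term changes only when binding exposes or buries surface.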
To further decrease the complexity, there are empirical solvent methods. In these methods, the electrostatic forces between the protein and ligand are modulated by an empirical distance-dependent parameter that roughly models the tendency of water to screen the electrostatic forces between charged atoms. For example, this approach was used in DOCK [18].
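A common way to realize such distance-dependent screening is to let the dielectric grow linearly with distance, so that the Coulomb energy falls off as 1/r² instead of 1/r. The sketch below assumes this linear form with a slope of 4; both the functional form and the slope are illustrative assumptions, as are the function names, and they should not be read as the exact settings of any particular program.

```python
COULOMB_CONSTANT = 332.06  # approximate, kcal*Angstrom/(mol*e^2)

def screened_coulomb(r, q_i, q_j, slope=4.0):
    """Coulomb energy with a distance-dependent dielectric eps(r) = slope * r."""
    dielectric = slope * r
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)

def vacuum_coulomb(r, q_i, q_j):
    """Unscreened Coulomb energy, shown for comparison."""
    return COULOMB_CONSTANT * q_i * q_j / r
```

Because no solvent molecules or boundary equations are evaluated, this costs no more than a vacuum calculation, at the price of ignoring the shape of the solute-solvent boundary.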
Finally, it is worth noting that the force-field-based potentials mentioned above give estimates of the internal energy of the protein-ligand system in a specific microstate rather than the free energy of binding. In principle, one can use direct integration of the partition function to compute the free energy. Computing the full partition function is typically intractable, but sometimes approximations of the partition function are used [68]. In practice, to estimate the binding free energy using force-field-based potentials, it is necessary to either include the entropic contribution to free energy as an additional approximate term, or to employ umbrella sampling or free energy perturbation methods [69].

2.1.2. Empirical Scoring Functions

Given their relatively complex functional forms, the force-field-based approaches described in the previous subsection are computationally intensive. In order to provide a higher-speed alternative, researchers introduced empirical scoring functions [70,71]. Like force-field-based potentials, empirical scoring functions contain terms that are based on structural features and are often inspired by physical interactions. However, empirical scoring functions differ in that the underlying functional form of these terms is simplified in an effort to capture the favorability of an interaction without capturing the underlying physics of the interaction [72].
Empirical scoring functions combine features such as hydrophobic contacts, hydrophilic contacts, or number of hydrogen bonds, and parameterize these features as favorable or disfavorable based on regression or machine learning methods. Typically the parameters are optimized to predict the binding affinities of a set of protein-ligand complexes that are used as a training set [70,71]. In this way, empirical scoring functions are reminiscent of the ligand-based models mentioned previously, except that instead of building a specialized model for each drug target, one uses a diverse training set in an attempt to produce a scoring function that can interpolate the binding affinities for drug targets not considered in the training set. Nevertheless, the general performance of empirical scoring functions has been limited by over-simplifications of some of the physical interactions [73]. Some examples of empirical scoring functions include LUDI [24], ChemScore [70], and X-SCORE [74].
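The fitting step can be sketched as an ordinary least-squares regression of affinities on counted features. The example below is a toy under stated assumptions: the three features, the synthetic training data, and the "assumed" weights used to generate the affinities are all hypothetical, and published empirical scoring functions use richer terms and far larger training sets.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution

def fit_weights(feature_rows, affinities):
    """Least squares via the normal equations (X^T X) w = X^T y."""
    design = [[1.0] + list(row) for row in feature_rows]  # leading 1.0 is the intercept
    n = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(design, affinities)) for i in range(n)]
    return solve_linear_system(xtx, xty)

def score(weights, feature_row):
    """Predicted affinity: intercept plus a weighted sum of the features."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feature_row))

# Hypothetical features per complex: hydrogen bonds, hydrophobic contacts,
# rotatable bonds frozen on binding.
features = [[3, 10, 2], [1, 4, 5], [5, 12, 1], [2, 8, 3], [0, 3, 6], [4, 15, 4]]
assumed_weights = [-1.0, -0.8, -0.2, 0.3]  # synthetic "truth" for this toy only
affinities = [score(assumed_weights, row) for row in features]
fitted = fit_weights(features, affinities)
```

Because the toy affinities are generated without noise, the fit recovers the assumed weights; with real data the residual error reflects both experimental noise and the physics that the simplified terms leave out.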

2.1.3. Statistical Potentials

Besides the empirical scoring functions, there is another type of scoring function that uses a simpler functional form than the ones given in Equations (1) and (2). Statistical potential-based scoring functions (also known as knowledge-based scoring functions) assign energy-like quantities to structural features, based on the frequency with which those features are found to occur in a training set of suitable examples, such as a set of protein-ligand complexes, relative to a reference state [75,76,77,78,79]. Usually, the inverse-Boltzmann equation is used to provide the relationship between the frequency of features and the energy that is assigned to those features. For protein-ligand interactions, the energy assigned to the interaction between ligand atom type $i$ and protein atom type $j$ at a distance of $r_k$ (the distance of the $k$-th bin) can be computed as follows [80].
$$E_{ij}(r_k) = -k_B T \ln\left[\frac{\rho_{ij}(r_k)}{\rho_{ij,\mathrm{ref}}}\right] \tag{3}$$
The quantity $\rho_{ij}(r_k)/\rho_{ij,\mathrm{ref}}$ is the relative radial density for atom pair type $ij$ within the training set and is a function of the binned distance $r_k$. The density $\rho_{ij,\mathrm{ref}}$ associated with the reference state may be computed using an ideal gas approximation or other approaches [81,82].
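The inverse-Boltzmann relation can be sketched numerically by converting binned pair counts into densities and taking the negative logarithm of their ratio to a reference density. The sketch below assumes a uniform (ideal-gas-like) reference equal to the average density over all bins and a thermal energy of about 0.593 kcal/mol near room temperature; the function names and the choice to return `None` for empty bins are illustrative assumptions.

```python
import math

K_B_T = 0.593  # kcal/mol near 298 K

def shell_volume(r_inner, r_outer):
    """Volume of the spherical shell between two radii."""
    return 4.0 / 3.0 * math.pi * (r_outer ** 3 - r_inner ** 3)

def pair_potential(pair_counts, bin_edges):
    """Energy per distance bin for one atom-pair type.

    pair_counts[k] is the number of pairs observed in bin k of the training
    set. Bins with no observations have no defined energy and yield None.
    """
    volumes = [shell_volume(a, b) for a, b in zip(bin_edges[:-1], bin_edges[1:])]
    reference_density = sum(pair_counts) / sum(volumes)  # uniform reference
    energies = []
    for count, volume in zip(pair_counts, volumes):
        if count == 0:
            energies.append(None)
            continue
        density = count / volume
        energies.append(-K_B_T * math.log(density / reference_density))
    return energies
```

Bins that are populated more densely than the reference receive negative (favorable) energies, and bins populated less densely receive positive ones; empty bins show directly why sparsely observed features are problematic.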
The derivation of statistical potentials using the inverse-Boltzmann relation is not necessarily physically rigorous, and typically involves some false assumptions. One example is the assumption that the occurrences of features in the training set are conditionally independent of each other, given the energies associated with those features. In applications such as protein interactions and protein structure prediction, the interdependencies neglected by the derivation can manifest in the form of problems such as the excluded volume problem [77]. However, much like naïve Bayes classifiers, statistical potentials based on the inverse-Boltzmann relation have performed well in a variety of applications, regardless of the existence of dependencies within the feature set. Some more recent works have used a multibody approach that reduces this problem [83,84].
Another open problem in the implementation of statistical potentials is the reference state problem [77,79,82,85,86]. In order to use the inverse-Boltzmann relation to assign energies to features such as protein-ligand atom pair distances as in Equation (3), it is necessary to define a representative non-interacting state, which provides the frequencies of features one would expect to see in the training set if the features were energetically neutral. A simple ideal gas approximation may be used, and many alternatives have been proposed [81,87,88,89]. Thomas et al. introduced an iterative method of deriving a statistical potential that helps avoid the need to define a specific reference state. In this method, which was developed for protein folding, the interaction potentials between residues were iteratively adjusted based on the difference between the residue pair frequencies in the native structures and the residue pair frequencies in a Boltzmann-weighted ensemble of decoy conformations [78]. The extension of this approach to atomic, distance-dependent statistical potentials is non-trivial, due to the involvement of high-dimensional parameter optimization. This challenge was addressed by later methods, which have been applied to protein-ligand interactions, protein-protein interactions, and protein-RNA interactions [31,90,91,92].
Finally, there is also the sparse data problem. The inverse-Boltzmann relation in Equation (3) maps the observed frequency of features in a training set to the energies assigned to those features; for features that occur infrequently in the training set, the derived energies are inaccurate or undefined. Even when very large training sets are available, the problem persists due to physically disallowed states such as very close atom pair distances (i.e., clashes) [93,94].
Given the many approaches that have been used to tackle the problems mentioned above and the diverse applications, there are many examples of statistical potentials. Some of the popular ones for protein-ligand interactions include DFIRE [95], DrugScore [96,97], ITScore [31,32], PMF-score [80], and SMoG [98].

2.1.4. Summary

The force-field-based potentials, statistical potentials, and empirical scoring functions each offer specific advantages and consequently tend to be used for different applications. Force-field-based potentials separate the types of physical interactions within the system into separate terms, and can therefore provide information on the contribution of these interactions to the internal energy of the system [45,46,48]. Force-field-based potentials can also provide a more detailed simulation of the solvent, especially when explicit water is used. These advantages come at the expense of much greater computational complexity, so force-field-based potentials are not commonly used for high-throughput docking studies.
Empirical scoring functions [70,71] use a much simpler functional form that permits high-speed implementations for virtual database screening. Empirical scoring functions often give good performance for families of proteins or compounds that are similar to complexes within the training set, but do not tend to generalize well to different protein families. As the number of protein-ligand crystal structures with known affinities increases, the ability of empirical scoring functions to provide good general performance is likely to increase [99].
Like empirical scoring functions, statistical potentials use a simple functional form allowing for faster implementations than are typically possible with force-field-based potentials [75,76,77,78,79]. Unlike empirical scoring functions, the interaction terms in statistical potentials are not typically fitted to reproduce the affinities associated with a set of protein-ligand complexes; instead, the terms are derived based on a presumed relationship between the frequency of features in a training set and the energies associated with those features. The derivation of a statistical potential is therefore less prone to over-fitting, and the performance can generalize well to protein-ligand complexes that differ from those in the training set. Statistical potentials are advantageous when low computational complexity is desired and when the performance of the potential is expected to generalize well to cases for which the training set provides poor coverage.
There have been some efforts to combine different scoring functions in a way that provides a compelling combination of the advantages in each scoring function. An early example of a consensus scoring function may be found in [100]. Other examples of the consensus scoring approach include MultiScore [101], X-Score [74], and VoteDock [102]. The scoring function presented in [94] to deal with the sparse data problem can also be considered a consensus approach.
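One simple way to combine scoring functions is to rank the ligands under each function separately and then order them by mean rank. The sketch below illustrates that generic idea only; it is not the algorithm of MultiScore, X-Score, VoteDock, or any other cited method, and the ligand names and scores are hypothetical. Lower scores are assumed to be better.

```python
def rank_positions(scores):
    """Map each ligand to its rank (0 = best) under one scoring function."""
    ordered = sorted(scores, key=scores.get)
    return {ligand: rank for rank, ligand in enumerate(ordered)}

def consensus_by_mean_rank(score_tables):
    """Order ligands by their average rank across several scoring functions."""
    ligands = list(score_tables[0])
    ranks = [rank_positions(table) for table in score_tables]
    mean_rank = {lig: sum(r[lig] for r in ranks) / len(ranks) for lig in ligands}
    return sorted(ligands, key=lambda lig: mean_rank[lig])

# Hypothetical scores from three different kinds of scoring function.
force_field = {"lig_a": -9.1, "lig_b": -7.4, "lig_c": -8.0}
empirical = {"lig_a": -6.5, "lig_b": -6.0, "lig_c": -5.0}
statistical = {"lig_a": -120.0, "lig_b": -90.0, "lig_c": -100.0}
consensus_order = consensus_by_mean_rank([force_field, empirical, statistical])
```

Working with ranks rather than raw scores sidesteps the fact that different scoring functions report values on different, non-comparable scales.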

2.2. Sampling Methods

The other fundamental challenge facing protein-ligand docking methods is sampling. Protein-ligand binding involves changes in the relative orientation and conformation of the ligand, as well as possible conformational changes to the protein. Docking software attempts to sample these possible changes with varying degrees of exhaustiveness [103].
The simplest approach for sampling the possible ligand binding modes is rigid docking; the docking software can simply explore the six degrees of translational and rotational freedom and filter those with poor shape complementarity before final scoring. This approach was used by older versions of DOCK [104] as well as MDock [31,32]. Ligand flexibility can still be considered by such software by pre-computing ensembles of putative ligand conformations, using software such as OMEGA (OpenEye Scientific Software, Santa Fe, NM, USA) [105,106], and rigidly docking each conformation to the protein receptor of interest.
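The six rigid-body degrees of freedom can be sampled, in the simplest case, by drawing random rotations and translations and discarding poses that fail a geometric filter. The sketch below is a minimal illustration under several assumptions: the ligand coordinates are centered on the origin, translations are drawn uniformly inside a cubic box around the binding site, and a plain distance-based clash test stands in for a real shape-complementarity filter. All names and thresholds are hypothetical.

```python
import math
import random

def random_rotation_matrix(rng):
    """Uniform random rotation built from a normalized Gaussian quaternion."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

def transform(coords, rotation, translation):
    """Rotate each point about the origin, then translate it."""
    return [
        tuple(sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
              for i in range(3))
        for p in coords
    ]

def has_clash(ligand_coords, protein_coords, min_distance=2.5):
    """True if any ligand atom is closer than min_distance to a protein atom."""
    return any(math.dist(a, b) < min_distance
               for a in ligand_coords for b in protein_coords)

def sample_rigid_poses(ligand_coords, protein_coords, box_half_width,
                       n_trials, seed=0):
    """Return the clash-free poses among n_trials random rigid placements."""
    rng = random.Random(seed)
    poses = []
    for _ in range(n_trials):
        rotation = random_rotation_matrix(rng)
        translation = [rng.uniform(-box_half_width, box_half_width)
                       for _ in range(3)]
        pose = transform(ligand_coords, rotation, translation)
        if not has_clash(pose, protein_coords):
            poses.append(pose)
    return poses
```

Surviving poses would then be passed to a scoring function. To account for ligand flexibility in this scheme, the same routine can be run once per pre-computed ligand conformation.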
There are also docking approaches that sample the possible ligand conformations on-the-fly. One method of on-the-fly sampling is the incremental construction method, also known as the anchor-and-grow method. A rigid central portion of the ligand is placed in the binding site, and the rest of the ligand is incrementally grown from this rigid anchor, filtering out those possibilities that clash with the protein receptor during the process [20,107]. DOCK uses this approach [20]. Similarly, there are also fragmentation methods, in which multiple rigid fragments are placed within the binding site and the docking software attempts to link these pieces together to reconstruct plausible conformations of the target ligand. LUDI uses this approach [24].
Another approach for sampling ligand conformations is the hierarchical docking method. In this approach, low-energy conformations for each ligand are pre-computed and aligned so that as many atoms as possible are identically-positioned. Each ensemble of pre-generated ligand conformations is organized into a hierarchy so that similar conformations are similarly positioned within the hierarchy. Then, for each possible translation and rotation of the ligand, the docking software makes use of the hierarchical data structure to simultaneously prune or filter sets of conformations that are not sterically possible for the given translation and rotation. For example, if an atom near the rigid center of the ligand is found to clash with the protein in a given rotation/translation, the method can confidently reject all of the descendent conformations in the hierarchy for that rotation/translation, because the descendants must contain the same clash, without having to sample each descendant conformation individually [108]. The Glide software package uses hierarchical filters during ligand sampling [28,29].
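The pruning idea can be sketched with a small tree in which each node stores only the atoms it adds to those of its ancestors. This is an illustration of the general principle, not the implementation used by any particular docking program; the class name, the clash threshold, and the assumption that all coordinates have already been placed for one fixed rotation and translation are hypothetical.

```python
import math
from dataclasses import dataclass, field

@dataclass
class ConformerNode:
    new_atoms: list                      # atoms added here, shared by all descendants
    children: list = field(default_factory=list)
    conformer_id: str = None             # set on leaf nodes only

def surviving_conformers(node, protein_coords, min_distance=2.5):
    """Return ids of conformers with no clash, pruning whole subtrees early."""
    clash = any(math.dist(atom, p) < min_distance
                for atom in node.new_atoms for p in protein_coords)
    if clash:
        return []  # every descendant contains these atoms, so all are rejected
    if not node.children:
        return [node.conformer_id]
    survivors = []
    for child in node.children:
        survivors.extend(surviving_conformers(child, protein_coords, min_distance))
    return survivors
```

A clash found near the root eliminates many conformations with a single distance check, which is where the savings over testing each conformation individually come from.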
In addition to these methods of sampling ligand conformations, there are methods to handle protein flexibility. One simple approach, known as ensemble docking, is to rigidly dock the ligands to several putative conformations of the protein, to represent some of the protein’s conformational variability [33,34,109,110]. Another approach, which may be used alone or in conjunction with ensemble docking, is energy minimization. Minimization may be performed using Monte Carlo methods or gradient descent minimization to help simulate some of the induced fit that occurs when a ligand binds to a protein receptor [111]. Finally, one can also attempt to explore the conformational space of critical residues of the protein, using methods analogous to the ligand methods mentioned before. For example, AutoDock4 [112] and AutoDock Vina [30] can adjust the rotatable bonds of critical residues in order to simulate protein conformational changes during binding.
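As a minimal sketch of docking to several receptor conformations, the loop below keeps the best score for each ligand over all conformations. The function `ensemble_dock` and the caller-supplied `dock_score` callable are hypothetical placeholders for a real docking program. The sketch assumes that a lower score is better, and the scores in the example are invented toy values.

```python
def ensemble_dock(ligands, receptor_confs, dock_score):
    """Return {ligand: (best_score, best_conf)} over all receptor conformations.

    `dock_score(ligand, conf)` is a caller-supplied placeholder for a rigid
    docking run; lower scores are assumed to be better.
    """
    best = {}
    for lig in ligands:
        scored = [(dock_score(lig, conf), conf) for conf in receptor_confs]
        best[lig] = min(scored, key=lambda pair: pair[0])
    return best


# Toy scores standing in for real docking results.
toy_scores = {
    ("lig1", "apo"): -6.1, ("lig1", "holo"): -8.4,
    ("lig2", "apo"): -7.0, ("lig2", "holo"): -5.2,
}
result = ensemble_dock(["lig1", "lig2"], ["apo", "holo"],
                       lambda lig, conf: toy_scores[(lig, conf)])
print(result)
```

Taking the minimum is only one possible way to combine the per-conformation scores; other combination rules could be substituted in the same loop.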

2.3. Recent Topics

While sampling and scoring constitute the fundamental challenges in docking, much research focuses on more specialized topics. We do not attempt to exhaustively review all of the active topics, but instead sample a few topics that have received a large amount of recent attention.

2.3.1. Structural Water

As mentioned in Section 2.1.1, water plays an important role in protein-ligand binding, often counteracting the attractive interactions between the protein and ligand and resulting in a delicate balance of forces that is difficult to model accurately [113]. One aspect of the solvent that has been increasingly recognized as a major actor in protein-ligand interactions is structural water. In the vicinity of the protein and ligand, molecules of water may become bound or semi-bound in certain favorable positions, stabilized by hydrogen bonds, and these structural water molecules can play a critical role in the stability of a protein-ligand interaction [114,115]. In work by Lie et al., modeling structural water molecules was found to increase the accuracy of docking simulations, up to a binding mode success rate of 67% [116]. Other work involving the inclusion of structural or bridging water molecules into high-accuracy protein-ligand docking simulations also shows improvements in accuracy [117]. In one paper, the ability of Rosetta [118] to reproduce the binding mode of the HIV-1 protease/protease inhibitor crystal structures was investigated. It was found that the inclusion of just a single structural water molecule in the interface was crucial for accurate prediction of an inhibitor binding pose [119].
Docking methods improved by structural water simulation have seen practical applications, such as an inverse docking application in [120].

2.3.2. Ligand Promiscuity

It has been well-recognized that drugs may bind to many targets with significant affinities, and that this drug promiscuity gives rise to a complex polypharmacology with clinical relevance to the toxicity and side effects of pharmaceuticals [3,121]. The utility of considering such ligand promiscuity early in the drug design process has already been well-recognized. A review by Taboureau et al. noted the regulatory recommendation that all new drug candidates be tested for their potential to block the human Ether-à-go-go-Related Gene (hERG) potassium channel, given the substantial risk of cardiotoxic side effects such as arrhythmias [122]. They considered in silico screening to be a useful step in identifying cardiotoxic leads before they are given larger investments.
Besides the prediction of toxicity and side effects, the tendency of ligands to bind to several sites presents another challenge: if a ligand binds tightly somewhere, even near the desired binding site, it may still not substantially affect the drug target in question. Gowthaman et al. point out that this is particularly important for non-traditional drug targets, such as a target within the interface of protein-protein interactions. For such targets, compounds that bind are often inadequate, if they do not bind in a sufficiently buried manner to achieve good ligand efficiency [123]. Perez-Nueno et al. introduced a ligand-based approach that uses shape matching to identify promiscuous ligands [124].
On the other hand, particularly for multifaceted diseases such as cancers or metabolic disorders, it may be desirable for a drug to bind to multiple targets. Peng et al. review the chemogenomics approaches in which a spectrum of a ligand’s interactions with many drug targets is predicted by structure- or ligand-based methods. Using these approaches, one can attempt to increase those interactions within the spectrum that are desired while simultaneously reducing unwanted interactions [125].

2.3.3. Accurate Models of the Protein Receptor

Docking studies often employ comparative (or homology) models of the protein target that are based on the crystal structures of homologous proteins. The methodology behind building homology models is outside the scope of this review, but here we remark on the popularity of the approach in practice. It is notable that a number of recent successful virtual screening projects used homology models of the protein receptor, and for some the best template for homology modeling was fairly divergent from the target structure, in terms of percent sequence identity [126,127,128,129,130,131,132].
Nguyen et al. investigated the accuracy of predicting ligand binding modes in comparative models of G-protein coupled receptors. The researchers found that for the best models with template structures over 50% sequence identity, the accuracy of binding mode prediction was within 2.9 Å RMSD (root-mean-square deviation) from the native experimental structure on average. In cases of low sequence similarity, it is challenging to produce a homology model with sufficient accuracy to use as a basis for virtual screening [133,134], but percent sequence identity is not the only useful metric. It has also been suggested that choosing a template based on ligand occupancy can yield a better homology model for docking than choosing one based on percent sequence identity [135].

3. Protein-Ligand Docking Approaches

Having introduced structure-based drug design and the methodologies of protein-ligand docking, we will now sample the most common research approaches in which such methods have been applied.

3.1. Screening for New Inhibitors

Docking methods have a long and successful history of identifying new protein inhibitors and enriching compound databases in structure-based virtual screening. Here we discuss some examples of this common application.
Recently, Mahasenan et al. used structure-based virtual screening to identify new inhibitors of maternal embryonic leucine zipper kinase (MELK), an important kinase target known to be involved in several types of cancer. As signaling molecules, kinase targets tend to be challenging for docking methods due to their tendency to undergo major conformational changes induced by ligand binding [136]. Their three discovered inhibitors vary in affinity from 0.37 µM to 18 µM, and may have future applications in diseases involving mis-regulation of MELK [136].
In another recent work, Heusser et al. performed a virtual screening study of the Gloeobacter violaceus ligand-gated ion channel (GLIC), a bacterial homolog of GABAA receptors, to search for compounds that bind to the same site as the anesthetic propofol. From a database of commercially available compounds, 29 compounds were experimentally tested, of which 16 were found to exhibit significant inhibition of GLIC relative to dimethyl sulfoxide. The active compounds were further tested on GABAA receptors. One of the compounds, like propofol, was found to inhibit both GLIC and GABAA receptors, suggesting that the GLIC receptor may be a plausible model system for GABAA receptor ligands.
In a third example, Tahir et al. used MODELLER [137] to build a homology model of TNFRSF10B protein, which is believed to inhibit tumor formation [130]. In an effort to further understand this protein, they also used protein-ligand docking to screen compounds from the Mcule compound database [138] for new potential inhibitors [130].
In a final example, a series of substituted heteroaromatic piperazine and piperidine derivatives were found through virtual screening based on the structure of human enterovirus 71 capsid protein VP1. The preliminary biological evaluation revealed that two of the compounds (8e and 9e) have potent activity against EV71 and Coxsackievirus A16 with low cytotoxicity [139].

3.2. Hybrid Approaches for Drug Design

The structure- and ligand-based methods of performing virtual database screening are not simply competing alternatives to perform the same task. They each have unique strengths and weaknesses and can therefore play a complementary role in the drug design process and other applications. Such hybrid approaches have become increasingly popular [140]. Here are a few recent examples.
One example of a hybrid approach may be found in Ahmed et al. In this work, the binding profiles of the spherical C60 version of fullerene and its derivatives were investigated. Aside from the remarkable physicochemical characteristics of these molecules, fullerene and its derivatives are increasingly investigated for their unique biological effects [141]. The hybrid approach used by Ahmed et al. included quantum-mechanical calculations, protein-ligand docking and QSAR. They used quantum-mechanical calculations to determine geometries, dipole moments, orbital energies, and other parameters of the fullerene derivatives. They used protein-ligand docking software including AutoDock Vina [30] and Schrödinger Glide [28,29] to search for possible binding modes of the fullerene derivatives’ interactions with HIV-1 protease and to identify which residues of HIV-1 protease tend to be involved in the binding. They also compared the docking scores with experimental binding affinities. Finally they used genetic algorithms to choose a suitable QSAR model predictive of the fullerene derivatives’ binding activity. The most important features in the QSAR model were found to be the 3D-molecular geometry of the fullerene derivative, its number of ring systems, and its specific topology [141].
Another work that used a hybrid approach of structure- and ligand-based methods can be found in a study identifying four inhibitors of heat shock protein 90 (Hsp90), which is an important chaperone protein and anticancer drug target [142]. In this work, the researchers built a QSAR model to perform ligand-based virtual screening [142], and used a combined ligand-based/structure-based protocol to screen 1785 compounds for their predicted ability to bind to Hsp90 [143]. 80 of the predicted compounds were further evaluated by experiment and found to inhibit Hsp90 with IC50 values between 18 and 63 µM. The compounds contain possible new molecular scaffolds capable of inhibiting Hsp90 [142,143].
The last example of the hybrid approach that we will mention here involved DNA G-quadruplex structures, which are found in some critical positions within the genome such as near the telomeres and gene promoter regions. Unsurprisingly, they are involved in cellular aging and cancers [144]. Alcaro et al. used a hybrid approach to screen a database of commercially available compounds for their predicted ability to bind G-quadruplex structures. Before this work, there were already a variety of known binders for G-quadruplex structures. They first screened over one million compounds from the ZINC database [6] using ligand-based methods that compared the compounds in this database to the known binders using both 2D-similarity and 3D-similarity methods. The compounds which passed this first screening were then investigated using ensemble docking simulations on a few of the conformations of telomeric G-quadruplex structures that have been structurally characterized. They analyzed the compounds with the highest docking consensus score using several experimental techniques, and determined that they had found a new G-quadruplex binding moiety [144].

3.3. Mechanistic Studies Using Inverse Docking

Virtual database screening studies do not always start with the identification of a drug target of interest. Often, one is interested in a compound that is known to have an important biological effect, but for which the underlying molecular mechanism is unknown [145]. Consequently, rather than looking for small molecules that bind to a binding site of interest, protein-ligand docking methods may instead be used to perform the inverse search, called inverse docking [146,147]. Inverse docking involves some additional challenges. Relative scoring of protein-ligand complexes that differ according to the protein rather than the ligand is challenging for a number of reasons. Firstly, one needs structures or models of the protein receptors to be screened, but the structures of many proteins have not been solved. This necessitates the laborious process of gathering those proteins relevant to the research in question and determining the location of the binding sites. Alternatively, one may use a curated repository of known drug targets, such as the Potential Drug Target Database [148]. Secondly, proteins are often found to exist in several closely related isoforms, so the scoring function in inverse docking is challenged by the need to rank these subtle differences [149]. Thirdly, scoring functions are usually validated on benchmarks that determine their ability to accurately rank entirely different protein-ligand complexes, or many ligands against a smaller number of proteins. Benchmarks do not usually contain many examples of the same ligand docked to many different proteins, so the performance of most docking methods is more doubtful in this application. Despite these challenges, inverse docking is a popular and useful approach.
A recent application of inverse docking may be found in [150]. Some plant-derived isoprenoids have antiparasitic effects but the relevant molecular targets of these compounds were unknown. Noting the mortality of leishmaniasis, especially in some tropical regions due to the poor availability of resources to fight drug-resistant parasites, Ogungbe and Setzer used an inverse docking approach to investigate the underlying molecular mechanism of the relevant antiparasitic isoprenoids. Specifically, they compiled the known protein targets of the drugs used to treat Leishmania and docked the isoprenoids of interest to these proteins in order to predict which of the isoprenoids may share similar targets and to offer some clues regarding their functional mechanisms [150].

4. Docking Benchmarks and Evaluation

Benchmarking plays an important role in the development and improvement of docking methodologies [40]. Public databases combining crystal structures of protein-ligand complexes with experimentally-determined affinity data provide a standard way of assessing the accuracy of the binding mode predictions and binding affinity predictions of protein-ligand docking methods [72,151,152]. In addition to the standard benchmarks, there are prospective evaluations for protein-ligand interaction predictions, also called blind competitions. These prospective evaluations play an important role in the improvement of docking methods by validating new methods on targets that were unknown to the researchers at the time the methodology was developed [153,154,155]. Here we discuss a number of the challenges in benchmarking docking methods.

4.1. Making Testable Predictions

It would be ideal for docking scoring functions, sampling schemes, and other methodologies to be tested in prospective studies in which the targets of the benchmark were unknown at the time the methodology was developed. Such prospective evaluations are not always available when new methods are introduced. In such cases, a rigorous experimental design can help ensure trustworthy evaluations, especially with regard to the independence of the benchmark from the development of the methods. Examples of prospective evaluations of protein-ligand docking methods include CSAR, the Community Structure Activity Resource [153,155], and OpenEye SAMPL [156].

4.2. Assuming Lack of Knowledge of the Native, Bound Conformation

In practice, docking methods are unable to rely on the availability of structurally accurate knowledge of the bound, native conformation of a protein binding site and ligand, due to the conformational changes that occur during binding. A realistic evaluation of docking would require the method to dock an arbitrary conformation of the ligand to either a ligand-free crystal structure of the protein, or if none are available, then a crystal structure bound to a different ligand than the one being docked. This allows the docking software to be tested for its ability either to simulate the induced fit of binding or to compensate for it with a smoother scoring function designed for soft docking. Examples of recent evaluations of docking methods that included unbound evaluations may be found in [157,158].
In addition to the change in protein conformation associated with the induced fit of protein-ligand binding, docking methods also have to deal with the flexibility of ligands. To evaluate scoring function performance in flexible binding mode predictions, one approach is to pre-generate many decoy binding modes for a ligand in the vicinity of the binding site, and test the ability of the scoring function to distinguish between the native pose and decoys. This approach was used for decoy sets that extend the CSAR benchmark [152].
Korb et al. suggested that this approach of testing docking scoring functions with predefined sets of decoy ligands is not adequate for distinguishing a scoring function that performs well in practice from one that performs poorly. The reason is simple: in docking, sampling that is sufficient to identify the native pose and conformation must be very thorough, and so in practice many more diverse poses and conformations are considered during docking than are typically included in the decoy sets generated for scoring function evaluation [159]. It seems that this problem could be mostly avoided by ensuring that the generated decoys are numerous and diverse, or entirely avoided by testing scoring functions simultaneously with sampling, as in realistic practice [160].
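The decoy-based test of a scoring function described in this section can be sketched as a simple success-rate calculation. The function name `top_pose_success_rate`, the 2.0 Å RMSD cutoff, and the convention that a lower score is better are assumptions made for illustration, and the numbers in the example are invented toy values rather than results from any benchmark.

```python
def top_pose_success_rate(complexes, rmsd_cutoff=2.0):
    """Fraction of complexes whose best-scored pose is near the native pose.

    `complexes` is a list with one entry per protein-ligand complex; each
    entry is a list of (score, rmsd_to_native) pairs for the native pose and
    its decoys. Lower scores are assumed to be better.
    """
    if not complexes:
        raise ValueError("no complexes to evaluate")
    hits = 0
    for poses in complexes:
        best_score, best_rmsd = min(poses, key=lambda pose: pose[0])
        if best_rmsd <= rmsd_cutoff:
            hits += 1
    return hits / len(complexes)


# Toy data: the scoring function succeeds on the first complex only.
toy_complexes = [
    [(-9.0, 0.0), (-7.5, 4.2), (-6.0, 8.1)],  # native pose scored best
    [(-5.0, 0.0), (-8.0, 6.3), (-4.0, 3.0)],  # a distant decoy scored best
]
print(top_pose_success_rate(toy_complexes))
```

The outcome of such a test depends on how many decoys are supplied and how diverse they are, which is the limitation raised by Korb et al.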

4.3. Assessing Binding Mode Predictions Involving Symmetric Molecules

Another challenge in evaluating docking methods is the need for special handling of symmetric molecules when evaluating binding mode predictions. The binding mode predictions of a docking method are commonly evaluated using the root-mean-square deviation (RMSD) of atom positions between the known native binding mode of a ligand and its predicted mode according to the docking method. However, comparing the atom positions between two structures of a ligand requires mapping the atoms in the native ligand conformation to the atoms in the docked conformation of the ligand. Due to the symmetry of entire molecules, or substructures within them, these mappings can be ambiguous and a naïve treatment of binding mode evaluations can consider a perfect binding mode prediction to be a poor prediction. Allen et al. recently addressed this problem by using the Hungarian algorithm, which finds the minimum-cost one-to-one assignment between two sets under a cost function. In addition to its other applications in chemical informatics, the algorithm was used by Allen et al. to find the optimal mapping between the atoms of two molecules in RMSD calculations, ensuring that a correct binding mode prediction will be recognized as such [161]. This method has been implemented within DOCK 6 [22] and we anticipate its wide adoption.
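The effect of the atom mapping on RMSD can be illustrated with a minimal sketch. It is not the DOCK 6 implementation: the function names are hypothetical, atoms are matched only within groups of the same element label (an assumption), and bond connectivity is ignored, so the value returned may underestimate a topology-aware symmetry-corrected RMSD. The assignment step is a standard O(n³) formulation of the Hungarian algorithm with squared distance as the cost.

```python
from math import sqrt


def hungarian(cost):
    """Minimum-cost assignment for a square cost matrix.

    Returns `match`, where row i is assigned to column match[i].
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-based)
    v = [0.0] * (n + 1)          # column potentials (1-based)
    p = [0] * (n + 1)            # p[j]: row matched to column j, 0 if none
    way = [0] * (n + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:           # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def naive_rmsd(ref, pred):
    """RMSD using the atom order as given; sensitive to symmetric relabeling."""
    return sqrt(sum(sq_dist(a[1], b[1]) for a, b in zip(ref, pred)) / len(ref))


def symmetry_corrected_rmsd(ref, pred):
    """RMSD after optimally matching atoms that share an element label.

    `ref` and `pred` are lists of (element, (x, y, z)) for the same molecule.
    """
    if len(ref) != len(pred):
        raise ValueError("structures must have the same number of atoms")
    total = 0.0
    for elem in sorted({e for e, _ in ref}):
        a = [xyz for e, xyz in ref if e == elem]
        b = [xyz for e, xyz in pred if e == elem]
        if len(a) != len(b):
            raise ValueError("element counts differ for " + elem)
        cost = [[sq_dist(x, y) for y in b] for x in a]
        total += sum(cost[i][j] for i, j in enumerate(hungarian(cost)))
    return sqrt(total / len(ref))


# Toy carboxylate-like fragment: the two oxygens are listed in swapped order.
native = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0)), ("O", (-1.0, 0.0, 0.0))]
docked = [("C", (0.0, 0.0, 0.0)), ("O", (-1.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0))]
print(naive_rmsd(native, docked), symmetry_corrected_rmsd(native, docked))
```

In the toy example the docked pose coincides with the native pose, but the two equivalent oxygen atoms are listed in the opposite order. The order-dependent calculation reports a nonzero RMSD, while the assignment-based calculation reports zero.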

5. Conclusions

The methods used to simulate the binding of proteins and small molecules face substantial challenges. Two of the fundamental challenges are sampling and scoring [8]. Protein-ligand interactions involve a delicate balance of competing forces, and these forces may occur between flexible structures that can reposition themselves in far too many combinations to sample exhaustively. Another challenge is the need to account for structural water in approximate models of the solvent [114]. In addition, there is demand for methodologies that can adequately evaluate the wide spectrum of possible binding partners of a given ligand; this is especially important for designing drugs that maximize efficacy while minimizing side effects and toxicity [3,121]. Finally, there is also a need for rigorous, wide evaluations of new docking methodologies, which are rapidly being introduced [40,153].
Another important component of protein-ligand binding that is sometimes neglected is the effect of entropy. Rigorous computation of the entropic contribution to binding free energy is intractable for large molecular systems such as protein-ligand complexes. Some approximations have been introduced to deal with entropy [35,162,163], but many of the most popular docking scoring functions for structure-based virtual screening either ignore this important component of protein-ligand binding free energies or use overly simplistic empirical approximations. Efforts to find computationally efficient ways to include the effect of entropy are likely to play a crucial role in the future advancement of docking methodologies.
Despite all the challenges, protein-ligand docking has a long and successful history of practical applications including newly discovered enzyme inhibitors, receptor antagonists and agonists, ion channel blockers, as well as the subsequent approval of new drugs discovered with the help of structure-based drug design. Docking methods have provided new mechanistic insights into protein-ligand binding mechanisms, and have also helped investigate the influence of protein mutations on ligand binding, offering clues regarding the mutations that enable the robust survival of drug-resistant pathogens. As the continued increases in computational power expand the practical applications of molecular models, the field is quickly advancing, with approximations that offer a better tradeoff between accuracy and computational cost. These efforts will undoubtedly lead to many more intriguing applications in the future.

Acknowledgments

Support to XZ from OpenEye Scientific Software Inc. (Santa Fe, NM, USA) is gratefully acknowledged. This work is supported by NSF CAREER Award DBI-0953839, and the American Heart Association (Midwest Affiliate) 13GRNT16990076 (XZ). Additional financial support was provided by Dell, SGI, Sun Microsystems, TimeLogic, and Intel.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liljefors, T.; Krogsgaard-Larsen, P.; Madsen, U. Textbook of Drug Design and Discovery, 3rd ed.; CRC Press: Boca Raton, FL, USA, 2003. [Google Scholar]
  2. Khanna, I. Drug discovery in pharmaceutical industry: Productivity challenges and trends. Drug Discov. Today 2012, 17, 1088–1102. [Google Scholar] [CrossRef] [PubMed]
  3. Overington, J.P.; Al-Lazikani, B.; Hopkins, A.L. How many drug targets are there? Nat. Rev. Drug Discov. 2006, 5, 993–996. [Google Scholar]
  4. Schneider, G.; Böhm, H.J. Virtual screening and fast automated docking methods. Drug Discov. Today 2002, 7, 64–70. [Google Scholar] [CrossRef] [PubMed]
  5. Scior, T.; Bender, A.; Tresadern, G.; Medina-Franco, J.L.; Martínez-Mayorga, K.; Langer, T.; Cuanalo-Contreras, K.; Agrafiotis, D.K. Recognizing pitfalls in virtual screening: A critical review. J. Chem. Inf. Model. 2012, 52, 867–881. [Google Scholar] [CrossRef] [PubMed]
  6. Irwin, J.J.; Shoichet, B.K. ZINC–a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005, 45, 177–182. [Google Scholar] [CrossRef] [PubMed]
  7. Brown, R.D.; Martin, Y.C. The Information Content of 2D and 3D Structural Descriptors Relevant to Ligand-Receptor Binding. J. Chem. Inf. Comput. Sci. 1997, 37, 1–9. [Google Scholar]
  8. Lyne, P.D. Structure-based virtual screening: An overview. Drug Discov. Today 2002, 7, 1047–1055. [Google Scholar] [CrossRef] [PubMed]
  9. Brooijmans, N.; Kuntz, I.D. Molecular recognition and docking algorithms. Annu. Rev. Biophys. Biomol. Struct. 2003, 32, 335–373. [Google Scholar] [PubMed]
  10. Leach, A.R.; Shoichet, B.K.; Peishoff, C.E. Prediction of protein-ligand interactions. Docking and scoring: Successes and gaps. J. Med. Chem. 2006, 49, 5851–5855. [Google Scholar]
  11. Huang, S.Y.; Zou, X. Advances and Challenges in Protein-Ligand Docking. Int. J. Mol. Sci. 2010, 11, 3016–3034. [Google Scholar] [PubMed]
  12. Hansch, C. The physicochemical approach to drug design and discovery (QSAR). Drug Dev. Res. 1981, 1, 267–309. [Google Scholar] [CrossRef]
  13. Ortiz, A.R.; Pisabarro, M.T.; Gago, F.; Wade, R.C. Prediction of drug binding affinities by comparative binding energy analysis. J. Med. Chem. 1995, 38, 2681–2691. [Google Scholar] [PubMed]
  14. Cereto-Massagué, A.; Ojeda, M.J.; Joosten, R.P.; Valls, C.; Mulero, M.; Salvado, M.J.; Arola-Arnal, A.; Arola, L.; Garcia-Vallvé, S.; Pujadas, G. The good,the bad and the dubious: VHELIBS,a validation helper for ligands and binding sites. J. Cheminform. 2013, 5, 36. [Google Scholar] [CrossRef] [PubMed]
  15. Hetényi, C.; van der Spoel, D. Blind docking of drug-sized compounds to proteins with up to a thousand residues. FEBS Lett. 2006, 580, 1447–1450. [Google Scholar] [CrossRef] [PubMed]
  16. Hetényi, C.; van der Spoel, D. Toward prediction of functional protein pockets using blind docking and pocket search algorithms. Protein Sci. 2011, 20, 880–893. [Google Scholar] [CrossRef] [PubMed]
  17. DesJarlais, R.L.; Sheridan, R.P.; Seibel, G.L.; Dixon, J.S.; Kuntz, I.D.; Venkataraghavan, R. Using shape complementarity as an initial screen in designing ligands for a receptor binding site of known three-dimensional structure. J. Med. Chem. 1988, 31, 722–729. [Google Scholar] [CrossRef] [PubMed]
  18. Meng, E.C.; Shoichet, B.K.; Kuntz, I.D. Automated docking with grid-based energy evaluation. J. Comput. Chem. 1992, 13, 505–524. [Google Scholar] [CrossRef]
  19. Kuntz, I.D.; Meng, E.C.; Shoichet, B.K. Structure-Based Molecular Design. Acc. Chem. Res. 1994, 27, 117–123. [Google Scholar] [CrossRef]
  20. Ewing, T.J.; Makino, S.; Skillman, A.G.; Kuntz, I.D. DOCK 4.0: Search strategies for automated molecular docking of flexible molecule databases. J. Comput. Aided Mol. Des. 2001, 15, 411–428. [Google Scholar] [CrossRef] [PubMed]
  21. Moustakas, D.T.; Lang, P.T.; Pegg, S.; Pettersen, E.; Kuntz, I.D.; Brooijmans, N.; Rizzo, R.C. Development and validation of a modular,extensible docking program: DOCK 5. J. Comput. Aided Mol. Des. 2006, 20, 601–619. [Google Scholar] [CrossRef] [PubMed]
  22. Lang, P.T.; Brozell, S.R.; Mukherjee, S.; Pettersen, E.F.; Meng, E.C.; Thomas, V.; Rizzo, R.C.; Case, D.A.; James, T.L.; Kuntz, I.D. DOCK 6: Combining techniques to model RNA-small molecule complexes. RNA 2009, 15, 1219–1230. [Google Scholar] [CrossRef] [PubMed]
  23. Morris, G.M.; Goodsell, D.S.; Halliday, R.S.; Huey, R.; Hart, W.E.; Belew, R.K.; Olson, A.J. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J. Comput. Chem. 1998, 19, 1639–1662. [Google Scholar] [CrossRef]
  24. Böhm, H.J. The computer program LUDI: A new method for the de novo design of enzyme inhibitors. J. Comput. Aided Mol. Des. 1992, 6, 61–78. [Google Scholar] [CrossRef] [PubMed]
  25. Rarey, M.; Kramer, B.; Lengauer, T.; Klebe, G. A fast flexible docking method using an incremental construction algorithm. J. Mol. Biol. 1996, 261, 470–489. [Google Scholar] [CrossRef] [PubMed]
  26. Kramer, B.; Rarey, M.; Lengauer, T. Evaluation of the FLEXX incremental construction algorithm for protein-ligand docking. Proteins 1999, 37, 228–241. [Google Scholar] [CrossRef] [PubMed]
  27. Jones, G.; Willett, P.; Glen, R.C.; Leach, A.R.; Taylor, R. Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 1997, 267, 727–748. [Google Scholar] [CrossRef] [PubMed]
  28. Friesner, R.A.; Banks, J.L.; Murphy, R.B.; Halgren, T.A.; Klicic, J.J.; Mainz, D.T.; Repasky, M.P.; Knoll, E.H.; Shelley, M.; Perry, J.K. Glide: A new approach for rapid,accurate docking and scoring. 1. Method and assessment of docking accuracy. J. Med. Chem. 2004, 47, 1739–1749. [Google Scholar]
  29. Halgren, T.A.; Murphy, R.B.; Friesner, R.A.; Beard, H.S.; Frye, L.L.; Pollard, W.T.; Banks, J.L. Glide: A new approach for rapid,accurate docking and scoring. 2. Enrichment factors in database screening. J. Med. Chem. 2004, 47, 1750–1759. [Google Scholar]
  30. Trott, O.; Olson, A.J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function,efficient optimization,and multithreading. J. Comput. Chem. 2010, 31, 455–461. [Google Scholar] [PubMed]
  31. Huang, S.Y.; Zou, X. An iterative knowledge-based scoring function to predict protein-ligand interactions: I. Derivation of interaction potentials. J. Comput. Chem. 2006, 27, 1866–1875. [Google Scholar]
  32. Huang, S.Y.; Zou, X. An iterative knowledge-based scoring function to predict protein-ligand interactions: II. Validation of the scoring function. J. Comput. Chem. 2006, 27, 1876–1882. [Google Scholar]
  33. Huang, S.Y.; Zou, X. Ensemble docking of multiple protein structures: Considering protein structural variations in molecular docking. Proteins 2007, 66, 399–421. [Google Scholar] [CrossRef] [PubMed]
  34. Huang, S.Y.; Zou, X. Efficient molecular docking of NMR structures: Application to HIV-1 protease. Protein Sci. 2007, 16, 43–51. [Google Scholar] [CrossRef] [PubMed]
  35. Huang, S.Y.; Zou, X. Inclusion of solvation and entropy in the knowledge-based scoring function for protein-ligand interactions. J. Chem. Inf. Model. 2010, 50, 262–273. [Google Scholar] [CrossRef] [PubMed]
  36. Sousa, S.F.; Ribeiro, A.J.M.; Coimbra, J.T.S.; Neves, R.P.P.; Martins, S.A.; Moorthy, N.S.H.N.; Fernandes, P.A.; Ramos, M.J. Protein-ligand docking in the new millennium–a retrospective of 10 years in the field. Curr. Med. Chem. 2013, 20, 2296–2314. [Google Scholar] [CrossRef] [PubMed]
  37. Zhou, H.; Skolnick, J. FINDSITE(comb): A threading/structure-based,proteomic-scale virtual ligand screening approach. J. Chem. Inf. Model. 2013, 53, 230–240. [Google Scholar] [PubMed]
  38. Anderson, A.C. The process of structure-based drug design. Chem. Biol. 2003, 10, 787–797. [Google Scholar] [CrossRef] [PubMed]
  39. Gilson, M.K.; Zhou, H.X. Calculation of protein-ligand binding affinities. Annu. Rev. Biophys. Biomol. Struct. 2007, 36, 21–42. [Google Scholar] [PubMed]
  40. Huang, S.Y.; Grinter, S.Z.; Zou, X. Scoring functions and their evaluation methods for protein-ligand docking: Recent advances and future directions. Phys. Chem. Chem. Phys. 2010, 12, 12899–12908. [Google Scholar] [CrossRef] [PubMed]
  41. Kitchen, D.B.; Decornez, H.; Furr, J.R.; Bajorath, J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nat. Rev. Drug Discov. 2004, 3, 935–949. [Google Scholar] [CrossRef] [PubMed]
  42. Rahaman, O.; Estrada, T.P.; Doren, D.J.; Taufer, M.; Brooks, C.L., 3rd; Armen, R.S. Evaluation of several two-step scoring functions based on linear interaction energy, effective ligand size, and empirical pair potentials for prediction of protein-ligand binding geometry and free energy. J. Chem. Inf. Model. 2011, 51, 2047–2065. [Google Scholar]
  43. Nicolini, P.; Frezzato, D.; Gellini, C.; Bizzarri, M.; Chelli, R. Toward quantitative estimates of binding affinities for protein-ligand systems involving large inhibitor compounds: A steered molecular dynamics simulation route. J. Comput. Chem. 2013, 34, 1561–1576. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  44. Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins 2006, 65, 712–725. [Google Scholar] [CrossRef] [PubMed]
  45. Wang, J.; Wolf, R.M.; Caldwell, J.W.; Kollman, P.A.; Case, D.A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157–1174. [Google Scholar] [CrossRef] [PubMed]
  46. Cornell, W.D.; Cieplak, P.; Bayly, C.I.; Gould, I.R.; Merz, K.M.; Ferguson, D.M.; Spellmeyer, D.C.; Fox, T.; Caldwell, J.W.; Kollman, P.A. A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 1995, 117, 5179–5197. [Google Scholar] [CrossRef]
  47. Jones, J.E. On the Determination of Molecular Fields. II. From the Equation of State of a Gas. Proc. R. Soc. Lond. A 1924, 106, 463–477. [Google Scholar]
  48. MacKerell, A.D.; Bashford, D.; Bellott, M.; Dunbrack, R.L.; Evanseck, J.D.; Field, M.J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; et al. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616. [Google Scholar] [PubMed]
  49. Vanommeslaeghe, K.; Hatcher, E.; Acharya, C.; Kundu, S.; Zhong, S.; Shim, J.; Darian, E.; Guvench, O.; Lopes, P.; Vorobyov, I.; et al. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 2010, 31, 671–690. [Google Scholar] [PubMed]
  50. Jorgensen, W.L.; Chandrasekhar, J.; Madura, J.D.; Impey, R.W.; Klein, M.L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926–935. [Google Scholar]
  51. Gilson, M.K.; Rashin, A.; Fine, R.; Honig, B. On the calculation of electrostatic interactions in proteins. J. Mol. Biol. 1985, 184, 503–516. [Google Scholar] [CrossRef] [PubMed]
  52. Grant, J.A.; Pickup, B.T.; Nicholls, A. A smooth permittivity function for Poisson–Boltzmann solvation methods. J. Comput. Chem. 2001, 22, 608–640. [Google Scholar] [CrossRef]
  53. Baker, N.A.; Sept, D.; Joseph, S.; Holst, M.J.; McCammon, J.A. Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc. Natl. Acad. Sci. USA 2001, 98, 10037–10041. [Google Scholar] [CrossRef] [PubMed]
  54. Rocchia, W.; Sridharan, S.; Nicholls, A.; Alexov, E.; Chiabrera, A.; Honig, B. Rapid grid-based construction of the molecular surface and the use of induced surface charge to calculate reaction field energies: Applications to the molecular systems and geometric objects. J. Comput. Chem. 2002, 23, 128–137. [Google Scholar] [CrossRef] [PubMed]
  55. Still, W.C.; Tempczyk, A.; Hawley, R.C.; Hendrickson, T. Semianalytical treatment of solvation for molecular mechanics and dynamics. J. Am. Chem. Soc. 1990, 112, 6127–6129. [Google Scholar] [CrossRef]
  56. Bashford, D.; Case, D.A. Generalized Born models of macromolecular solvation effects. Annu. Rev. Phys. Chem. 2000, 51, 129–152. [Google Scholar] [PubMed]
  57. Hawkins, G.D.; Cramer, C.J.; Truhlar, D.G. Pairwise solute descreening of solute charges from a dielectric medium. Chem. Phys. Lett. 1995, 246, 122–129. [Google Scholar]
  58. Grycuk, T. Deficiency of the Coulomb-field approximation in the generalized Born model: An improved formula for Born radii evaluation. J. Chem. Phys. 2003, 119, 4817–4826. [Google Scholar]
  59. Feig, M.; Onufriev, A.; Lee, M.S.; Im, W.; Case, D.A.; Brooks, C.L., 3rd. Performance comparison of generalized Born and Poisson methods in the calculation of electrostatic solvation energies for protein structures. J. Comput. Chem. 2004, 25, 265–284. [Google Scholar] [CrossRef] [PubMed]
  60. Liu, H.Y.; Zou, X. Electrostatics of ligand binding: Parametrization of the generalized Born model and comparison with the Poisson-Boltzmann approach. J. Phys. Chem. B 2006, 110, 9304–9313. [Google Scholar] [PubMed]
  61. Tjong, H.; Zhou, H.X. GBr(6): A parameterization-free, accurate, analytical generalized Born method. J. Phys. Chem. B 2007, 111, 3055–3061. [Google Scholar] [PubMed]
  62. Srinivasan, J.; Miller, J.; Kollman, P.A.; Case, D.A. Continuum solvent studies of the stability of RNA hairpin loops and helices. J. Biomol. Struct. Dyn. 1998, 16, 671–682. [Google Scholar] [PubMed]
  63. Zou, X.; Sun, Y.; Kuntz, I.D. Inclusion of Solvation in Ligand Binding Free Energy Calculations Using the Generalized-Born Model. J. Am. Chem. Soc. 1999, 121, 8033–8043. [Google Scholar] [CrossRef]
  64. Wang, J.; Morin, P.; Wang, W.; Kollman, P.A. Use of MM-PBSA in Reproducing the Binding Free Energies to HIV-1 RT of TIBO Derivatives and Predicting the Binding Mode to HIV-1 RT of Efavirenz by Docking and MM-PBSA. J. Am. Chem. Soc. 2001, 123, 5221–5230. [Google Scholar] [CrossRef] [PubMed]
  65. Zhou, R. Free energy landscape of protein folding in water: Explicit vs. implicit solvent. Proteins 2003, 53, 148–161. [Google Scholar]
  66. Liu, H.Y.; Kuntz, I.D.; Zou, X. Pairwise GB/SA Scoring Function for Structure-based Drug Design. J. Phys. Chem. B 2004, 108, 5453–5462. [Google Scholar]
  67. Liu, H.Y.; Grinter, S.Z.; Zou, X. Multiscale generalized Born modeling of ligand binding energies for virtual database screening. J. Phys. Chem. B 2009, 113, 11793–11799. [Google Scholar] [PubMed]
  68. Purisima, E.O.; Hogues, H. Protein-ligand binding free energies from exhaustive docking. J. Phys. Chem. B 2012, 116, 6872–6879. [Google Scholar] [PubMed]
  69. Kollman, P. Free energy calculations: Applications to chemical and biochemical phenomena. Chem. Rev. 1993, 93, 2395–2417. [Google Scholar] [CrossRef]
  70. Eldridge, M.D.; Murray, C.W.; Auton, T.R.; Paolini, G.V.; Mee, R.P. Empirical scoring functions: The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J. Comput. Aided Mol. Des. 1997, 11, 425–445. [Google Scholar] [CrossRef] [PubMed]
  71. Böhm, H.J. Prediction of binding constants of protein ligands: A fast method for the prioritization of hits obtained from de novo design or 3D database search programs. J. Comput. Aided Mol. Des. 1998, 12, 309–323. [Google Scholar] [CrossRef] [PubMed]
  72. Wang, R.; Lu, Y.; Wang, S. Comparative evaluation of 11 scoring functions for molecular docking. J. Med. Chem. 2003, 46, 2287–2303. [Google Scholar] [CrossRef] [PubMed]
  73. Temiz, N.A.; Trapp, A.; Prokopyev, O.A.; Camacho, C.J. Optimization of minimum set of protein-DNA interactions: A quasi exact solution with minimum over-fitting. Bioinformatics 2010, 26, 319–325. [Google Scholar] [PubMed]
  74. Wang, R.; Lai, L.; Wang, S. Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J. Comput. Aided Mol. Des. 2002, 16, 11–26. [Google Scholar] [CrossRef] [PubMed]
  75. Tanaka, S.; Scheraga, H.A. Model of protein folding: Incorporation of a one-dimensional short-range (Ising) model. Proc. Natl. Acad. Sci. USA 1977, 74, 1320–1323. [Google Scholar] [CrossRef] [PubMed]
  76. Miyazawa, S.; Jernigan, R.L. Estimation of effective interresidue contact energies from protein crystal structures: Quasi-chemical approximation. Macromolecules 1985, 18, 534–552. [Google Scholar]
  77. Thomas, P.D.; Dill, K.A. Statistical potentials extracted from protein structures: How accurate are they? J. Mol. Biol. 1996, 257, 457–469. [Google Scholar] [CrossRef]
  78. Thomas, P.D.; Dill, K.A. An iterative method for extracting energy-like quantities from protein structures. Proc. Natl. Acad. Sci. USA 1996, 93, 11628–11633. [Google Scholar] [CrossRef] [PubMed]
  79. Huang, S.Y.; Zou, X. Chapter 14—Mean-Force Scoring Functions for Protein–Ligand Binding. In Annual Reports in Computational Chemistry; Wheeler, R.A., Ed.; Elsevier: Amsterdam, The Netherlands, 2010; Volume 6, pp. 280–296. [Google Scholar]
  80. Muegge, I.; Martin, Y.C.; Hajduk, P.J.; Fesik, S.W. Evaluation of PMF scoring in docking weak ligands to the FK506 binding protein. J. Med. Chem. 1999, 42, 2498–2503. [Google Scholar] [CrossRef] [PubMed]
  81. Sippl, M.J.; Ortner, M.; Jaritz, M.; Lackner, P.; Flöckner, H. Helmholtz free energies of atom pair interactions in proteins. Fold Des. 1996, 1, 289–298. [Google Scholar] [CrossRef] [PubMed]
  82. Li, X.; Liang, J. Knowledge-based energy functions for computational studies of proteins. In Computational Methods for Protein Structure Prediction and Modeling; Springer: New York, NY, USA, 2007; pp. 71–123. [Google Scholar]
  83. Munson, P.J.; Singh, R.K. Statistical significance of hierarchical multi-body potentials based on Delaunay tessellation and their application in sequence-structure alignment. Protein Sci. 1997, 6, 1467–1481. [Google Scholar] [CrossRef] [PubMed]
  84. Zimmermann, M.T.; Leelananda, S.P.; Kloczkowski, A.; Jernigan, R.L. Combining statistical potentials with dynamics-based entropies improves selection from protein decoys and docking poses. J. Phys. Chem. B 2012, 116, 6725–6731. [Google Scholar] [PubMed]
  85. Jernigan, R.L.; Bahar, I. Structure-derived potentials and protein simulations. Curr. Opin. Struct. Biol. 1996, 6, 195–209. [Google Scholar] [CrossRef] [PubMed]
  86. Zhang, L.; Skolnick, J. How do potentials derived from structural databases relate to “true” potentials? Protein Sci. 1998, 7, 112–122. [Google Scholar] [CrossRef] [PubMed]
  87. Muegge, I.; Martin, Y.C. A general and fast scoring function for protein-ligand interactions: A simplified potential approach. J. Med. Chem. 1999, 42, 791–804. [Google Scholar] [PubMed]
  88. Zhou, H.; Zhou, Y. Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci. 2002, 11, 2714–2726. [Google Scholar] [PubMed]
  89. Kozakov, D.; Brenke, R.; Comeau, S.R.; Vajda, S. PIPER: An FFT-based protein docking program with pairwise potentials. Proteins 2006, 65, 392–406. [Google Scholar] [CrossRef] [PubMed]
  90. Huang, S.Y.; Zou, X. An iterative knowledge-based scoring function for protein-protein recognition. Proteins 2008, 72, 557–579. [Google Scholar] [CrossRef] [PubMed]
  91. Ravikant, D.V.S.; Elber, R. Energy design for protein-protein interactions. J. Chem. Phys. 2011, 135, 065102. [Google Scholar] [CrossRef] [PubMed]
  92. Huang, S.Y.; Zou, X. A knowledge-based scoring function for protein-RNA interactions derived from a statistical mechanics-based iterative method. Nucleic Acids Res. 2014. [CrossRef]
  93. Sippl, M.J. Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins. J. Mol. Biol. 1990, 213, 859–883. [Google Scholar]
  94. Grinter, S.Z.; Zou, X. A Bayesian statistical approach of improving knowledge-based scoring functions for protein-ligand interactions. J. Comput. Chem. 2014, 35, 932–943. [Google Scholar] [CrossRef] [PubMed]
  95. Zhang, C.; Liu, S.; Zhu, Q.; Zhou, Y. A knowledge-based energy function for protein-ligand, protein-protein, and protein-DNA complexes. J. Med. Chem. 2005, 48, 2325–2335. [Google Scholar] [CrossRef] [PubMed]
  96. Gohlke, H.; Hendlich, M.; Klebe, G. Knowledge-based scoring function to predict protein-ligand interactions. J. Mol. Biol. 2000, 295, 337–356. [Google Scholar] [CrossRef] [PubMed]
  97. Velec, H.F.G.; Gohlke, H.; Klebe, G. DrugScore(CSD)-knowledge-based scoring function derived from small molecule crystal data with superior recognition rate of near-native ligand poses and better affinity prediction. J. Med. Chem. 2005, 48, 6296–6303. [Google Scholar] [CrossRef] [PubMed]
  98. DeWitte, R.; Shakhnovich, E. SMoG: De Novo Design Method Based on Simple, Fast, and Accurate Free Energy Estimates. 1. Methodology and Supporting Evidence. J. Am. Chem. Soc. 1996, 118, 11733–11744. [Google Scholar] [CrossRef]
  99. Huang, S.Y.; Zou, X. Advances and Challenges in Protein-Ligand Docking. Int. J. Mol. Sci. 2010, 11, 3016–3034. [Google Scholar] [CrossRef] [PubMed]
  100. Charifson, P.S.; Corkery, J.J.; Murcko, M.A.; Walters, W.P. Consensus scoring: A method for obtaining improved hit rates from docking databases of three-dimensional structures into proteins. J. Med. Chem. 1999, 42, 5100–5109. [Google Scholar] [CrossRef] [PubMed]
  101. Terp, G.E.; Johansen, B.N.; Christensen, I.T.; Jørgensen, F.S. A new concept for multidimensional selection of ligand conformations (MultiSelect) and multidimensional scoring (MultiScore) of protein-ligand binding affinities. J. Med. Chem. 2001, 44, 2333–2343. [Google Scholar] [PubMed]
  102. Plewczynski, D.; Łazniewski, M.; von Grotthuss, M.; Rychlewski, L.; Ginalski, K. VoteDock: Consensus docking method for prediction of protein-ligand interactions. J. Comput. Chem. 2011, 32, 568–581. [Google Scholar]
  103. Erickson, J.A.; Jalaie, M.; Robertson, D.H.; Lewis, R.A.; Vieth, M. Lessons in molecular recognition: The effects of ligand and protein flexibility on molecular docking accuracy. J. Med. Chem. 2004, 47, 45–55. [Google Scholar] [CrossRef] [PubMed]
  104. Shoichet, B.K.; Kuntz, I.D.; Bodian, D.L. Molecular docking using shape descriptors. J. Comput. Chem. 1992, 13, 380–397. [Google Scholar]
  105. Hawkins, P.C.D.; Skillman, A.G.; Warren, G.L.; Ellingson, B.A.; Stahl, M.T. Conformer generation with OMEGA: Algorithm and validation using high quality structures from the Protein Databank and Cambridge Structural Database. J. Chem. Inf. Model. 2010, 50, 572–584. [Google Scholar] [CrossRef] [PubMed]
  106. Hawkins, P.C.D.; Nicholls, A. Conformer generation with OMEGA: Learning from the data set and the analysis of failures. J. Chem. Inf. Model. 2012, 52, 2919–2936. [Google Scholar] [CrossRef] [PubMed]
  107. Leach, A.R.; Kuntz, I.D. Conformational analysis of flexible ligands in macromolecular receptor sites. J. Comput. Chem. 1992, 13, 730–748. [Google Scholar] [CrossRef]
  108. Lorber, D.M.; Shoichet, B.K. Hierarchical docking of databases of multiple ligand conformations. Curr. Top. Med. Chem. 2005, 5, 739–749. [Google Scholar] [PubMed]
  109. Damm, K.L.; Carlson, H.A. Exploring experimental sources of multiple protein conformations in structure-based drug design. J. Am. Chem. Soc. 2007, 129, 8225–8235. [Google Scholar] [CrossRef] [PubMed]
  110. Bottegoni, G.; Kufareva, I.; Totrov, M.; Abagyan, R. Four-dimensional docking: A fast and accurate account of discrete receptor flexibility in ligand docking. J. Med. Chem. 2009, 52, 397–406. [Google Scholar] [CrossRef] [PubMed]
  111. Apostolakis, J.; Plückthun, A.; Caflisch, A. Docking small ligands in flexible binding sites. J. Comput. Chem. 1998, 19, 21–37. [Google Scholar] [CrossRef]
  112. Morris, G.M.; Huey, R.; Lindstrom, W.; Sanner, M.F.; Belew, R.K.; Goodsell, D.S.; Olson, A.J. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J. Comput. Chem. 2009, 30, 2785–2791. [Google Scholar] [CrossRef] [PubMed]
  113. Limongelli, V.; Marinelli, L.; Cosconati, S.; La Motta, C.; Sartini, S.; Mugnaini, L.; Da Settimo, F.; Novellino, E.; Parrinello, M. Sampling protein motion and solvent effect during ligand binding. Proc. Natl. Acad. Sci. USA 2012, 109, 1467–1472. [Google Scholar] [PubMed]
  114. Rarey, M.; Kramer, B.; Lengauer, T. The particle concept: Placing discrete water molecules during protein-ligand docking predictions. Proteins 1999, 34, 17–28. [Google Scholar] [CrossRef] [PubMed]
  115. Sahai, M.A.; Biggin, P.C. Quantifying water-mediated protein-ligand interactions in a glutamate receptor: A DFT study. J. Phys. Chem. B 2011, 115, 7085–7096. [Google Scholar] [PubMed]
  116. Lie, M.A.; Thomsen, R.; Pedersen, C.N.S.; Schiøtt, B.; Christensen, M.H. Molecular docking with ligand attached water molecules. J. Chem. Inf. Model. 2011, 51, 909–917. [Google Scholar] [CrossRef] [PubMed]
  117. Liu, J.; He, X.; Zhang, J.Z.H. Improving the scoring of protein-ligand binding affinity by including the effects of structural water and electronic polarization. J. Chem. Inf. Model. 2013, 53, 1306–1314. [Google Scholar] [CrossRef] [PubMed]
  118. Wang, C.; Bradley, P.; Baker, D. Protein-protein docking with backbone flexibility. J. Mol. Biol. 2007, 373, 503–519. [Google Scholar] [CrossRef] [PubMed]
  119. Lemmon, G.; Meiler, J. Rosetta Ligand docking with flexible XML protocols. Methods Mol. Biol. 2012, 819, 143–155. [Google Scholar] [PubMed]
  120. Huggins, D.J.; Tidor, B. Systematic placement of structural water molecules for improved scoring of protein-ligand interactions. Protein Eng. Des. Sel. 2011, 24, 777–789. [Google Scholar] [CrossRef] [PubMed]
  121. Reddy, A.S.; Zhang, S. Polypharmacology: Drug discovery for the future. Expert Rev. Clin. Pharmacol. 2013, 6, 41–47. [Google Scholar] [CrossRef] [PubMed]
  122. Taboureau, O.; Jørgensen, F.S. In silico predictions of hERG channel blockers in drug discovery: From ligand-based and target-based approaches to systems chemical biology. Comb. Chem. High Throughput Screen. 2011, 14, 375–387. [Google Scholar] [CrossRef] [PubMed]
  123. Gowthaman, R.; Deeds, E.J.; Karanicolas, J. Structural properties of non-traditional drug targets present new challenges for virtual screening. J. Chem. Inf. Model. 2013, 53, 2073–2081. [Google Scholar] [CrossRef] [PubMed]
  124. Pérez-Nueno, V.I.; Ritchie, D.W. Using consensus-shape clustering to identify promiscuous ligands and protein targets and to choose the right query for shape-based virtual screening. J. Chem. Inf. Model. 2011, 51, 1233–1248. [Google Scholar] [CrossRef] [PubMed]
  125. Peng, S.; Lin, X.; Guo, Z.; Huang, N. Identifying multiple-target ligands via computational chemogenomics approaches. Curr. Top. Med. Chem. 2012, 12, 1363–1375. [Google Scholar] [PubMed]
  126. Shrinivasan, M.; Skariyachan, S.; Aparna, V.; Kolte, V.R. Homology modelling of CB1 receptor and selection of potential inhibitor against Obesity. Bioinformation 2012, 8, 523–528. [Google Scholar] [CrossRef] [PubMed]
  127. Skariyachan, S.; Mahajanakatti, A.B.; Sharma, N.; Karanth, S.; Rao, S.; Rajeswari, N. Structure based virtual screening of novel inhibitors against multidrug resistant superbugs. Bioinformation 2012, 8, 420–425. [Google Scholar] [CrossRef] [PubMed]
  128. Skariyachan, S.; Prakash, N.; Bharadwaj, N. In silico exploration of novel phytoligands against probable drug target of Clostridium tetani. Interdiscip. Sci. 2012, 4, 273–281. [Google Scholar] [PubMed]
  129. Kar, R.K.; Ansari, M.Y.; Suryadevara, P.; Sahoo, B.R.; Sahoo, G.C.; Dikhit, M.R.; Das, P. Computational elucidation of structural basis for ligand binding with Leishmania donovani adenosine kinase. Biomed. Res. Int. 2013, 2013, 609289:1–609289:14. [Google Scholar]
  130. Tahir, R.A.; Sehgal, S.A.; Khattak, N.A.; Khan Khattak, J.Z.; Mir, A. Tumor necrosis factor receptor superfamily 10B (TNFRSF10B): An insight from structure modeling to virtual screening for designing drug against head and neck cancer. Theor. Biol. Med. Model. 2013, 10, 38. [Google Scholar] [CrossRef] [PubMed]
  131. Skariyachan, S.; Jayaprakash, N.; Bharadwaj, N.; Narayanappa, R. Exploring insights for virulent gene inhibition of multidrug resistant Salmonella typhi, Vibrio cholerae, and Staphylococcus areus by potential phytoligands via in silico screening. J. Biomol. Struct. Dyn. 2014, 32, 1379–1395. [Google Scholar] [CrossRef] [PubMed]
  132. Merlino, A.; Vieites, M.; Gambino, D.; Laura Coitiño, E. Homology modeling of T. cruzi and L. major NADH-dependent fumarate reductases: Ligand docking, molecular dynamics validation, and insights on their binding modes. J. Mol. Graph. Model. 2014, 48, 47–59. [Google Scholar]
  133. Orry, A.J.W.; Abagyan, R. Preparation and refinement of model protein-ligand complexes. Methods Mol. Biol. 2012, 857, 351–373. [Google Scholar] [PubMed]
  134. Combs, S.A.; Deluca, S.L.; Deluca, S.H.; Lemmon, G.H.; Nannemann, D.P.; Nguyen, E.D.; Willis, J.R.; Sheehan, J.H.; Meiler, J. Small-molecule ligand docking into comparative models with Rosetta. Nat. Protoc. 2013, 8, 1277–1298. [Google Scholar] [CrossRef] [PubMed]
  135. Kaufmann, K.W.; Meiler, J. Using RosettaLigand for small molecule docking into comparative models. PLoS One 2012, 7, e50769. [Google Scholar] [PubMed]
  136. Mahasenan, K.V.; Li, C. Novel inhibitor discovery through virtual screening against multiple protein conformations generated via ligand-directed modeling: A maternal embryonic leucine zipper kinase example. J. Chem. Inf. Model. 2012, 52, 1345–1355. [Google Scholar] [CrossRef] [PubMed]
  137. Sali, A.; Blundell, T.L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 1993, 234, 779–815. [Google Scholar] [CrossRef] [PubMed]
  138. Kiss, R.; Sandor, M.; Szalai, F.A. http://Mcule.com: A public web service for drug discovery. J. Cheminform. 2012, 4, P17. [Google Scholar] [CrossRef]
  139. Zhang, X.; Wang, H.; Li, Y.; Cao, R.; Zhong, W.; Zheng, Z.; Wang, G.; Xiao, J.; Li, S. Novel substituted heteroaromatic piperazine and piperidine derivatives as inhibitors of human enterovirus 71 and coxsackievirus a16. Molecules 2013, 18, 5059–5071. [Google Scholar] [CrossRef] [PubMed]
  140. Wilson, G.L.; Lill, M.A. Integrating structure-based and ligand-based approaches for computational drug design. Future Med. Chem. 2011, 3, 735–750. [Google Scholar] [PubMed]
  141. Ahmed, L.; Rasulev, B.; Turabekova, M.; Leszczynska, D.; Leszczynski, J. Receptor- and ligand-based study of fullerene analogues: Comprehensive computational approach including quantum-chemical, QSAR and molecular docking simulations. Org. Biomol. Chem. 2013, 11, 5798–5808. [Google Scholar] [CrossRef] [PubMed]
  142. Ballante, F.; Caroli, A.; Wickersham, R.B., 3rd; Ragno, R. Hsp90 Inhibitors, Part 1: Definition of 3-D QSAutogrid/R Models as a Tool for Virtual Screening. J. Chem. Inf. Model. 2014, 54, 956–969. [Google Scholar]
  143. Caroli, A.; Ballante, F.; Wickersham, R.B., 3rd; Corelli, F.; Ragno, R. Hsp90 Inhibitors, Part 2: Combining Ligand-Based and Structure-Based Approaches for Virtual Screening Application. J. Chem. Inf. Model. 2014, 54, 970–977. [Google Scholar]
  144. Alcaro, S.; Musetti, C.; Distinto, S.; Casatti, M.; Zagotto, G.; Artese, A.; Parrotta, L.; Moraca, F.; Costa, G.; Ortuso, F.; et al. Identification and characterization of new DNA G-quadruplex binders selected by a combination of ligand and structure-based virtual screening approaches. J. Med. Chem. 2013, 56, 843–855. [Google Scholar] [CrossRef] [PubMed]
  145. Grinter, S.Z.; Liang, Y.; Huang, S.Y.; Hyder, S.M.; Zou, X. An inverse docking approach for identifying new potential anti-cancer targets. J. Mol. Graph. Model. 2011, 29, 795–799. [Google Scholar] [PubMed]
  146. Chen, Y.Z.; Zhi, D.G. Ligand-protein inverse docking and its potential use in the computer search of protein targets of a small molecule. Proteins 2001, 43, 217–226. [Google Scholar] [CrossRef] [PubMed]
  147. Paul, N.; Kellenberger, E.; Bret, G.; Müller, P.; Rognan, D. Recovering the true targets of specific ligands by virtual screening of the protein data bank. Proteins 2004, 54, 671–680. [Google Scholar] [CrossRef] [PubMed]
  148. Gao, Z.; Li, H.; Zhang, H.; Liu, X.; Kang, L.; Luo, X.; Zhu, W.; Chen, K.; Wang, X.; Jiang, H. PDTD: A web-accessible protein database for drug target identification. BMC Bioinform. 2008, 9, 104. [Google Scholar] [CrossRef]
  149. Kumar, S.P.; Pandya, H.A.; Desai, V.H.; Jasrai, Y.T. Compound prioritization from inverse docking experiment using receptor-centric and ligand-centric methods: A case study on Plasmodium falciparum Fab enzymes. J. Mol. Recognit. 2014, 27, 215–229. [Google Scholar] [CrossRef] [PubMed]
  150. Ogungbe, I.V.; Setzer, W.N. In-silico Leishmania target selectivity of antiparasitic terpenoids. Molecules 2013, 18, 7761–7847. [Google Scholar] [CrossRef] [PubMed]
  151. Wang, R.; Fang, X.; Lu, Y.; Wang, S. The PDBbind database: Collection of binding affinities for protein-ligand complexes with known three-dimensional structures. J. Med. Chem. 2004, 47, 2977–2980. [Google Scholar] [CrossRef] [PubMed]
  152. Huang, S.Y.; Zou, X. Construction and test of ligand decoy sets using MDock: Community structure-activity resource benchmarks for binding mode prediction. J. Chem. Inf. Model. 2011, 51, 2107–2114. [Google Scholar] [CrossRef] [PubMed]
  153. Dunbar, J.B.; Smith, R.D.; Yang, C.Y.; Ung, P.M.U.; Lexa, K.W.; Khazanov, N.A.; Stuckey, J.A.; Wang, S.; Carlson, H.A. CSAR Benchmark Exercise of 2010: Selection of the Protein-Ligand Complexes. J. Chem. Inf. Model. 2011, 51, 2036–2046. [Google Scholar] [CrossRef] [PubMed]
  154. Kumar, A.; Zhang, K.Y.J. Computational fragment-based screening using RosettaLigand: The SAMPL3 challenge. J. Comput. Aided Mol. Des. 2012, 26, 603–616. [Google Scholar] [CrossRef] [PubMed]
  155. Dunbar, J.B., Jr.; Smith, R.D.; Damm-Ganamet, K.L.; Ahmed, A.; Esposito, E.X.; Delproposto, J.; Chinnaswamy, K.; Kang, Y.N.; Kubish, G.; Gestwicki, J.E.; et al. CSAR data set release 2012: Ligands, affinities, complexes, and docking decoys. J. Chem. Inf. Model. 2013, 53, 1842–1852. [Google Scholar] [CrossRef] [PubMed]
  156. Skillman, A.G.; Geballe, M.T.; Nicholls, A. SAMPL2 challenge: Prediction of solvation energies and tautomer ratios. J. Comput. Aided Mol. Des. 2010, 24, 257–258. [Google Scholar] [CrossRef] [PubMed]
  157. Grinter, S.Z.; Yan, C.; Huang, S.Y.; Jiang, L.; Zou, X. Automated large-scale file preparation, docking, and scoring: Evaluation of ITScore and STScore using the 2012 Community Structure-Activity Resource benchmark. J. Chem. Inf. Model. 2013, 53, 1905–1914. [Google Scholar] [CrossRef] [PubMed]
  158. Bolia, A.; Gerek, Z.N.; Ozkan, S.B. BP-Dock: A Flexible Docking Scheme for Exploring Protein-Ligand Interactions Based on Unbound Structures. J. Chem. Inf. Model. 2014, 54, 913–925. [Google Scholar] [CrossRef] [PubMed]
  159. Korb, O.; Ten Brink, T.; Victor Paul Raj, F.R.D.; Keil, M.; Exner, T.E. Are predefined decoy sets of ligand poses able to quantify scoring function accuracy? J. Comput. Aided Mol. Des. 2012, 26, 185–197. [Google Scholar] [CrossRef]
  160. Vajda, S.; Hall, D.R.; Kozakov, D. Sampling and scoring: A marriage made in heaven. Proteins 2013, 81, 1874–1884. [Google Scholar] [CrossRef] [PubMed]
  161. Allen, W.J.; Rizzo, R.C. Implementation of the Hungarian algorithm to account for ligand symmetry and similarity in structure-based design. J. Chem. Inf. Model. 2014, 54, 518–529. [Google Scholar] [CrossRef] [PubMed]
  162. Head, M.S.; Given, J.A.; Gilson, M.K. “Mining Minima”: Direct Computation of Conformational Free Energy. J. Phys. Chem. A 1997, 101, 1609–1618. [Google Scholar]
  163. Ruvinsky, A.M. Role of binding entropy in the refinement of protein-ligand docking predictions: Analysis based on the use of 11 scoring functions. J. Comput. Chem. 2007, 28, 1364–1372. [Google Scholar] [CrossRef] [PubMed]

Share and Cite

MDPI and ACS Style

Grinter, S.Z.; Zou, X. Challenges, Applications, and Recent Advances of Protein-Ligand Docking in Structure-Based Drug Design. Molecules 2014, 19, 10150-10176. https://doi.org/10.3390/molecules190710150