1. Introduction
Inflammatory bowel disease (IBD) is a group of chronic, nonspecific inflammatory diseases of the intestine, including ulcerative colitis (UC), whose etiology has not yet been elucidated. The main symptoms of UC are recurrent diarrhea, stools containing mucus, pus, and blood, and abdominal pain. The course of the disease is chronic, and current treatment aims to maintain symptom relief, promote mucosal healing (of colonic lesions in the mucosa and submucosa), prevent complications, and improve the quality of life of patients [
1]. Conventional treatments, including long-term use of anti-inflammatory and immunosuppressive agents, may increase the risk of serious complications such as opportunistic infections, malignancy, autoimmunity, and hepatotoxicity [
1]. Consequently, the development of new therapeutic agents and strategies holds significant potential and social value.
Discoidin domain receptor 1 (DDR1) belongs to the DDR family, a unique group of receptor tyrosine kinases (RTKs) that are thought to play an important role in inflammatory bowel disease [
2]. DDR1 is a promising therapeutic target because it is involved in the regulation of various cellular functions such as cell proliferation, differentiation, invasion, migration, and matrix remodeling, and is closely related to the occurrence and progression of many human diseases such as cancer, fibrosis, and inflammation [
3]. The expression of DDR1 in intestinal epithelial cells is closely related to its function in UC, likely because DDR1 can influence epithelial cell apoptosis by regulating the expression of tight junction proteins and the integrity of the intestinal mucosal barrier [
2]. DDR1 disrupts the intestinal barrier through the NF-κB p65-MLCK-p-MLC2 signaling pathway [
4]. DDR1 deficiency attenuates intestinal mucosal barrier damage induced by dextran sulfate sodium (DSS)-induced colitis and reduces pro-inflammatory cytokine production [
4]. The involvement of DDR1 in the pathogenesis of colitis by mediating intestinal mucosal barrier damage in UC has been demonstrated [
4], making DDR1 a novel target for the treatment of intestinal inflammation.
Early DDR1 inhibitors mainly targeted the ATP-binding pocket of DDR1 kinase, but the structural similarity between the ATP-binding pockets of DDR1 kinase and other protein kinases often led to “off-target” toxicity problems [
5]. Research on DDR1 inhibitors for colitis treatment has focused on their anti-inflammatory effects and their role in protecting the intestinal mucosal barrier; numerous molecules have demonstrated therapeutic efficacy in preclinical studies [
4]. For instance, a team led by Mingyue Zheng at SIOPA, Chinese Academy of Sciences, designed a novel DDR1 inhibitor that exhibited favorable oral therapeutic effects in DSS-induced colitis models in mice and effectively reduced inflammation [
4]. In recent years, researchers have continued to innovate in DDR1 inhibitor design: some have pursued dual-target or multi-target inhibition [
3], and others have developed DDR1 inhibitors that do not target the ATP-binding pocket [
5]. However, no selective small molecule inhibitors have entered clinical trials thus far due to challenges related to drug selectivity, delivery methods, and the development of drug resistance [
3].
Computer-aided drug design (CADD) is a crucial tool for drug development [
6]. Compared to traditional methods, CADD offers the advantages of reduced costs and accelerated processes [
6]. In recent years, a number of novel, highly selective DDR1 inhibitors have been identified through approaches such as DNA-encoded library screening, structure-guided optimization, and machine learning-based drug design platforms [
5]. Pharmacophore-based virtual screening is an efficient drug discovery method that swiftly identifies potential drug candidates from large-scale compound libraries by utilizing pharmacophore features to predict the binding affinity of compounds to specific targets [
7]. Structure-based docking is a method that can focus experimental drug screening on the most promising subset of candidate compounds [
8]. ADMET refers to the absorption, distribution, metabolism, excretion, and toxicity of a drug, serving as a vital indicator for evaluating drug success and durability. Suboptimal ADMET properties are a leading cause of drug development failure. Molecular dynamics simulations enhance our understanding of drug-target interactions in complex systems, thus guiding the drug discovery and design processes [
9].
Marine small molecules are a vital foundation for drug research, characterized by novel chemical structures, high biological activity, and promising success rates in drug screening. They hold potential for treating major diseases, enable targeted therapies, promote sustainable resource use, and benefit from policy support and increased investment, fostering interdisciplinary research [
10]. Natural products derived from marine organisms often exhibit high bioactivity, an essential advantage for drug development, which accounts for the superior success rate of marine drug development compared to traditional methods [
11]. A large number of marine-derived compounds with drug potential have been discovered, such as PLK1-PBD inhibitors [
12], USP7 inhibitors [
13], SLC7A11 inhibitors [
14], CDK4/6 inhibitors [
15], and AXL/HDAC2 inhibitors [
16].
In this study, to screen for novel and active DDR1 inhibitors, we combined fragment replacement with classical drug discovery methods, including multi-ligand common-feature pharmacophore modeling, structure-based virtual screening (SBVS), molecular docking, ADMET prediction, and molecular dynamics simulation. Thanks to the rapid development of organic synthesis methodology and computer science in recent years, fragment replacement strategies have become more efficient and reliable. We first established and validated pharmacophore models based on the common features of multiple known DDR1 inhibitors, then used the best pharmacophore model and SBVS to screen 17 candidate small molecules from 52,119 marine small molecules, and finally optimized the structures and pharmacological properties of these molecules by scaffold hopping. Molecules 39713a, 34346a, and 34419a stood out in terms of predicted binding and pharmacological properties and performed well in molecular dynamics simulations. The idea and strategy adopted in this study are shown in
Figure 1.
2. Results
2.1. Establishment and Validation of Pharmacophore Models Based on Common Features of Multiple Ligands
The pharmacophore hypothesis, which is based on the common features of multiple ligands, involves superimposing a set of ligands and extracting the essential characteristics necessary for their biological activity [
17]. This approach provides a valuable tool for computer-aided drug design and aids in the screening of novel small-molecule inhibitors. We generated 20 distinct pharmacophore models from the collected active ligands. Analysis of these models showed that DDR1 inhibitors typically contain a hydrogen-bond acceptor (A), a hydrogen-bond donor (D), and an aromatic ring (R). Small molecules exhibiting these three features are more likely to be prioritized in the screening process.
Among the evaluation metrics for the 20 pharmacophores, relying solely on the Phase Hypo Score function did not guarantee evaluation accuracy. Therefore, we combined several metrics to conduct a comprehensive assessment of the pharmacophore models based on the common features of multiple ligands, ensuring that the selected models effectively distinguish between active and inactive molecules.
The enrichment factor at 1% (EF1%) was calculated from a test set of known active molecules seeded into a decoy set: it is the proportion of actives recovered in the top-ranked 1% of the screened set relative to the proportion of actives in the whole set. A higher EF1% value indicates superior pharmacophore quality. We also employed the Boltzmann-enhanced discrimination of receiver operating characteristic (BEDROC) metric as a complementary indicator of early enrichment. The BEDROC value ranges between 0 and 1; a value of 1 indicates ideal pharmacophore screening performance.
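The enrichment factor can be illustrated with a minimal sketch. It assumes a ranked list of labels (1 for a known active, 0 for a decoy, best-scoring molecule first); the function name and the rounding of the top fraction are illustrative assumptions, not the implementation used by the screening software.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked screen.

    ranked_labels: list of 1 (active) / 0 (decoy), best-scored first.
    EF = (actives in top subset / size of top subset)
         / (actives overall / size of whole set)
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one molecule and one active")
    # Size of the top-ranked subset (assumption: round, but keep at least 1).
    n_top = max(1, round(fraction * n_total))
    hits_top = sum(ranked_labels[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)
```

With this definition, the largest attainable EF1% is bounded by the ratio of the whole set to the number of actives (and by 100), so values should be compared only between models evaluated on the same test and decoy sets.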
The receiver operating characteristic (ROC) curve visualizes how well a model separates active molecules from decoys. The horizontal axis represents the false positive rate (FPR), and the vertical axis represents the true positive rate (TPR); each point on the curve corresponds to the sensitivity and specificity obtained at one score threshold. The more the curve bows towards the upper left corner, that is, the higher the TPR reached at a low FPR, the better the predictive performance. An ROC score (area under the ROC curve) of 1 indicates that the pharmacophore model ranks every active molecule above every decoy, so that a true positive rate of 100% can be reached at a false positive rate of 0%. Conversely, an ROC score of about 0.5 corresponds to random ranking and indicates a lack of discriminatory ability.
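As a hedged illustration of the ROC score, the area under the ROC curve equals the probability that a randomly chosen active is ranked above a randomly chosen decoy. The sketch below computes it by direct pairwise comparison; it assumes that a higher score means a better fit (docking scores, where more negative is better, would need to be negated first), and the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: fitness values, higher = more likely active (assumption).
    labels: 1 for active, 0 for decoy.
    Ties between an active and a decoy count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))
```

The pairwise form is quadratic in the number of molecules, which is acceptable for a small validation set; a rank-based formulation would be preferable for large decoy sets.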
The area under the accumulation curve (AUAC) summarizes how early the active molecules are retrieved across the whole ranked list; a value closer to 1.0 indicates better ranking, while a value of about 0.5 corresponds to random ranking and therefore no discriminatory value. The AUAC reflects only the model’s overall performance and is independent of any cut-off value.
As shown in
Table 1, we comprehensively evaluated the indicators of the 20 pharmacophores and selected the best model, ADHRR_3, for screening effective DDR1 inhibitors from the marine compound library. This model comprises five pharmacophore features: two aromatic rings, a hydrogen bond acceptor, a hydrogen bond donor, and a hydrophobic group. The enrichment factor for this model was 28.60, with a BEDROC value of 0.48, ROC of 0.95, and AUAC of 0.94 (
Table 1,
Figure 2C). These results indicate that ADHRR_3 outperformed the other 19 pharmacophore models in the comprehensive evaluation, effectively screening active small molecules from extensive small molecule data. To illustrate the discriminative ability of the pharmacophore models for active and inactive molecules, we visualized their binding pattern maps. As shown in
Figure 2A,B, the small molecule 71624791 fits ADHRR_3 well, with a significant portion of the molecule covered by pharmacophore features. In contrast, the small molecule 89884371 fits poorly, with almost no segments overlapping with the pharmacophore features. This contrast suggests that the pharmacophore model possesses good discriminatory ability and supports our choice of ADHRR_3 for screening DDR1 inhibitors in marine compound libraries.
2.2. Virtual Screening Based on Pharmacophore Models Based on Common Features of Multiple Ligands
Pharmacophores refer to the physicochemical features, and their spatial arrangement, that are essential for the molecular recognition of a ligand by a biomolecule (receptor). These “pharmacophore elements” represent the sites involved in ligand–receptor interactions, allowing for differentiation between active and inactive small molecules. We used the ADHRR_3 pharmacophore to conduct pharmacophore-based virtual screening of 52,119 marine natural small molecules; a molecule had to match at least four of the five pharmacophore features to pass the screen. Ultimately, a total of 7797 small molecules were retained for the next phase of virtual screening.
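The "at least four of five features" criterion can be sketched as a simple counting rule. This is a deliberately simplified illustration: it assumes the per-molecule feature matches have already been determined by the pharmacophore software, it ignores the geometric tolerances that a real pharmacophore match requires, and the names `HYPOTHESIS`, `matched_sites`, and `passes_screen` are hypothetical.

```python
from collections import Counter

# ADHRR hypothesis: one acceptor (A), one donor (D), one hydrophobic
# group (H), and two aromatic rings (R), as described in the text.
HYPOTHESIS = Counter({"A": 1, "D": 1, "H": 1, "R": 2})

def matched_sites(found):
    """Count hypothesis sites matched by a molecule.

    found: mapping feature letter -> number of matched sites (assumed to
    come from an upstream geometric alignment, not computed here).
    """
    return sum(min(required, found.get(feature, 0))
               for feature, required in HYPOTHESIS.items())

def passes_screen(found, min_sites=4):
    """Apply the 'match at least min_sites of the five sites' rule."""
    return matched_sites(found) >= min_sites
```

Allowing partial matches (four of five sites) trades some specificity for recall, which is why the retained set is refined by docking in the following steps.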
2.3. SBVS
Extra-precision (XP) docking demands close shape complementarity and energy matching between ligand and receptor, which entails significant computational resources and time, so it is impractical to apply XP to all 7797 small molecules identified by the pharmacophore model. Therefore, we first employed high-throughput virtual screening (HTVS) to rank the affinity of these 7797 small molecules based on several parameters, including docking score, glide score, ligand efficiency, binding energy, conformational fit at the binding position, hydrophobic interactions, hydrogen bonding, and van der Waals forces. Ultimately, the top 100 small molecules were selected for the subsequent phase of the study.
2.4. Molecular Docking Before Fragment Replacement
Molecular docking technology, a vital method in computer-aided drug design, is fundamentally a process of mutual recognition between two or more molecules involving spatial and energetic matching. The ligand interacts with the receptor in a manner akin to a lock and key; however, it is essential to recognize that both the receptor and ligand are flexible during molecular docking. This flexibility means that the conformation of the target protein can change throughout the binding process. Furthermore, successful docking requires not only spatial shape matching but also energy compatibility, as indicated by the change in binding free energy (ΔGbind), which determines the feasibility of their interaction [
18].
Although the HTVS method allowed us to screen a large number of marine small molecules rapidly, this high-speed screening sacrifices some precision. To reduce false-positive results, we used the extra-precision (XP) docking method, which more accurately predicts the binding modes of small molecules with the target DDR1. To verify the reliability of the docking protocol, we first re-docked the confirmed active small molecule DI1 into the DDR1 protein. The docking score was −12.607 kcal/mol, and the binding mode is shown in
Figure 3. The target protein and DI1 form hydrogen bond and cation–π interactions, among others, consistent with previous studies, which supports the reliability of the docking procedure for subsequent screening and validation. To ensure accurate docking at the active site of the target, we restricted the binding pocket based on the literature, identifying Glu672, Asp702, Met704, and Asp784 as key active-site residues. The DDR1 inhibitor VU6015929, which has demonstrated activity, was used as a positive control and was docked alongside the 100 small molecules selected by HTVS. For XP docking, we used the ‘Ligand Docking’ module in Maestro 11.8, allowing each small molecule to explore up to 30 conformations to identify optimal binding poses, and ranked the small molecules by docking score. We set the docking score of the positive control VU6015929, −12.330 kcal/mol, as the cut-off and identified 17 marine small molecules with better (more negative) docking scores than this positive control.
Figure 4 displays the 2D structures of these 17 small molecules. To further confirm the screening ability of the pharmacophore and to show the difference between active and inactive molecules in XP docking, we also docked the negative small molecule 89884371, which the pharmacophore had identified as a poor fit. Under the same docking conditions, the docking score of 89884371 was −3.196 kcal/mol, far weaker (less negative) than the cut-off; its binding mode is shown in
Figure 3C,D. The negative small molecule does not bind strongly to the target protein DDR1 and forms fewer interactions with it, which further supports the reliability of our screen.
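The cut-off step can be sketched as follows. Because more negative docking scores indicate stronger predicted binding, "better than the control" means numerically lower than the control's score. The cut-off value is taken from the text; the function name and any ligand names other than 89884371 are hypothetical placeholders, not results from this study.

```python
# Docking score of the positive control VU6015929 reported in the text.
CONTROL_SCORE = -12.330

def better_than_control(scores, cutoff=CONTROL_SCORE):
    """Return ligand names scoring below (better than) the cut-off,
    ordered from most to least favorable.

    scores: mapping ligand name -> docking score (more negative = better).
    """
    selected = [name for name, score in scores.items() if score < cutoff]
    return sorted(selected, key=lambda name: scores[name])
```

For example, a ligand with the score reported for 89884371 (−3.196) is excluded, while any ligand scoring below −12.330 is kept.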
2.5. Scaffold Hopping by Fragment Replacement
The primary goal of scaffold hopping is to enhance the physicochemical and pharmacological properties of the original drug and to develop compounds with entirely new intellectual property rights [
19]. This drug design approach rationally replaces the core structure of known active compounds to create new molecules with three-dimensional structures similar to those of the parent compounds, potentially offering superior drug properties and increased affinities. In this study, we used Discovery Studio 2019 (DS) to perform scaffold hopping on the 17 marine small molecules. We selected DS’s built-in fragment library, which contains 1,495,478 compound fragments that adhere to the rule of three (fragment molecular weight < 300, and lipid–water partition coefficient and numbers of hydrogen bond donors and acceptors each no greater than 3). This rule limits the structural complexity of fragment molecules and favors fragments with better water solubility. It also helps control the size, number of hydrogen bonds, and flexibility (number of rotatable bonds) of the fragments, leaving room for structural modifications at later stages. During the scaffold hopping process, we used the molecular docking results to identify the regions of each small molecule that bound relatively weakly to the target protein. Fragments with weak interactions with the receptor were replaced, yielding new small molecules with stronger predicted affinities. Given the large number of fragments in the library, this process generated numerous new molecules. We then applied ADMET analysis, interaction analysis, docking scores, and other evaluation criteria to screen these molecules, ultimately identifying three small molecules with optimized structures and strong drug-like properties. As shown in
Table 2, we present before-and-after comparison diagrams for these three small molecules, highlighting significant improvements in their affinities and drug-like properties following scaffold hopping. This process not only enhanced drug properties, including membrane permeability, solubility, and uptake, but also improved the binding affinity of the small molecules for the target protein DDR1, offering a more effective option for screening DDR1 inhibitors.
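The fragment rule described above can be expressed as a small filter. This sketch assumes the commonly cited "rule of three" thresholds (molecular weight below 300, and logP, hydrogen bond donors, and hydrogen bond acceptors each at most 3) and assumes that the descriptors have been precomputed elsewhere; the `Fragment` class and its field names are hypothetical and do not reflect the Discovery Studio data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fragment:
    """Precomputed descriptors for one fragment (hypothetical container)."""
    name: str
    mol_weight: float   # g/mol
    logp: float         # lipid-water partition coefficient
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors

def obeys_rule_of_three(frag):
    """True if the fragment satisfies the assumed rule-of-three limits."""
    return (frag.mol_weight < 300
            and frag.logp <= 3
            and frag.h_donors <= 3
            and frag.h_acceptors <= 3)
```

Some formulations of the rule also cap the number of rotatable bonds at 3, which matches the flexibility criterion mentioned in the text; that check could be added as a fifth field in the same way.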
2.6. Docking Analysis
To further verify the reliability of the structures of the three selected small molecules (39713a, 34346a, 34419a) and the positive control (VU6015929), we conducted precision docking of these molecules using Schrödinger Suite 2019 software. The optimized structures of the three small molecules yielded results that were similar to, or better than, the positive control compound VU6015929 in precision docking. As shown in
Figure 5A–D, we visualized the docking results using PyMOL 2.5.0. The 3D structural representations indicate that compounds 39713a, 34346a, and 34419a, as well as the positive control VU6015929, formed hydrogen bonding interactions with residue Met704. Additionally, compounds 39713a and 34419a exhibited aromatic hydrogen bonding interactions with residue Phe785. To better highlight the activity of the optimized small molecules, we also show the binding mode of the positive control VU6015929 with the target protein in
Figure 5D and confirmed the superior binding affinity of the three small molecules through comparison.
These interactions between the small molecules and target proteins were largely consistent with the docking results obtained from Maestro. In order to illustrate more clearly the conformation and interactions of small molecules within the binding pocket, we utilized the “Ligand Interaction” module in Maestro 11.8 to generate a two-dimensional diagram depicting the binding interactions between the small molecules and target proteins, as shown in
Figure 6A–D. The optimized structures of these three small molecules demonstrate potential as candidate compounds.
2.7. ADMET
ADMET properties refer to the five key processes of a drug within the body: absorption, distribution, metabolism, excretion, and toxicity [
20,
21]. Collectively, these properties determine the ultimate efficacy and safety of a drug. In this study, Discovery Studio 2019 was used to analyze the drug-like properties of the small molecules obtained by scaffold hopping. Three small molecules that outperformed the positive controls in terms of both predicted affinity and drug-like properties were identified, as depicted in
Figure 7. This figure illustrates two series of ellipses indicating the 95% and 99% confidence intervals for the human intestinal absorption (HIA) model. All three small molecules demonstrated favorable human intestinal absorption, with predicted values falling within the 99% confidence intervals of the HIA models. Additionally, property data regarding the ADMET profiles of the three small molecules and the positive controls are presented in
Table 3. A comparison of the ADMET properties reveals that compounds 39713a, 34346a, and 34419a outperform the positive controls in terms of absorption rate, mutagenicity, and hepatotoxicity, suggesting their significant potential for future drug development. At the same time, we also explored the negative compounds, and the data showed that the drug-like properties of the negative compounds were significantly worse than those of the positive compounds, which further supported our screening results.
2.8. Molecular Dynamics Simulations
Before performing molecular dynamics simulations, we analyzed the pKa values of the protein residues to assess whether their protonation states could affect the functional and structural stability of the protein during the simulations; the results are shown in
Supplementary Table S1. The results show that no protonation of amino acids is required in our pH-neutral simulation system.
Root mean square deviation (RMSD) is a measure of the overall structural change of a protein or other molecule relative to its initial structure during a simulation. As shown in
Figure 8A, during the 100 ns molecular dynamics simulation, the RMSD values of the protein–ligand complexes mainly fluctuated below 0.4, indicating that these complexes remained highly stable throughout the simulation. The three candidate compound systems reached equilibrium at around 10 ns, with low RMSD values close to those of the positive control DI1. As shown in
Supplementary Figure S1, RMSD analyses of the small molecules themselves indicate that the small molecules in the four simulated systems remained below 0.2 nm throughout the 100 ns simulation, reflecting their stability in the binding site. The RMSD of 34346a was almost always lower than that of the positive control compound DI1 over the same period, which may reflect its good binding stability, while the RMSDs of 34419a and 37913a were slightly higher than that of the positive control but also remained below 0.2 nm. At equilibrium, the RMSD values of DI1 and our three candidate compounds were, overall, lower than that of the negative molecule 89884371, which supports the rationality of our screening. In conclusion, all three candidate protein–ligand complexes reached a steady state during the 100 ns molecular dynamics simulation.
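For readers unfamiliar with the quantity, RMSD for a single frame can be sketched as below. The sketch assumes that the frame has already been superimposed on the reference structure and that all atoms are weighted equally; simulation packages typically perform the fit and may apply mass weighting, so this is an illustration of the definition rather than a reproduction of the analysis tool.

```python
import math

def rmsd(coords, reference):
    """Root mean square deviation between two equally sized coordinate sets.

    coords, reference: lists of (x, y, z) tuples in the same atom order and
    the same length unit (e.g. nm). Assumes prior superposition.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum((x - a) ** 2 + (y - b) ** 2 + (z - c) ** 2
                  for (x, y, z), (a, b, c) in zip(coords, reference))
    return math.sqrt(squared / len(coords))
```

Evaluating this function for every frame against the starting structure gives the kind of RMSD-versus-time trace described above.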
Root mean square fluctuation (RMSF) quantifies the range of fluctuation of each atom in a molecule. As depicted in
Figure 8B, the RMSF profiles of the five protein–ligand complexes, plotted with their standard deviations (SD), overlap substantially, indicating that the protein underwent no major structural changes during the simulation, which supports the integrity of our simulations. The positive control and the three candidate compounds produced similar RMSF profiles, supporting the selection of the candidate compounds. The negative control also produced a similar profile, which may reflect the intrinsic properties of the protein itself. Throughout the simulation, the RMSF of each protein segment remained below 0.9, reflecting low fluctuation and indicating that the residues are stable.
The radius of gyration (Rg) of a protein measures the degree of structural compactness within a molecule. As shown in
Figure 8C, we analyzed the radius of gyration over the entire simulation. The radius of gyration fluctuated below 2.15, suggesting that the overall structure of the protein is highly compact. In the 37913a–protein system, the radius of gyration of the protein was lower than in the positive control system and the other two candidate systems for almost the entire simulation, reflecting a more compact spatial configuration of the protein. In general, the protein remained compact in all three candidate compound–protein systems.
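The radius of gyration is the mass-weighted root mean square distance of the atoms from their center of mass. The following minimal sketch illustrates that definition for one frame; the function name is hypothetical, and the inputs (coordinates and atomic masses) are assumed to be extracted from the trajectory by other means.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords: list of (x, y, z) tuples; masses: list of atomic masses.
    Rg = sqrt( sum_i m_i * |r_i - r_com|^2 / sum_i m_i )
    """
    if len(coords) != len(masses) or not coords:
        raise ValueError("coords and masses must be non-empty and equal in size")
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    com = tuple(sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
                for k in range(3))
    weighted = sum(m * sum((c[k] - com[k]) ** 2 for k in range(3))
                   for m, c in zip(masses, coords))
    return math.sqrt(weighted / total_mass)
```

A lower and flatter Rg trace over the trajectory corresponds to the "more compact" behavior described in the text.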
To reduce the chance bias that may arise from relying on a single trajectory, we carried out three replicate molecular dynamics simulations for each system and analyzed the average overall potential energy across the three replicates. As presented in
Figure 8D, the overall potential energy of the complex systems formed by molecules 34419a, 37913a, 34346a, and DI1 with the protein fluctuated at around 303 kJ/mol. Across the three independent replicates, the average overall potential energies of the 34419a, 37913a, 34346a, and DI1 systems were 303.1174745, 303.1019262, 303.0737617, and 303.0967909, respectively. For comparison, the system containing the negative control small molecule 89884371 had an average overall potential energy of 303.0785536, which indicates that although this molecule was not selected, it still shows some stability in molecular dynamics simulations. This analysis reflects structural stability from an energy standpoint.
As illustrated in
Figure 9, we monitored the interactions between the protein and the ligand during the simulation.
Figure 9A–C depict the hydrogen bonding interactions, revealing a consistently high number of hydrogen bonds throughout the simulation, particularly for molecules 34419a and 34346a. We defined interactions based on atom distances of less than 0.35 nm, which is reflected in the overall interaction counts presented in
Figure 9D–F. The results indicate that all three ligands and their corresponding proteins maintained a high number of interactions throughout the simulation, with molecule 34419a demonstrating especially notable stability.
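The distance criterion for interactions can be illustrated with a short sketch that counts protein–ligand atom pairs closer than 0.35 nm in a single frame. The cut-off comes from the text; the function name and the brute-force pairwise loop are illustrative assumptions (analysis tools normally use neighbor searching and periodic boundary handling, which are omitted here).

```python
def count_contacts(protein_atoms, ligand_atoms, cutoff_nm=0.35):
    """Count protein-ligand atom pairs separated by less than cutoff_nm.

    protein_atoms, ligand_atoms: lists of (x, y, z) tuples in nm.
    Periodic boundaries are ignored in this simplified sketch.
    """
    cutoff_sq = cutoff_nm ** 2
    contacts = 0
    for p in protein_atoms:
        for l in ligand_atoms:
            dist_sq = sum((a - b) ** 2 for a, b in zip(p, l))
            if dist_sq < cutoff_sq:
                contacts += 1
    return contacts
```

Applying the count to each frame yields an interaction-count time series of the kind summarized for the three ligands.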
Principal component analysis (PCA) can help researchers understand the conformational changes and dynamic behavior of protein–small molecule complexes. As shown in
Figure 10, we performed PCA on the molecular dynamics trajectories. As shown in
Figure 10D,H,L, the molecular dynamics simulations of all three ligand–protein systems can be well explained by the first three principal components. As shown in
Figure 10A–C,I–K, the first three components of the 34419a–protein system and the 34346a–protein system can describe 67.39% and 65.85% of the motions of the systems, respectively. As shown in
Figure 10E–G, the first three components of the 37913a–protein system describe only 42.23% of the motion of the system, suggesting that its motions are distributed over many small-amplitude modes and that there are no large-scale motions such as protein folding or conformational transitions.
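The percentages quoted for the first three principal components correspond to the share of total variance carried by the largest eigenvalues of the coordinate covariance matrix. A minimal sketch of that calculation follows; it assumes the eigenvalues have already been obtained from a covariance analysis of the trajectory, and the function name is hypothetical.

```python
def explained_fraction(eigenvalues, n_components=3):
    """Fraction of total variance captured by the largest eigenvalues.

    eigenvalues: non-negative eigenvalues of the covariance matrix,
    in any order.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    return sum(ordered[:n_components]) / total
```

A high fraction means a few collective motions dominate the dynamics, whereas a low fraction means the variance is spread across many modes.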
The MM-PBSA method is a computational method used to estimate molecular binding free energy. As shown in
Table 4, we calculated the binding energies of 34419a, 37913a, and 34346a with the target protein. Across three independent molecular dynamics simulations for each system, the binding energy of 34419a remained below −101.801 kJ/mol and that of 34346a remained below −93.258 kJ/mol, indicating strongly favorable binding. The binding energy of 37913a was slightly higher (less favorable) than those of 34419a and 34346a but still remained below −72.753 kJ/mol. These binding energies reflect the stability of binding.
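For reference, the binding energy estimated by MM-PBSA is usually written in the general textbook form below. Whether an entropy term was included in the calculations reported here is not stated in this section, so the $-TS$ contribution should be read as part of the general formalism rather than as a description of this study's protocol.

```latex
\Delta G_{\mathrm{bind}} = G_{\mathrm{complex}} - \left( G_{\mathrm{protein}} + G_{\mathrm{ligand}} \right),
\qquad
G_{x} = \langle E_{\mathrm{MM}} \rangle + \langle G_{\mathrm{polar}} \rangle + \langle G_{\mathrm{nonpolar}} \rangle - T S,
\qquad
E_{\mathrm{MM}} = E_{\mathrm{bonded}} + E_{\mathrm{elec}} + E_{\mathrm{vdW}}
```

Here $x$ denotes the complex, the protein, or the ligand, and the angle brackets denote averages over trajectory frames. A more negative $\Delta G_{\mathrm{bind}}$ corresponds to more favorable binding.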
We performed three independent replicates of residue decomposition binding energy calculations for each candidate compound–protein system and averaged the values for analysis. We calculated the energy breakdown of protein residues in different systems. As shown in
Figure 11A–C, 34419a, 37913a, and 34346a bind to the protein, with most of the residues having a low binding energy, reflecting the stability of protein–ligand binding.
3. Discussion
DDR1 is a gene that encodes a receptor tyrosine kinase [
22] that is central to the initiation of signaling and plays a key role in promoting cell differentiation, proliferation, apoptosis, and migration [
23,
24]. Additionally, it regulates extracellular matrix homeostasis and remodeling and contributes to pathological states such as cancer, fibrosis, and inflammation [
25]. DDR1 is predominantly expressed in epithelial cells across various tissues, where it not only induces the secretion of inflammatory cytokines but also amplifies its effects through stimuli such as pro-inflammatory cytokines or bacterial products [
26]. In vivo inhibition of DDR1 expression has demonstrated significant therapeutic protection against DSS-induced colitis [
4]. This effect may be attributed to DDR1 inhibitors blocking the activation of DDR1, thereby preventing apoptosis in intestinal epithelial cells and inhibiting the NF-κB-MLCK-P-MLC2 signaling pathway [
3]. Consequently, this limits the loss of tight junction (TJ) proteins, including ZO-1 and occludin, thereby maintaining the integrity of the intestinal barrier and decreasing the occurrence of ulcerative colitis [
27]. Therefore, DDR1 may serve as a novel target for the treatment of this condition.
In this study, we assessed the current landscape of global drug and antibody development targeting DDR1 and found that, although DDR1 has been studied as a target for more than 20 years, the cytotoxicity and poor selectivity of existing inhibitors, among other factors, have hindered more in-depth research. Previous studies of this kind mainly used traditional drug screening methods and did not structurally modify the natural compounds they identified, which limited their scope. We therefore built on current progress in computer-aided drug design, combining established virtual screening techniques with fragment replacement, to develop a new workflow for the efficient screening and design of potential DDR1 inhibitors. We utilized computer technology and relevant software to assist in the molecular design, optimization, and screening of drug candidates relevant to the clinical treatment of ulcerative colitis [
4]. The oceans, which cover more than 70% of the Earth, contain some of the richest and most diverse organisms on the planet and are a treasure trove for natural product chemistry research. Marine organisms, including sponges, corals, algae, mollusks, fish, and microorganisms, have evolved complex survival strategies under unique environments characterized by high salinity and pressure [
28]. These adaptations include the production of a variety of natural compounds with unique structures and biological activities used for defense against predators, competition for space, or chemical communication with other species. Such natural compounds are increasingly becoming the focus of drug discovery and clinical trials, leading to the development of products with unique chemical structures, significant biological activity, and high medicinal value [
29].
To further exploit marine resources, we integrated three databases of marine natural products, compiling a total of 52,119 small molecules in the search for active DDR1 inhibitors. In recent years, researchers have identified several DDR1 inhibitors with varying selectivity and demonstrated their therapeutic potential in various in vivo models. However, major concerns regarding selectivity, pharmacokinetic properties, mutation resistance, and safety have been raised, and no selective DDR1 inhibitor has reached clinical studies to date. Thus, there is an urgent need to develop specific DDR1 inhibitors using various drug discovery tools. In this study, we collected 85 active DDR1 inhibitors from BindingDB and analyzed the data to develop 20 pharmacophore models based on the common features of multiple ligands. Using the decoy and test sets, we calculated various metrics to evaluate the pharmacophore models and performed a comprehensive assessment. The best model, ADHRR_3, performed well across several metrics, including ROC and AUAC, indicating its ability to distinguish active DDR1 inhibitors within a large database of small molecules. We then applied the pharmacophore model to screen the integrated marine compound library and subjected the retained small molecules to high-throughput virtual screening and extra-precision docking. These computational screening techniques help reduce false-positive results and predict the binding modes and interactions between small molecules and their targets. This approach enabled us to select 17 marine compounds with better docking scores than the positive control.
Subsequently, we performed fragment replacement on these 17 small molecules and conducted further precise docking and ADMET analyses. We compared docking scores, interaction effects, and drug-like data with those of the positive compound VU6015929, revealing that the molecules 39713a, 34346a, and 34419a demonstrated higher activity and favorable drug-like properties. Finally, the stability of these three small molecules during their interaction with target proteins was evaluated through molecular dynamics simulations. The results indicated that 39713a, 34346a, and 34419a possess ideal binding characteristics and drug-like properties, establishing them as promising candidates for the development of DDR1 kinase inhibitors.
4. Materials and Methods
Reliable tools are essential for screening lead compounds. To enhance the credibility of the results, we used several well-established software packages: Schrödinger Suite 2019, PyMOL 2.5.0, Discovery Studio 2019, and GROMACS 2019. Specifically, the Schrödinger Suite 2019 (Schrödinger, Inc., New York, NY, USA) was employed for protein preparation, small molecule preparation, receptor grid generation, pharmacophore generation and validation, pharmacophore screening, virtual screening, and molecular docking. Additionally, Discovery Studio 2019 was used for fragment replacement of small molecules, PyMOL 2.5.0 for visualizing protein–ligand complexes, and GROMACS 2019 for molecular dynamics simulation studies.
4.1. Protein Preparation
The crystal structure of the human DDR1 kinase domain in complex with DDR1-IN-1 was obtained from the Protein Data Bank (
https://www.rcsb.org/, accessed on 29 March 2024) at a resolution of 2.2 Å (PDB code: 4CKR) [
30]. This single-chain structure contains a potent inhibitor that has been extensively validated. The crystal structure was processed using the “Protein Preparation Wizard” module in Maestro 11.8 [
31], which involved removing water molecules and crystallization additives (ethylene glycol, EDO), adding missing residues and hydrogen atoms, and assigning protonation states at pH 7.0. The protein structure was then refined by energy minimization with the OPLS3e force field using a conjugate gradient method.
4.2. Ligand Preparation
Studies have demonstrated that unique environmental factors, such as high salinity, high pressure, weak alkalinity, and low temperatures, contribute to the formation of many active substances in the ocean that differ from those found in terrestrial organisms. Some of these substances extracted from marine organisms exhibit antitumor, antithrombotic, and antimicrobial effects, revealing potential for screening inhibitors of DDR1 activity with novel structures. To enhance the comprehensiveness of marine compound sources, we integrated three databases related to marine natural products: (a) Marine Natural Products Database (MNPD) [
32]; (b) Comprehensive Marine Natural Products Database (CMNPD) [
33]; and (c) Seaweed Metabolism Database (SWMD) [
34]. We aimed to utilize these valuable oceanic resources to screen for new DDR1 inhibitors. Subsequently, we employed the “LigPrep” module in Maestro 11.8 to optimize the integrated small molecules, generating 3D structures and their isomers in corresponding low-energy states. This process included (1) geometrical optimization of all small molecule structures at pH 7.0 ± 2.0, resulting in 3D structures, and (2) optimization of the 3D structures using the OPLS3e force field for energy minimization.
4.3. Compound Dataset Preparation
We conducted a thorough search for known DDR1 inhibitors with IC50 activity in the publicly accessible database BindingDB (
https://www.bindingdb.org/rwd/bind/index.jsp, accessed on 18 April 2024). BindingDB contained a total of 1896 potential DDR1 inhibitors. We used IC50 thresholds of 10 nM and 1000 nM, defining small molecules with IC50 values less than or equal to 10 nM as active DDR1 inhibitors and those with IC50 values greater than or equal to 1000 nM as inactive. After removing duplicate entries, we identified 85 active small molecules and 13 inactive small molecules. Additionally, we gathered 1150 decoy molecules in SMILES format for pharmacophore validation from the online resource DUD-E (
http://dude.docking.org/, accessed on 26 April 2024). Subsequently, we employed StoneMIND Collector (StoneWise, Beijing, China;
https://stonemind.stonewise.cn) to convert the small molecules collected in SMILES format to SDF format. These small molecules were processed using the “LigPrep” module in Schrödinger: all structures were desalted, ionization states were generated at pH 7.0 ± 2.0 with the Epik module, and the structures were minimized with the OPLS3e force field, resulting in 85 active ligands with 3D structures in their corresponding low-energy states. The 85 compounds were then randomly divided into a training set and a test set in an 8:2 ratio, yielding 68 compounds for the training set and 17 for the test set. The training set was used to generate pharmacophore models, while the test set was used to assess the predictive ability of these models.
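The labeling and splitting rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow actually used in the study: the function names, the random seed, and the placeholder identifiers are assumptions, and the real input would be the curated BindingDB records.

```python
import random

ACTIVE_MAX_NM = 10.0      # IC50 <= 10 nM   -> active
INACTIVE_MIN_NM = 1000.0  # IC50 >= 1000 nM -> inactive


def label_by_ic50(records):
    """Split (identifier, ic50_nM) records into de-duplicated active/inactive lists."""
    actives, inactives, seen = [], [], set()
    for identifier, ic50_nm in records:
        if identifier in seen:       # skip repeated entries for the same molecule
            continue
        seen.add(identifier)
        if ic50_nm <= ACTIVE_MAX_NM:
            actives.append(identifier)
        elif ic50_nm >= INACTIVE_MIN_NM:
            inactives.append(identifier)
        # molecules between the two thresholds are left unlabeled
    return actives, inactives


def split_train_test(items, train_fraction=0.8, seed=0):
    """Randomly split items into training and test lists (seed is an assumption)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 85 prepared active ligands.
placeholder_actives = [f"active_{i}" for i in range(85)]
train_set, test_set = split_train_test(placeholder_actives)
print(len(train_set), len(test_set))  # 68 17
```

With 85 actives and an 8:2 ratio, the split reproduces the 68/17 partition reported in the text; which molecules land in each set depends on the seed.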
4.4. Generation and Validation of Pharmacophore Models Based on Common Features of Multiple Ligands
Pharmacophore modeling based on the common features of multiple ligands identifies the physicochemical features required for ligand recognition by the target protein, together with their spatial arrangement, and thereby captures the spatial complementarity between small molecules and the target. This approach provides guidance for small molecule screening and was performed in this study using Schrödinger’s “PHASE” panel, a platform that offers tools ranging from pharmacophore modeling to validation and screening. As described in Section 4.3, the 98 collected DDR1 inhibitors were classified into 85 active molecules (IC50 less than or equal to 10 nM) and 13 inactive molecules (IC50 greater than or equal to 1000 nM), and the active molecules were divided into a training set and a test set for the generation and validation of the pharmacophore. We organized these small molecules into clusters, capturing their presumed bioactive conformations and the structures contributing to their activities in three-dimensional space for structure–activity studies. This was combined with conformational analysis and molecular superposition to determine their optimal conformations and common features. We obtained pharmacophores based on the common characteristics of these ligand molecules, using the built-in PHASE scoring function as the evaluation index. The pharmacophore features were set as Acceptor (A), Donor (D), Hydrophobic (H), Negative Ion (N), Positive Ion (P), and Aromatic Ring (R), with a count of 0 to 3 and an allowable deviation of 2.0.
Additionally, we required that at least 50% of the active molecules match the pharmacophore hypothesis, restricted each model to 4 to 6 pharmacophore features, and allowed a deviation from the standard of 0.5. These conditions facilitated the retrieval of as many common features of the active molecules as possible, leading to the generation of robust pharmacophore models.
4.5. Validation of the Model
The pharmacophore generation step often yields multiple pharmacophore models that must be validated against data. Invalid models are removed, and the retained models are compared to identify the best pharmacophore model for database screening. To ensure that the pharmacophore accurately identifies active small molecules from the marine compound library, we evaluated the models against a validation set composed of the active test set (17 small molecules) combined with a decoy set (1150 small molecules). Each of the 20 generated pharmacophore models was used to screen the validation set to determine the degree of enrichment achieved during screening. Four metrics formed the basis for selecting the pharmacophore hypothesis: the enrichment factor at 1% of the screened set (EF1%), the Boltzmann-enhanced discrimination of the receiver operating characteristic (BEDROC), the area under the receiver operating characteristic curve (ROC), and the area under the accumulation curve (AUAC).
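To make three of these enrichment metrics concrete, the following sketch computes the enrichment factor, the ROC area, and a discrete approximation of the AUAC from a ranked list of actives and decoys. It assumes that each screened molecule carries a single score where higher is better; the scores PHASE actually reports, and its exact metric definitions, may differ. BEDROC is omitted for brevity, and all function names are illustrative.

```python
def rank_labels(scored):
    """Return labels (1 = active, 0 = decoy) ordered from best to worst score."""
    return [label for _, label in sorted(scored, key=lambda pair: pair[0], reverse=True)]


def enrichment_factor(labels, fraction=0.01):
    """Hit rate in the top fraction of the ranked list divided by the overall hit rate."""
    n_total, n_actives = len(labels), sum(labels)
    n_top = max(1, round(n_total * fraction))
    return (sum(labels[:n_top]) / n_top) / (n_actives / n_total)


def roc_auc(labels):
    """Fraction of (active, decoy) pairs in which the active is ranked first."""
    n_actives = sum(labels)
    n_decoys = len(labels) - n_actives
    decoys_seen, wins = 0, 0
    for label in labels:
        if label:
            wins += n_decoys - decoys_seen  # decoys ranked below this active
        else:
            decoys_seen += 1
    return wins / (n_actives * n_decoys)


def auac(labels):
    """Mean fraction of actives recovered over all cut-offs (discrete accumulation curve)."""
    n_actives = sum(labels)
    found, area = 0, 0.0
    for label in labels:
        found += label
        area += found / n_actives
    return area / len(labels)


# Toy ranking with made-up scores: two actives and two decoys.
toy = rank_labels([(2.1, 1), (1.9, 0), (1.7, 1), (0.4, 0)])
print(roc_auc(toy), auac(toy))  # 0.75 0.75
```

For a validation set of 17 actives and 1150 decoys, the top 1% corresponds to 12 molecules, so the largest attainable EF1% in this formulation is 1167/17, roughly 68.6.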
4.6. Virtual Screening Based on Pharmacophore
Pharmacophore-based virtual screening is a computational approach that rapidly identifies potentially active candidate molecules from a large compound library, prioritizing and limiting the number of structures selected for experimental synthesis. This technique plays a crucial role in drug discovery, including the identification and optimization of lead compounds and drug design. In this study, we applied the pharmacophore model ADHRR_3 with the “Phase Ligand Screening” tool within the “Phase” module of Maestro 11.8 to virtually screen a library of 52,119 marine compounds. This step aimed to retrieve small molecules matching the pharmacophore hypothesis as potential DDR1 inhibitors, thereby providing the input for the subsequent step of structure-based virtual screening (SBVS).
4.7. Structure-Based Virtual Screening
The theoretical basis of structure-based virtual screening (SBVS) is the lock-and-key theory. Based on the three-dimensional structure of the target protein, small molecules are sequentially placed into the binding site of the receptor through molecular docking. The ligand conformation and position are iteratively optimized to reach the optimal binding state with the receptor, which determines the binding conformation of the small molecule at the target. The binding capacity is evaluated with an affinity scoring function related to the binding energy, and compounds with more favorable binding modes and better predicted scores are selected for subsequent bioactivity testing. To screen for active small molecules in the marine compound library, we used the receptor grid generation tool in the “Glide” module of Maestro 11.8 to define the binding pocket of the receptor DDR1 (PDB: 4CKR). The van der Waals radii of the receptor atoms were scaled according to the molecular weight of the small molecules, and the compounds were docked flexibly. We employed the high-throughput virtual screening (HTVS) method to evaluate the binding ability of all compounds that passed the pharmacophore screening. To improve scoring accuracy, we allowed the conformation of the small molecules to change so that the ligand pose could vary in response to the interaction forces; under these settings, the “Glide” program handles conformational flexibility through extensive conformational searches and eliminates inappropriate molecular conformations.
4.8. Molecular Docking
To further screen the DDR1 inhibitors, the 100 small molecules identified in this study underwent extra-precision (XP) docking. Molecular docking is a computational chemistry method that simulates, at the atomic level, the binding of small molecules to biomolecular surfaces. It predicts possible binding modes and affinities, thereby helping to assess drug activity and selectivity while elucidating the interactions underlying drug–protein binding. Reliable performance of the docking tool is necessary for structure-based screening and for the analysis of protein–ligand interactions. To validate the docking protocol before docking the candidate compounds, the co-crystallized ligand DI1 (PubChem CID: 71664577) was re-docked into the active site of the DDR1 protein (PDB ID: 4CKR) in XP mode. The crystal structure of the protein and the binding activity of the co-crystallized ligand have been confirmed in previous work; by re-docking and analyzing the docking score and binding mode, we can evaluate the performance of the molecular docking tool and decide whether to use it for the next step of screening and validation [
30]. In this study, we utilized the “Receptor Grid Generation” module in Maestro 11.8 to create binding pockets at the receptor’s active site and subsequently docked the small molecules using the docking tool [
35]. Asp702, Met704, and Asp784 were designated as binding sites. To enhance accuracy, we employed flexible docking and introduced the positive compound VU6015929 (a known active DDR1 inhibitor) to compare the docking scores and effects of the 100 marine compounds with the reference compound through precision docking. Small molecules exhibiting similar or superior results were selected for further fragment replacement analysis. Finally, the small molecules obtained from fragment replacement underwent additional precision docking and drug-like analysis (ADMET) to identify improved candidates for further study.
4.9. Scaffold Hopping by Fragment Replacement
Scaffold hopping is a widely employed drug modification strategy in both academia and industry [
19]. This approach has advanced in recent years owing to progress in organic synthesis methodologies and computer science, which has made the synthesis and rational design of scaffold-hopping analogs more efficient and reliable. The chemical structure of a small molecule drug typically comprises three components: rings, linkers, and side chains, with the contiguous combination of rings and linkers referred to as the molecular scaffold. We visualized the interactions between small molecules and the receptor to identify fragments that do not contribute to binding affinity. By judiciously replacing the core scaffold of a compound, numerous new compounds can be generated with spatial structures similar to the original but with differing potencies. This approach provides greater opportunities to discover small molecules with favorable potency and drug-like properties. Finally, through extra-precision docking combined with ADMET analysis, we screened for small molecules exhibiting improved drug-like properties and greater predicted activity.
4.10. ADMET
ADMET is a comprehensive study of drug absorption, distribution, metabolism, excretion, and toxicity [
36]. Evaluating ADMET properties helps address the poor drug-like characteristics often found among screened small molecules. Such evaluation can improve the success rate of drug development, reduce development costs, lower the incidence of drug toxicity and side effects, and guide the rational use of drugs in clinical settings. ADMET prediction is therefore an essential part of contemporary drug design and screening. In this study, we conducted an ADMET analysis of the more than one thousand small molecules generated by scaffold hopping using Discovery Studio 2019. We then further screened active DDR1 inhibitors by combining the predicted blood–brain barrier (BBB) permeability, water solubility, intestinal absorption, and hepatotoxicity of the small molecules with the extra-precision docking results.
4.11. Molecular Dynamics Simulation
Molecular docking methods primarily assess the theoretical binding affinity of a compound to a receptor under idealized, independent conditions; thus, favorable docking results do not fully characterize the target binding capability of a lead compound in realistic scenarios. Molecular dynamics (MD) simulations are frequently employed to monitor the stability of protein–ligand binding systems within simulated environments at specific temperatures, pressures, and salt concentrations. Firstly, we calculated the pKa of proteins using PropKa On-line (
https://www.ddl.unimi.it/vegaol/propka.htm, accessed on 15 August 2024) [
37] to explore the potential impact of pKa values on protein function and structural stability. Accordingly, we calculated the conformational fluctuations of complexes formed by the binding of three ligands to target proteins over a duration of 50 ns. The system’s stability was analyzed based on the conformational fluctuations of the ligands, the solvent-accessible surface area, the protein radius of gyration, and the overall potential energy of the system. Initially, mol files for ligands and PDB files for receptor proteins were generated and exported from the Discovery Studio platform. The ligand topology was constructed using the GAFF force field via the ACPYPE online server [
38] (
https://www.bio2byte.be/acpype/, accessed on 26 August 2024). GROMACS 2019 [
39] was employed to construct topology files and perform MD calculations for proteins, utilizing the AMBER99SB-ILDN force field and the TIP3P water model. A cubic box with a 1.2 nm buffer around the solute was created to accommodate the protein–ligand complex and was filled with water using the pre-equilibrated SPC216 solvent configuration to simulate an aqueous environment. Appropriate amounts of sodium and chloride ions were added to the solvent system to neutralize the charge. Following the construction of the simulated system, 50,000 steps of energy minimization were conducted at a simulated temperature of 300 K. Subsequently, the receptors, ligands, and solvent were equilibrated under constant temperature and constant volume (NVT) and then constant temperature and constant pressure (NPT) conditions, each for 25 ps (25,000 steps). Van der Waals interactions during equilibration were treated with a cut-off scheme. Production MD simulations of the system were then executed for a duration of 100 ns, and periodic boundary corrections were applied to the output trajectory files. The root mean square deviation (RMSD) and root mean square fluctuation (RMSF) of the atomic positions were analyzed, along with the radius of gyration (Rg), the total potential energy curve, and the number of hydrogen bonds for each system. To avoid chance errors arising from a single MD simulation in the analysis of the thermodynamic results, we performed three independent MD simulations for each system when calculating the overall potential energy and averaged the results of the three simulations for analysis.
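For readers less familiar with the trajectory metrics named above, the following pure-Python sketch shows how RMSD, RMSF, and Rg are defined on plain coordinate lists. It is only an illustration of the definitions: it assumes the frames have already been superposed on the reference and corrected for periodic boundaries, and it is not the procedure used in the study, where such quantities would normally come from GROMACS analysis tools such as `gmx rms`, `gmx rmsf`, and `gmx gyrate`.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two equally sized coordinate sets (no fitting)."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must contain the same number of atoms")
    total = sum((a - b) ** 2 for p, q in zip(coords, reference) for a, b in zip(p, q))
    return math.sqrt(total / len(coords))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration; unit masses are assumed when none are given."""
    masses = masses or [1.0] * len(coords)
    m_total = sum(masses)
    center = [sum(m * p[i] for m, p in zip(masses, coords)) / m_total for i in range(3)]
    weighted = sum(
        m * sum((p[i] - center[i]) ** 2 for i in range(3)) for m, p in zip(masses, coords)
    )
    return math.sqrt(weighted / m_total)


def rmsf(frames):
    """Per-atom root mean square fluctuation about the time-averaged position."""
    n_frames, n_atoms = len(frames), len(frames[0])
    result = []
    for atom in range(n_atoms):
        mean = [sum(f[atom][i] for f in frames) / n_frames for i in range(3)]
        msd = sum(
            sum((f[atom][i] - mean[i]) ** 2 for i in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Two-atom toy system (coordinates in nm, invented for illustration).
reference = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
shifted = [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)]
print(rmsd(shifted, reference))  # 5.0
```

A stable complex is expected to show an RMSD that plateaus, low RMSF for binding-site residues, and an Rg that fluctuates around a constant value.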
We used the R language package Bio3D [
40] for PCA of the simulated trajectories. Firstly, the molecular dynamics simulation trajectories were converted into DCD format, and then PCA processing and visualization of the data were performed by Bio3D.
The molecular mechanics/Poisson–Boltzmann surface area (MM-PBSA) method is widely used to calculate the free energy of receptor–ligand binding. We obtained the trajectory, topology, and index files from the molecular dynamics simulations and extracted the final 10 ns (90 to 100 ns) of each trajectory for the calculation. We used g_MMPBSA [
41,
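42] for MM-PBSA calculations. During the calculations, we set the dielectric constant of the solute to 2 and the temperature to 300 K and calculated the van der Waals energy, the electrostatic (Coulomb) interaction energy, the polar solvation energy, and the non-polar solvation energy. The binding energy was then calculated using the following Equation (1):
$$\Delta G_{\mathrm{bind}} = G_{\mathrm{complex}} - \left(G_{\mathrm{protein}} + G_{\mathrm{ligand}}\right) = \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{polar}} + \Delta G_{\mathrm{nonpolar}} \qquad (1)$$
In the above equation, $G_{\mathrm{complex}}$, $G_{\mathrm{protein}}$, and $G_{\mathrm{ligand}}$ are the free energies of the protein–ligand complex, the protein, and the ligand, respectively. $\Delta E_{\mathrm{MM}}$ represents the molecular mechanics energy (the sum of the van der Waals and electrostatic terms), $\Delta G_{\mathrm{polar}}$ represents the polar solvation energy, and $\Delta G_{\mathrm{nonpolar}}$ represents the non-polar solvation energy.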
For each small-molecule–protein system, we performed three independent MM-PBSA analyses and analyzed the results by averaging the results to avoid chance errors caused by single simulations of binding energy.
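The arithmetic behind the binding energy and the averaging over three independent analyses can be sketched as follows. The energy values are invented placeholders in kJ/mol, not results from this study, and the function names are illustrative; the sketch assumes the binding energy is the sum of the molecular mechanics term (van der Waals plus electrostatic) and the polar and non-polar solvation terms, without an entropy contribution.

```python
from statistics import mean, stdev


def binding_energy(e_vdw, e_elec, g_polar, g_nonpolar):
    """Sum the MM term (van der Waals + electrostatic) and the two solvation terms."""
    e_mm = e_vdw + e_elec
    return e_mm + g_polar + g_nonpolar


def summarize(replicates):
    """Return the mean and sample standard deviation of the replicate binding energies."""
    energies = [binding_energy(**terms) for terms in replicates]
    return mean(energies), stdev(energies)


# Placeholder energy terms (kJ/mol) for three independent analyses of one complex.
replicates = [
    {"e_vdw": -200.0, "e_elec": -50.0, "g_polar": 120.0, "g_nonpolar": -20.0},
    {"e_vdw": -195.0, "e_elec": -45.0, "g_polar": 120.0, "g_nonpolar": -20.0},
    {"e_vdw": -205.0, "e_elec": -55.0, "g_polar": 120.0, "g_nonpolar": -20.0},
]
average, spread = summarize(replicates)
print(average, spread)  # -150.0 10.0
```

Reporting the spread alongside the mean indicates how much the estimate varies between replicates, which is the reason for running three analyses rather than one.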