Article

A Human Feedback Strategy for Photoresponsive Molecules in Drug Delivery: Utilizing GPT-2 and Time-Dependent Density Functional Theory Calculations

1 Faculty of Medicine, Imperial College London, London SW7 2AZ, UK
2 School of Chemistry and Chemical Engineering, Ningxia University, Yinchuan 750014, China
3 Bioengineering Department and Imperial-X, Imperial College London, London W12 7SL, UK
4 College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
5 National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK
6 Cardiovascular Research Centre, Royal Brompton Hospital, London SW3 6NP, UK
7 School of Biomedical Engineering & Imaging Sciences, King’s College London, London WC2R 2LS, UK
* Authors to whom correspondence should be addressed.
Pharmaceutics 2024, 16(8), 1014; https://doi.org/10.3390/pharmaceutics16081014
Submission received: 28 May 2024 / Revised: 11 July 2024 / Accepted: 19 July 2024 / Published: 31 July 2024
(This article belongs to the Special Issue Advanced Materials Science and Technology in Drug Delivery)

Abstract

Photoresponsive drug delivery is a key frontier in smart drug administration because light-triggered methods are non-invasive, stable, and finely tunable. Generative pre-trained transformers (GPTs) have been employed to generate molecular structures. In this study, we fine-tuned GPT-2 with adapters on the QM7b dataset to obtain a UV-GPT model that generates molecules responsive to UV light excitation. Using the Coulomb matrix as a molecular descriptor, we predicted the excitation wavelengths of these molecules and validated their excited-state properties through quantum chemical simulations. Based on the results of these calculations, we derived structural guidelines and integrated them into the alignment of large language models within the reinforcement learning from human feedback (RLHF) framework. Together, these findings demonstrate that GPT technology can be applied to the design of photoresponsive molecules for drug delivery.

1. Introduction

Smart drug delivery systems, which have gained significant attention in pharmaceutical research, enhance patient health by ensuring targeted therapeutic delivery [1]. In recent decades, artificial intelligence (AI) has demonstrated its capability to address complex challenges within pharmaceutical research, particularly in smart drug delivery, from the observable to the micro-/nanoscale [2,3,4]. These advanced drug delivery systems are designed to be self-regulating, responding to a range of stimuli associated with disease pathology [5,6]. Light stands out as an external actuation method for therapeutic applications due to its non-invasive characteristics, stability in biological settings, adjustable intensity, and unparalleled temporal and spatial precision [7,8,9,10,11,12].
Light is widely used in smart delivery systems for various therapeutics, and different wavelengths, from ultraviolet (UV, 100–400 nm) through visible (400–750 nm) to near-infrared (NIR, 750–2000 nm), elicit different responses [13]. Short-wavelength UV light has enough energy to alter chemical bonds and configurations, for example by breaking covalent bonds or inducing cis–trans isomerization, and can thereby trigger drug release [14,15]. Owing to these advantages, UV light is frequently used as a stimulus in diverse research and applications [16,17].
Deep learning is expected to be applied to a growing range of problems in drug delivery, including the design of stimulus-responsive molecules [4,18]. The temporal dynamics of drug delivery have already been predicted using convolutional neural networks and long short-term memory networks [19]. When deep learning is applied to molecular design problems in chemistry and materials science, however, the lack of specialized datasets often leads to task failure. Advances in pre-training and fine-tuning methods for large language models (LLMs) have facilitated the generation of chemical molecules [20,21]. Generative pre-trained transformers therefore hold significant potential for designing light-responsive molecules, yet research in this specific domain remains limited.
Causal language models such as GPT, GPT-2, and GPT-3 are trained to predict the probability of the next token given all preceding tokens, making them well suited to text generation [22,23,24]. After character-level RNNs and masked transformer language models were used to capture Simplified Molecular Input Line Entry System (SMILES) language patterns [21,25], Sanjar Adilov constructed a GPT-2-like language model to learn SMILES representations and transfer knowledge to downstream molecular generative tasks [26]. In subsequent studies, GPT has attracted the attention of more researchers [27,28]. Additionally, instruction tuning guided by human experience has proven to be a highly effective method for improving the quality of generated content. This approach includes the Direct Preference Optimization (DPO) algorithm [29], the Contrastive Preference Optimization (CPO) algorithm [30], and others. Recently, researchers have also proposed the Kahneman–Tversky Optimization (KTO) algorithm to simplify dataset preparation for this purpose [31]. Integrating GPT-2 with these techniques for light-responsive drug delivery is therefore a promising direction to explore.
The open-source dataset QM7b, provided by the Quantum Machine Project, encompasses a variety of physicochemical properties of molecules [32,33], including their excitation energies. To generate light-responsive drug delivery molecules, we used these data to fine-tune the pre-trained language model [33,34]. Our UV-GPT inherits the transformer structure from GPT-2 and uses the tokenizer from SMILES-GPT, facilitating the downstream generation of UV-responsive molecules. By integrating predictive modeling and TDDFT calculations, we found that the fine-tuned UV-GPT model generates molecules with UV photoresponsive properties. The generated molecules were positively influenced by the pre-trained SMILES-GPT, as evidenced by the statistics on drug-like properties and synthesizability. This evidence suggests that combining GPT with TDDFT calculations to design stimulus-responsive molecules holds practical value for drug delivery. Additionally, we explored various implementations of RLHF, using structural knowledge derived from computational chemistry as human feedback to enhance the quality of the generated content. However, our current model does not fully integrate chemists’ extensive theoretical knowledge of chemistry.

2. Materials and Methods

Pre-trained language model and adapter tuning
We trained our model for 6 epochs on the SMILES strings of the QM7b datasets [32,33]. We used AdamW for optimization and cosine annealing for learning-rate scheduling. The initial and final learning rates were set to 5 × 10⁻⁴ and 5 × 10⁻⁸, respectively. We kept the default Adam hyperparameters and optimized the batch size (128) and maximum sequence length (512).
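As a rough illustration of the schedule described above, the sketch below evaluates a single cosine decay between the two learning rates. It assumes the rates are 5 × 10⁻⁴ and 5 × 10⁻⁸ and that the decay spans the whole run without warm-up or restarts; the function name and the `total_steps` argument are illustrative and are not taken from the authors' code.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=5e-4, lr_min=5e-8):
    """Learning rate at `step` for one cosine decay from lr_max to lr_min."""
    # Clamp progress to [0, 1] so steps outside the run stay at the end points.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few points of a hypothetical 1000-step run.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_annealing_lr(step, 1000):.3e}")
```

The rate starts at the initial value, passes through the midpoint of the two values halfway through training, and ends at the final value.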
Our SMILES tokenizer was taken from the pre-trained SMILES-GPT model. This transformer decoder replicates GPT-2 except that its tokenizer uses character-level byte-pair encoding instead of byte-level encoding. The initial vocabulary consists of 72 characters reserved from the SMILES alphabet, supplemented with up to 1000 of the most frequent merges. The model uses parameterized token and position embeddings, 8 attention heads, and 4 attention blocks. With an embedding/hidden dimension of 512, it has 13.4 M parameters.
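The reported model size can be checked with a back-of-the-envelope count for a GPT-2 style decoder. The sketch below assumes a vocabulary of 72 + 1000 = 1072 tokens (special tokens ignored), 512 positions, a feed-forward expansion factor of 4, and tied input/output embeddings; none of these details are confirmed by the text beyond the figures it quotes, and the function name is illustrative.

```python
def gpt2_param_count(vocab_size, n_positions, d_model, n_layers):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings."""
    embeddings = vocab_size * d_model + n_positions * d_model
    # Fused query/key/value projection plus the attention output projection.
    attention = (3 * d_model * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Two linear layers with a 4x hidden expansion, each with a bias vector.
    mlp = (4 * d_model * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    block = attention + mlp + layer_norms
    final_layer_norm = 2 * d_model
    return embeddings + n_layers * block + final_layer_norm


total = gpt2_param_count(vocab_size=1072, n_positions=512, d_model=512, n_layers=4)
print(f"{total / 1e6:.1f} M parameters")
```

Under these assumptions the count comes to about 13.4 M, in line with the figure quoted above. The number of attention heads does not enter the count, because the heads only partition the 512-dimensional projections.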
Prediction model for excitation energy
The Coulomb matrix of each molecule was computed with DeepChem. The data were split into training and test sets in an 80:20 ratio. Support Vector Regression (SVR) with a quantum kernel was implemented with sklearn and qiskit, while SVR with the other kernels was implemented solely with sklearn.
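For reference, the Coulomb matrix introduced with the QM7 family of datasets [32] has diagonal entries 0.5 Z_i^2.4 and off-diagonal entries Z_i Z_j / |R_i − R_j|, with nuclear charges Z and atomic positions R in atomic units. The following is a minimal pure-Python sketch of that definition, not the DeepChem featurizer used in this work; the hydrogen-molecule geometry is an illustrative assumption.

```python
import math


def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5 * Z_i**2.4 on the diagonal, Z_i * Z_j / |R_i - R_j| elsewhere."""
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Illustrative input: H2 with an assumed bond length of 1.4 bohr.
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
for row in h2:
    print(row)
```

Because the matrix size depends on the number of atoms, featurizers usually pad it to a fixed maximum size before it is passed to a regression model.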
Drug-likeness and evaluation metrics for generative molecules
Drug-likeness is a consideration when evaluating the generative molecules for photoresponsive drug delivery. We utilized the quantitative estimate of drug-likeness (QED) as a metric, as introduced in [35]. The QED metric yields a numerical score ranging from 0 to 1, where elevated scores correspond to an increased probability of drug-likeness.
The Synthetic Accessibility Score (SAscore) [36], used to assess the ease of synthesizing drug-like molecules, rates molecules from 1 to 10 based on historical synthetic data and molecular complexity. Fragment contributions and a complexity penalty form the basis, derived from PubChem’s vast molecule database. Validation against expert chemists’ estimations showed strong agreement (r² = 0.89). This method harnesses big data to streamline and enhance the synthesis evaluation process in molecular design.
DFT and TDDFT simulation
Density functional theory (DFT) calculations were performed on the molecules using ORCA 5.0.4 [37]. Geometry optimization was performed at the PBE0/6-311G level of theory [38]. Time-dependent density functional theory (TDDFT) calculations were performed at the PBE0/TZVP level [38,39]. Calculations were carried out in the gas phase and in water and chlorobenzene, with the solvents modeled using the implicit polarizable continuum model (PCM) [40,41]. Grimme’s D3 dispersion corrections [42,43,44] were applied during optimization and TDDFT calculations.
RLHF
We used the Transformer Reinforcement Learning (TRL) package for reinforcement learning from human feedback. In the fine-tuning process, we applied Direct Preference Optimization (DPO), Contrastive Preference Optimization (CPO), and Kahneman–Tversky Optimization (KTO) with the chemical knowledge datasets.
The default sigmoid loss was used in DPO, with the beta factor set to 0.1. Similarly, the loss type of CPO was sigmoid, and its beta factor was set to 0.1. The beta factor in the KTO loss was also set to 0.1, with a higher value meaning less divergence from the initial policy. The desirable and undesirable losses of KTO are weighted by the desirable weight and the undesirable weight, respectively; both were set to 1.0.
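To make the role of the beta factor concrete, the sketch below evaluates the sigmoid form of the DPO loss [29] for a single preference pair. The four arguments are summed log-probabilities of the chosen and rejected sequences under the policy being trained and under the frozen reference model; the function and argument names are illustrative, not the package's internals.

```python
import math


def dpo_sigmoid_loss(policy_chosen_logp, policy_rejected_logp,
                     ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for one preference pair: -log(sigmoid(margin))."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow for large |m|.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# The loss equals log(2) when the policy still matches the reference model.
print(dpo_sigmoid_loss(-3.0, -3.0, -3.0, -3.0))
# It drops once the policy favors the chosen sequence relative to the reference.
print(dpo_sigmoid_loss(-1.0, -5.0, -2.0, -2.0))
```

With beta = 0.1, a given difference in log-probability ratios is scaled down tenfold before it enters the sigmoid.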
The preference dataset used in DPO and CPO is a dictionary object with the keys ‘prompt’, ‘chosen’, and ‘rejected’. The binary signal dataset used in KTO is a dictionary object with the keys ‘prompt’, ‘completion’, and ‘label’.
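The two dataset layouts can be sketched as plain Python dictionaries. The SMILES strings and the empty prompt below are placeholders chosen for illustration and are not the paper's data; the helper names are hypothetical, and how chosen and rejected molecules were paired in this work is not stated.

```python
def build_preference_dataset(prompt, chosen_smiles, rejected_smiles):
    """Pair chosen and rejected completions in the 'prompt'/'chosen'/'rejected' layout."""
    n = min(len(chosen_smiles), len(rejected_smiles))  # each pair needs one of each
    return {
        "prompt": [prompt] * n,
        "chosen": list(chosen_smiles[:n]),
        "rejected": list(rejected_smiles[:n]),
    }


def build_binary_dataset(prompt, desirable_smiles, undesirable_smiles):
    """Label single completions in the 'prompt'/'completion'/'label' layout."""
    completions = list(desirable_smiles) + list(undesirable_smiles)
    labels = [True] * len(desirable_smiles) + [False] * len(undesirable_smiles)
    return {
        "prompt": [prompt] * len(completions),
        "completion": completions,
        "label": labels,
    }


# Placeholder molecules, for illustration only.
preferred = ["c1ccncc1", "c1ccsc1"]
rejected = ["CCO", "CCCC", "CCN"]
print(build_preference_dataset("", preferred, rejected))
print(build_binary_dataset("", preferred, rejected))
```

The preference layout needs a rejected partner for every chosen molecule, whereas the binary layout accepts unequal numbers of desirable and undesirable examples.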

3. Results and Discussion

3.1. Generative Workflow for Photoresponsive Drug Delivery

Numerous studies have focused on utilizing AI-based databases to scale up, optimize, and accelerate the development of nanocarrier drug delivery systems that are safe, effective, and stable. Endogenous triggers like pH variations, hormone levels, enzymatic actions, overexpression of biomarkers, glucose, or redox gradients are intrinsic to the body’s disease state. These triggers can prompt or amplify drug release in affected areas [6]. As an external trigger, UV light has enough energy to modify chemical bonds, facilitating drug delivery mechanisms, and its versatility makes it a common stimulus in drug delivery research and applications. To design more effective UV-responsive drug delivery molecules, we employed a pre-trained language model, fine-tuned it on a target molecule dataset, and used a machine learning model to predict molecular excitation energies.
To date, no studies have established a methodology for applying GPT technology to drug delivery molecules and validating its efficacy. Here, we opted for a transformer architecture with adapter layers, specifically the GPT-2 model. Pre-trained on the PubChem dataset, this GPT-2 model aims to generate molecules with high drug-likeness that are easy to synthesize. Its SMILES tokenizer, developed by Sanjar Adilov, is based on atomic dictionaries and linked characters; the linked characters of molecular SMILES carry information about chemical bonds and molecular configurations. The model itself is built from multi-layer perceptron (MLP) and attention layers. Our goal is to adapt this pre-trained GPT with adapter layers so that it generates effective photoresponsive molecules. Once this workflow is proven to work, other pre-trained language models available through Hugging Face can be integrated. The workflow for generating stimulus-responsive molecules via a pre-trained language model is shown in Figure 1.
The QM7b dataset provides excitation energies for molecules, which were utilized as input for fine-tuning our UV-GPT and training our screening model. This open-source dataset also includes the Coulomb matrix and physicochemical properties of molecules. Prof. Alexandre Tkatchenko’s group shared the SMILES of molecules with us [32,33], which are crucial for our adapter training and serve as the foundation of our workflow. Our generative pre-trained transformer for UV light-responsive drug delivery (UVGPT) utilized training datasets containing molecules with excitation energies ranging from 4.13 to 12.41 eV. Understanding the differences in the properties of various molecules based on chemical bonds, atomic potentials, and molecular conformations is a direct manifestation of the quantitative structure–activity relationship (QSAR) of molecules. Experienced chemical researchers can design molecules with specific properties based on their intuition. Our UVGPT learns the QSAR of UV light-responsive molecules from training datasets.

3.2. Screening the Generative Molecules

After training our UVGPT on UV light-responsive molecules, we generated one thousand molecules with the model. Among these, 443 had valid chemical structures as checked with RDKit. Dylan M. Anstine and Olexandr Isayev [45] reviewed generative methods in chemical sciences and proposed evaluation metrics. The methods for calculating drug-likeness and SAscore, as outlined in their publication [45], were employed in our research. QED estimated the drug-likeness of molecules using a machine learning model trained on a dataset of drug-like compounds. Higher QED values suggest an increased likelihood of drug-likeness. Similarly, the Synthetic Accessibility Score systematically assesses the ease of synthesizing drug-like molecules, aiding in prioritizing compounds for molecular design.
In the following workflow, we evaluated the excitation energy, drug-likeness, and SAscore of these molecules, as illustrated in Figure 2. We utilized the Coulomb matrix of molecules as the molecular descriptor in our prediction model. Support Vector Regression (SVR) served as the predictive machine learning model. We methodically compared various sets of parameters for SVR, encompassing different kernels (rbf, sigmoid, and quantum), regularization parameters (from 0.01 to 80), and kernel coefficients (auto, scale, 0.8, 0.84, and 2.3). Based on the Mean Squared Error (MSE) of both the training and test sets, our predictive model selected the rbf kernel, a regularization parameter of 80, and a kernel coefficient of 0.84. Prior to utilizing SVR for predicting the excitation energies of generative molecules, we employed DeepChem to compute the Coulomb matrix of these generative molecules. The results are shown in Figure 2a.
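In scikit-learn terms, the regularization parameter and kernel coefficient reported above correspond to the `C` and `gamma` arguments of `SVR`, so the selected model would presumably be `SVR(kernel="rbf", C=80, gamma=0.84)`. The dependency-free sketch below only spells out what the rbf kernel coefficient and the MSE criterion compute; the helper names are illustrative, and it is not the authors' training script.

```python
import math


def rbf_kernel(x, y, gamma=0.84):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between reference and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Identical descriptors give a similarity of 1; it decays with squared distance.
print(rbf_kernel([0.0, 0.0], [0.0, 0.0]))
print(rbf_kernel([0.0, 0.0], [1.0, 0.0]))
# Toy excitation energies in eV, for illustration only.
print(mean_squared_error([8.0, 9.0], [8.0, 11.0]))
```

A larger kernel coefficient makes the similarity fall off faster with distance, so each training molecule influences a narrower region of descriptor space.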
When analyzing the density distribution plots of excitation energies depicted in Figure 2a, we observe a range spanning from 5 to 11 eV, with the majority concentrated between 8 and 9 eV. This distribution closely aligns with that of the training samples, indicating UVGPT’s success in learning the QSAR from molecular data and facilitating molecular design.
To refine the assessment of molecule excitation energies via precise TDDFT quantum chemical calculations, we conducted additional screening of the molecules. Given that drug delivery molecules, while not directly engaged in target binding at the lesion site, still exert direct effects on the human body, drug-like characteristics are equally crucial for the generated molecules. The parameter-tuned UVGPT inherited the pre-trained model’s performance on PubChem, and the distribution calculated using the QED method is illustrated in Figure 2b.
Based on the QED ranking, we identified the nine molecules with the highest degree of drug-like properties, and their corresponding SMILES representations and QED values are shown in Figure 3. These nine molecules exhibited QED values ranging from 0.528 to 0.57. Notably, the molecule with the SMILES notation OC1CC1OC(C)C demonstrated the highest degree of drug-like properties and was absent from the PubChem database. Additionally, the compound (1S,2S)-2-methylcyclopropan-1-ol in PubChem shared similarities with OC1CC1OC(C)C. Both findings further contribute to the evidence demonstrating the effectiveness of UVGPT.
Similarly, based on the density distribution of the SAscore in Figure 2c, we identified eight recommended molecules after filtering out mediocre results. The SAscore values of these eight molecules ranged from 5.47 to 5.92. Notably, the molecule with the SMILES notation C=CC=NSN=C, a conjugated alkene, had an SAscore of 5.537. The molecule with the SMILES notation CC(C(Cl)N)S=N, classified here as a cumulene, had the same SAscore of 5.537. These molecules are shown in Figure 4.

3.3. Quantum Chemical Calculations

To enhance our understanding and validate the outcomes generated by UVGPT, we conducted quantum chemical simulations to compute the physicochemical properties of the screened molecules, focusing primarily on their excitation energies. Additionally, we assessed the rationality and stability of the molecular structures within the chemical context.
To validate the excited-state properties, vertical excitation energies were calculated using the DFT and TDDFT settings described in Section 2. As shown in Table 1, nearly all (16/17) generated molecules contained heteroatoms, including oxygen, sulfur, chlorine, and especially nitrogen, suggesting potential for biological applications. Six molecules were saturated, with a first excitation energy greater than 6.199 eV (corresponding to a putative maximum absorption wavelength below 200 nm). Seven molecules exhibited a maximum absorption wavelength between 200 nm and 400 nm. The remaining five molecules absorbed in the visible and near-infrared bands. Improving the generalization of the predictive model could increase the efficiency of our workflow.
Moreover, as discussed earlier, the molecule with the SMILES notation OC1CC1OC(C)C is of interest for its drug-likeness. However, it absorbs in the far-ultraviolet region and is therefore unlikely to be useful in drug delivery applications, and we did not find useful analogs in the databases. We therefore propose that modifying the isopropyl group by introducing groups that contain double bonds (e.g., aldehyde or nitro groups) could decrease the excitation energy, potentially making the molecule suitable as a photoresponsive molecule for specialized drug applications.
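The correspondence between excitation energy and absorption wavelength used here follows from λ = hc/E, with hc ≈ 1239.84 eV·nm. The sketch below converts between the two and assigns a band using the wavelength ranges quoted in this article; the function names and band labels are illustrative.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV·nm


def ev_to_nm(energy_ev):
    """Wavelength in nm of a photon with the given energy in eV."""
    return HC_EV_NM / energy_ev


def nm_to_ev(wavelength_nm):
    """Photon energy in eV for the given wavelength in nm."""
    return HC_EV_NM / wavelength_nm


def spectral_band(wavelength_nm):
    """Assign a band using the ranges quoted in the text."""
    if wavelength_nm < 200:
        return "far-UV (< 200 nm)"
    if wavelength_nm <= 400:
        return "UV (200-400 nm)"
    if wavelength_nm <= 750:
        return "visible (400-750 nm)"
    return "near-infrared (> 750 nm)"


print(f"{nm_to_ev(200):.3f} eV")   # threshold quoted for 200 nm
print(f"{ev_to_nm(4.13):.1f} nm")  # lower end of the training energy range
print(spectral_band(ev_to_nm(7.0)))
```

The same relation shows that the 4.13 to 12.41 eV range of the training data corresponds to wavelengths of roughly 300 nm down to 100 nm.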
It is widely recognized that conjugated alkenes are more stable than cumulenes. The cumulene molecule generated by UVGPT does not conform to an optimal structure within the bounds of established chemical knowledge. For example, the molecule with the SMILES notation CC1NC(C)=C=C1 (cumulene) lies several kcal/mol higher in Gibbs energy compared to its isomer. Addressing this issue requires incorporating additional chemists’ intuition and improving the quality of the datasets utilized.
Additionally, the conjugated alkene with the SMILES notation C=CC=NSN=C exhibits ultraviolet absorption properties. Its ability to undergo photochemical reactions upon UV light exposure renders it potentially useful in drug delivery applications.

3.4. Fine-Tuning with Human Feedback

By combining the results of our quantum chemical simulations shown in Table 1 with the molecular structures shown in Figure 3 and Figure 4, we have leveraged our chemical knowledge to establish three structural criteria for identifying molecules with longer excitation wavelengths:
  • The molecule has a polyatomic ring structure.
  • The atomic ring structure contains more than one unsaturated chemical bond.
  • The ring structure includes non-carbon atoms, such as nitrogen (N), sulfur (S), oxygen (O), etc.
We filtered the molecular data according to these three criteria, resulting in 121 molecules that meet a weakened criterion and 251 molecules that do not meet the criteria at all. The weakened criterion means satisfying at least two of the three conditions.
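The labeling rule can be written down compactly once the three structural conditions have been evaluated for a molecule, for example with a cheminformatics toolkit. In the sketch below the three inputs are plain booleans, and molecules satisfying exactly one condition are left unlabeled; that last choice is an assumption, since the text does not say how such molecules were handled, and the function name is illustrative.

```python
def label_molecule(has_ring, ring_has_multiple_unsaturated_bonds, ring_has_heteroatom):
    """Return True (weakened criterion met), False (no condition met), or None."""
    satisfied = sum([has_ring, ring_has_multiple_unsaturated_bonds, ring_has_heteroatom])
    if satisfied >= 2:
        return True   # at least two of the three conditions hold
    if satisfied == 0:
        return False  # none of the conditions hold
    return None       # exactly one condition holds: left unlabeled (assumption)


print(label_molecule(True, True, False))    # ring with several unsaturated bonds
print(label_molecule(False, False, False))  # acyclic molecule
print(label_molecule(True, False, False))   # ring only
```

With the counts reported above, 121 molecules labeled True and 251 labeled False give 372 labeled examples in total.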
As shown in Figure 5, KTO fine-tuning relies on a binary signal dataset labeled according to human experience. We labeled the 121 molecules that meet the weakened criterion as ‘True’ and the 251 molecules that meet none of the criteria as ‘False’, giving a binary signal dataset of 372 entries. The DPO and CPO algorithms instead rely on a preference dataset, which pairs the 121 recommended molecules with a set of rejected molecules. As noted by the authors of the KTO algorithm, the training set for KTO is much easier to obtain under these conditions, and our experience was consistent with this.
We further introduced a low-rank adapter on top of the pre-trained GPT-2 to implement the fine-tuned model, aiming to align the generated molecules more closely with chemical experience and intuition. The structure of the fine-tuned model is shown in Figure 5; only the adapter parameters were updated within the RLHF framework. The trainable parameters were located mainly in the attention layers of GPT-2. We fine-tuned the GPT-2-based model using the KTO, DPO, and CPO trainers in turn and then counted the number of generated molecules that satisfied the weakened criterion described above, with the results shown in Table 2. CPO produced a higher total number and proportion of valid molecules than the other trainers. Although the binary signal dataset for the KTO algorithm is more readily available, the quality of its generated content was significantly inferior to that produced by the other trainers in our scenario. Nonetheless, we recognize the potential of all three trainers, along with additional RLHF algorithms, for developing more effective language models for our task.

4. Future Work

Previous studies have highlighted the limitations of UV-responsive materials for in vivo applications, citing concerns such as potential cellular photodamage and inadequate tissue penetration [10]. In subsequent investigations, researchers have turned their attention to molecular materials that respond to visible and near-infrared light [11,46]. However, LLM-based development of such molecules remains hindered by the absence of high-quality datasets. We aim to tackle this challenge to enhance the applicability and credibility of our workflow for clinical applications.
The remarkable success of the generative pre-trained transformer in various applications has garnered significant attention from both academia and industry. Communities like AdapterHub and Hugging Face have amassed numerous open-source pre-trained transformer structures, facilitating the design of light-responsive molecules for drug delivery systems. This diversity of molecular generation tools and content enhances the applicability of these technologies. Moreover, these advancements can seamlessly integrate into the workflows discussed in this paper.
While our current models are proficient in inheriting properties from training datasets, they have yet to reach a level of innovative insight that could rival human chemists. There is a need to further explore methods for refining the model parameters using additional knowledge and more effective techniques. Reinforcement learning from human feedback (RLHF) methods provide an exciting opportunity to incorporate more theoretical chemical knowledge into generating high-quality molecular content. To achieve this goal, we need to invest further efforts into designing the entire algorithmic framework of the generative pre-trained transformer (GPT) from scratch.

Author Contributions

Conceptualization, J.H. and G.Y.; Methodology, J.H., P.W., S.W., B.W. and G.Y.; Software, J.H. and P.W.; Validation, J.H.; Formal analysis, J.H. and P.W.; Investigation, J.H.; Data curation, J.H. and P.W.; Writing – original draft, J.H.; Writing – review & editing, J.H. and G.Y.; Visualization, J.H.; Project administration, J.H. and G.Y.; Funding acquisition, G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported in part by the ERC IMI (101005122), the H2020 (952172), the MRC (MC/PC/21013), the Royal Society (IEC/NSFC/211235), the NVIDIA Academic Hardware Grant Program, the SABER project supported by Boehringer Ingelheim Ltd., NIHR Imperial Biomedical Research Centre (RDA01), Wellcome Leap Dynamic Resilience, UKRI guarantee funding for Horizon Europe MSCA Postdoctoral Fellowships (EP/Z002206/1), and the UKRI Future Leaders Fellowship (MR/V023799/1).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data files and code are available at https://github.com/jhu22/Pharmaceutics2024.

Acknowledgments

Many thanks to Alexandre Tkatchenko and Leonardo Medrano Sandonas for their important help in understanding and using QM7b (http://quantum-machine.org). We extend our gratitude to Qi Li for collecting the literature on drug delivery.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Vargason, A.M.; Anselmo, A.C.; Mitragotri, S. The evolution of commercial drug delivery technologies. Nat. Biomed. Eng. 2021, 5, 1–17.
  2. Hassanzadeh, P.; Atyabi, F.; Dinarvand, R. The significance of artificial intelligence in drug delivery system design. Adv. Drug Deliv. Rev. 2019, 151–152, 169–190.
  3. Meenakshi, D.U.; Nakumar, S.; Francis, A.P.; Sweety, P.; Fuloria, S.; Fuloria, N.K.; Subramaniyan, V.; Khan, S.A. Deep Learning and Site-Specific Drug Delivery; Wiley: Hoboken, NJ, USA, 2022; pp. 1–38.
  4. Vora, L.K.; Gholap, A.D.; Jetha, K.; Thakur, R.R.S.; Solanki, H.K.; Chavda, V.P. Artificial Intelligence in Pharmaceutical Technology and Drug Delivery Design. Pharmaceutics 2023, 15, 1916.
  5. Tao, Y.; Chan, H.F.; Shi, B.; Li, M.; Leong, K.W. Light: A Magical Tool for Controlled Drug Delivery. Adv. Funct. Mater. 2020, 30, 2005029.
  6. Liu, D.; Yang, F.; Xiong, F.; Gu, N. The Smart Drug Delivery System and Its Clinical Potential. Theranostics 2016, 6, 1306–1323.
  7. Lan, G.; Ni, K.; Lin, W. Nanoscale metal–organic frameworks for phototherapy of cancer. Coord. Chem. Rev. 2019, 379, 65–81.
  8. Bouchaala, R.; Anton, N.; Anton, H.; Vamme, T.; Vermot, J.; Smail, D.; Mély, Y.; Klymchenko, A.S. Light-triggered release from dye-loaded fluorescent lipid nanocarriers in vitro and in vivo. Colloids Surfaces B Biointerfaces 2017, 156, 414–421.
  9. Son, J.; Yi, G.; Yoo, J.; Park, C.; Koo, H.; Choi, H.S. Light-responsive nanomedicine for biophotonic imaging and targeted therapy. Adv. Drug Deliv. Rev. 2019, 138, 133–147.
  10. Jia, S.; Fong, W.-K.; Graham, B.; Boyd, B.J. Photoswitchable Molecules in Long-Wavelength Light-Responsive Drug Delivery: From Molecular Design to Applications. Chem. Mater. 2018, 30, 2873–2887.
  11. Cho, H.J.; Chung, M.; Shim, M.S. Engineered photo-responsive materials for near-infrared-triggered drug delivery. J. Ind. Eng. Chem. 2015, 31, 15–25.
  12. Liu, J.; Kang, W.; Wang, W. Photocleavage-based Photoresponsive Drug Delivery. Photochem. Photobiol. 2021, 98, 288–302.
  13. Barhoumi, A.; Liu, Q.; Kohane, D.S. Ultraviolet light-mediated drug delivery: Principles, applications, and challenges. J. Control. Release 2015, 219, 31–42.
  14. Weissleder, R. A clearer vision for in vivo imaging. Nat. Biotechnol. 2001, 19, 316–317.
  15. Bagheri, A.; Arandiyan, H.; Boyer, C.; Lim, M. Lanthanide-Doped Upconversion Nanoparticles: Emerging Intelligent Light-Activated Drug Delivery Systems. Adv. Sci. 2016, 3, 1500437.
  16. Karimi, M.; Zangabad, P.S.; Baghaee-Ravari, S.; Ghazadeh, M.; Mirshekari, H.; Hamblin, M.R. Smart Nanostructures for Cargo Delivery: Uncaging and Activating by Light. J. Am. Chem. Soc. 2017, 139, 4584–4610.
  17. Linsley, C.S.; Wu, B.M. Recent advances in light-responsive on-demand drug-delivery systems. Ther. Deliv. 2017, 8, 89–107.
  18. Gao, J.; Karp, J.M.; Langer, R.; Joshi, N. The Future of Drug Delivery. Chem. Mater. 2023, 35, 359–363.
  19. Harrison, P.J.; Wiesler, H.; Sabirsh, A.; Karlsson, J.; Malmsjö, V.; Hellander, A.; Wählby, C.; Spjuth, O. Deep-learning models for lipid nanoparticle-based drug delivery. Nanomedicine 2021, 16, 1097–1110.
  20. Schwaller, P.; Laino, T.; Gaudin, T.; Bolgar, P.; Hunter, C.A.; Bekas, C.; Lee, A.A. Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction. ACS Cent. Sci. 2019, 5, 1572–1583.
  21. Chithrananda, S.; Grand, G.; Ramsundar, B. ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction. arXiv 2020, arXiv:2010.09885.
  22. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. 2018. Available online: https://www.mikecaptain.com/resources/pdf/GPT-1.pdf (accessed on 1 July 2024).
  23. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models are Unsupervised Multitask Learners. OpenAI Blog 2019, 1, 9.
  24. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901.
  25. Gupta, A.; Müller, A.T.; Huisman, B.J.H.; Fuchs, J.A.; Schneider, P.; Schneider, G. Generative Recurrent Networks for De Novo Drug Design. Mol. Inform. 2017, 37, 1700111.
  26. Adilov, S. Generative Pre-Training from Molecules; Cambridge Engage Preprints: Cambridge, UK, 2021; Volume 16, Available online: https://chemrxiv.org/engage/chemrxiv/article-details/6142f60742198e8c31782e9e (accessed on 25 June 2024).
  27. Haroon, S.; Hafsath, C.A.; Jereesh, A.S. Generative Pre-trained Transformer (GPT) based model with relative attention for de novo drug design. Comput. Biol. Chem. 2023, 106, 107911. [Google Scholar] [CrossRef] [PubMed]
  28. Jablonka, K.M.; Schwaller, P.; Ortega-Guerrero, A.; Smit, B. Is GPT-3 All You Need for Low-Data Discovery in Chemistry; Cambridge Engage Preprints: Cambridge, UK, 2023; Volume 14. [Google Scholar]
  29. Rafailov, R.; Sharma, A.; Mitchell, E.; Manning, C.D.; Ermon, S.; Finn, C. Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Adv. Neural Inf. Process. Syst. 2023, 36, 53728–53741. [Google Scholar]
  30. Xu, H.; Sharaf, A.; Chen, Y.; Tan, W.; Shen, L.; Van Durme, B.; Murray, K.; Kim, Y.J. Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation. arXiv 2024, arXiv:2401.08417. [Google Scholar]
  31. Ethayarajh, K.; Xu, W.; Muennighoff, N.; Jurafsky, D.; Kiela, D. KTO: Model Alignment as Prospect Theoretic Optimization. arXiv 2024, arXiv:2402.01306. [Google Scholar]
  32. Rupp, M.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O.A. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Phys. Rev. Lett. 2012, 108, 058301. [Google Scholar] [CrossRef] [PubMed]
  33. Montavon, G.; Rupp, M.; Gobre, V.; Vazquez-Mayagoitia, A.; Hansen, K.; Tkatchenko, A.; Müller, K.R.; Von Lilienfeld, O.A. Machine learning of molecular electronic properties in chemical compound space. New J. Phys. 2013, 15, 095003. [Google Scholar] [CrossRef]
  34. Blum, L.C.; Reymond, J.-L. 970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13. J. Am. Chem. Soc. 2009, 131, 8732–8733. [Google Scholar] [CrossRef]
  35. Bickerton, G.R.; Paolini, G.V.; Besnard, J.; Muresan, S.; Hopkins, A.L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012, 4, 90–98. [Google Scholar] [CrossRef] [PubMed]
  36. Ertl, P.; Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform. 2009, 1, 8. [Google Scholar] [CrossRef] [PubMed]
  37. Neese, F. The ORCA program system. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2011, 2, 73–78. [Google Scholar] [CrossRef]
  38. Adamo, C.; Barone, V. Toward reliable density functional methods without adjustable parameters: The PBE0 model. J. Chem. Phys. 1999, 110, 6158–6170. [Google Scholar] [CrossRef]
  39. Weigend, F.; Ahlrichs, R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy. Phys. Chem. Chem. Phys. 2005, 7, 3297. [Google Scholar] [CrossRef] [PubMed]
  40. Marenich, A.V.; Cramer, C.J.; Truhlar, D.G. Universal Solvation Model Based on Solute Electron Density and on a Continuum Model of the Solvent Defined by the Bulk Dielectric Constant and Atomic Surface Tensions. J. Phys. Chem. B 2009, 113, 6378–6396. [Google Scholar] [CrossRef] [PubMed]
  41. Skyner, R.E.; McDonagh, J.L.; Groom, C.R.; van Mourik, T.; Mitchell, J.B.O. A review of methods for the calculation of solution free energies and the modelling of systems in solution. Phys. Chem. Chem. Phys. 2015, 17, 6174–6191. [Google Scholar] [CrossRef] [PubMed]
  42. Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010, 132, 154104. [Google Scholar] [CrossRef] [PubMed]
  43. Grimme, S. Semiempirical GGA-type density functional constructed with a long-range dispersion correction. J. Comput. Chem. 2006, 27, 1787–1799. [Google Scholar] [CrossRef]
  44. Grimme, S.; Ehrlich, S.; Goerigk, L. Effect of the damping function in dispersion corrected density functional theory. J. Comput. Chem. 2011, 32, 1456–1465. [Google Scholar] [CrossRef]
  45. Anstine, D.M.; Isayev, O. Generative Models as an Emerging Paradigm in the Chemical Sciences. J. Am. Chem. Soc. 2023, 145, 8736–8750. [Google Scholar] [CrossRef] [PubMed]
  46. Olejniczak, J.; Carling, C.-J.; Almutairi, A. Photocontrolled release using one-photon absorption of visible or NIR light. J. Control. Release 2015, 219, 18–30. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The workflow for generating UV light-responsive molecules. This workflow includes an LLM model based on GPT-2, a SMILES tokenizer, pre-training on PubChem datasets, fine-tuning on UV molecule datasets, and a screening model incorporating the Coulomb matrix descriptor. Molecules excited by ultraviolet light have the potential to become stimuli-responsive materials in drug delivery systems. The dark-green areas represent the pre-trained transformer. The indigo areas represent the pre-training of GPT-2 combined with the SMILES tokenizer on the PubChem dataset. The orange-yellow areas indicate that a new adapter was fine-tuned using the ultraviolet light-excited molecule dataset on the pre-trained GPT-2. The orange-red areas show the prediction model, trained on the QM7b dataset and Coulomb matrix features, which predicts the properties of molecules generated by the fine-tuned GPT-2.
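The Coulomb matrix descriptor named in the caption can be illustrated with a minimal sketch. It assumes the standard definition from Rupp et al. (ref. 32), with diagonal entries 0.5·Z_i^2.4 and off-diagonal entries Z_i·Z_j/|R_i − R_j| in atomic units. The function name and the toy H2 geometry are hypothetical and are not taken from the authors' code.

```python
import math


def coulomb_matrix(charges, coords):
    """Build the Coulomb matrix as a list of rows.

    charges: nuclear charges Z_i.
    coords:  (x, y, z) nuclear positions, assumed to be in bohr.
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: fitted approximation to the free-atom energy.
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: Coulomb repulsion between nuclei i and j.
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Hypothetical toy input: H2 with the two nuclei 1.4 bohr apart.
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
```

The matrix is symmetric and invariant to translation and rotation of the molecule. It is not invariant to atom ordering, so a fixed ordering or a sorted variant is typically needed before the matrix is passed to a regression model.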
Figure 2. The excitation energy (a), drug-likeness (b), and SAscore (c) of the generated molecules. These molecules were generated by GPT-2, fine-tuned on the ultraviolet light-excited molecule dataset. The excitation energy data in (a) are predictions made by an SVM model. (b) shows the distribution of the drug-likeness scores of the molecules, obtained from DeepChem’s calculation of the quantitative estimate of drug-likeness (QED) values. (c) presents the synthetic accessibility scores of the molecules, calculated using RDKit, with higher values indicating that the molecules are easier to synthesize.
Figure 3. The QED values and SMILES representations of nine selected molecules generated by UVGPT.
Figure 4. The SAscore values and SMILES representations of eight selected molecules generated by UVGPT.
Figure 5. Instruction tuning with chemical datasets. The details of the pre-trained GPT-2 model are provided, including the implementation process of RLHF, which involves KTO, DPO, and CPO. The characteristics of the binary signal dataset and the preference dataset are also shown. In the training and fine-tuning of the GPT-2 PEFT model, only the parameters in the orange-yellow area were trained, while the parameters of the other layers retained their pre-trained values. The KTO algorithm used the binary signal dataset, while both DPO and CPO used the preference dataset. Refer to the Methods section for descriptions of the binary signal and preference datasets. In the binary signal dataset, experts label each molecule as either True or False. In the preference dataset, experts mark suitable molecules as recommended and unreasonable molecules as restricted.
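The two dataset shapes described in the caption can be sketched as plain Python records. This is an illustration under stated assumptions, not the authors' pipeline: the field names (`prompt`, `completion`, `label`, `chosen`, `rejected`) follow a convention common in preference-tuning tooling, the prompt string is hypothetical, and the expert labels attached to the example SMILES are invented for demonstration only. Pairing every recommended molecule with every restricted one is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class ExpertReview:
    smiles: str
    recommended: bool  # True = recommended, False = restricted


def to_binary_signal(prompt, reviews):
    """One record per molecule with a True/False label (KTO-style input)."""
    return [
        {"prompt": prompt, "completion": r.smiles, "label": r.recommended}
        for r in reviews
    ]


def to_preference_pairs(prompt, reviews):
    """Pair each recommended molecule with each restricted one (DPO/CPO-style input)."""
    chosen = [r.smiles for r in reviews if r.recommended]
    rejected = [r.smiles for r in reviews if not r.recommended]
    return [
        {"prompt": prompt, "chosen": c, "rejected": x}
        for c in chosen
        for x in rejected
    ]


# Hypothetical prompt and illustrative labels; not the authors' expert judgments.
PROMPT = "Generate a UV-responsive molecule:"
reviews = [
    ExpertReview("CCC(N)CC", True),
    ExpertReview("CNC=CCC", True),
    ExpertReview("C1=C=C=N1", False),
]
binary_records = to_binary_signal(PROMPT, reviews)
preference_pairs = to_preference_pairs(PROMPT, reviews)
```

The binary form needs only an independent judgment per molecule, whereas the preference form requires at least one recommended and one restricted molecule for the same prompt.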
Table 1. Molecular structures and first excitation energies converted to wavelengths in the gas phase, water, and organic solvents.

| Molecular SMILES | Gas Phase (nm) | Water (nm) | Organic Solvents (nm) |
|---|---|---|---|
| CCC(N)CC | 190 | 178 | 181 |
| OC1CC1OC(C)C | 191 | 177 | 180 |
| CCC(N)C(C)C | 193 | 182 | 185 |
| CC(CN)C(C) | 193 | 181 | 184 |
| CC(C)C(C)CN | 197 | 187 | 190 |
| CCC(N)CCO | 199 | 181 | 185 |
| COCC1CCN1 | 201 | 189 | 192 |
| CNC=CCC | 231 | 219 | 222 |
| CC(C(Cl)N)S=N | 226 | 305 | 218 |
| C1=C=C=N1 | 235 | 235 | 235 |
| NCCCC1SC1 | 251 | 246 | 247 |
| C=CC=NSN=C | 286 | 288 | 291 |
| C=C=C=NN=C | 468 | 445 | 451 |
| CC1CC=C=C1 | 492 | 696 | 605 |
| CC1C=C=NC1 | 631 | 644 | 648 |
| CC1NC(C)=C=C1 | 733 | 481 | 529 |
| CC1=NC=C=N1 | 749 | 736 | 739 |
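The conversion between excitation energy and wavelength used for Table 1 follows λ = hc/E. A minimal sketch, assuming energies in eV and wavelengths in nm (hc ≈ 1239.84 eV·nm), is shown here; the function names are hypothetical, and the sample values are gas-phase entries from Table 1.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV·nm


def ev_to_nm(energy_ev):
    """Convert an excitation energy in eV to a wavelength in nm."""
    if energy_ev <= 0:
        raise ValueError("excitation energy must be positive")
    return HC_EV_NM / energy_ev


def nm_to_ev(wavelength_nm):
    """Convert a wavelength in nm back to an excitation energy in eV."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


# Gas-phase wavelengths (nm) for three molecules listed in Table 1.
gas_phase_nm = {"CCC(N)CC": 190, "CNC=CCC": 231, "CC1=NC=C=N1": 749}
gas_phase_ev = {smiles: nm_to_ev(nm) for smiles, nm in gas_phase_nm.items()}
```

Because the relation is reciprocal, a fixed error in the computed excitation energy corresponds to a larger wavelength error at long wavelengths than at short ones.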
Table 2. Analysis of the generated content from the RLHF fine-tuned model.

| RLHF Trainer | Number of Molecules with At Least Two Judgments | Ratio of Two Judgments |
|---|---|---|
| KTO | 16 | 21.05% |
| DPO | 124 | 35.13% |
| CPO | 131 | 43.96% |

Share and Cite

MDPI and ACS Style

Hu, J.; Wu, P.; Wang, S.; Wang, B.; Yang, G. A Human Feedback Strategy for Photoresponsive Molecules in Drug Delivery: Utilizing GPT-2 and Time-Dependent Density Functional Theory Calculations. Pharmaceutics 2024, 16, 1014. https://doi.org/10.3390/pharmaceutics16081014


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
