Machine Learning Methods to Improve Crystallization through the Prediction of Solute–Solvent Interactions
Abstract
1. Introduction
2. Machine Learning
2.1. Supervised Learning
2.2. Unsupervised Learning
2.3. Semi-Supervised Learning
2.4. Reinforcement Learning
2.5. Ensemble Methods
3. Solute–Solvent Interactions
3.1. Introduction to Solute–Solvent Interactions
3.2. Solute–Solvent Interactions’ Effect on Crystallization Conditions
3.3. Enthalpy
3.4. Entropy
3.5. Gibbs Free Energy
4. Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Doxsee, K.M.; Francis, P.E. Crystallization of Ammonium Nitrate from Nonaqueous Solvents. Ind. Eng. Chem. Res. 2000, 39, 3493–3498.
2. Behrens, M.A.H.; Lacmann, R.; Schröder, W. Crystallization of potassium chloride: The additives ZnCl2, Na6 [(PO3)6] and K4 [Fe(CN)6]. Chem. Eng. Technol. 1995, 18, 295–301.
3. Wang, Y.; Qiu, L.P.; Hu, M.F. Magnesium ammonium phosphate crystallization: A possible way for recovery of phosphorus from wastewater. IOP Conf. Ser. Mater. Sci. Eng. 2018, 392, 032032.
4. Singh, M.K. Simulating Growth Morphology of Urea Crystals from Vapour and Aqueous Solution. CrystEngComm 2015, 17, 7731–7744.
5. Kim, S.; Wei, C.; Kiang, S. Crystallization Process Development of an Active Pharmaceutical Ingredient and Particle Engineering via the Use of Ultrasonics and Temperature Cycling. Org. Process. Res. Dev. 2003, 7, 997–1001.
6. Tung, H.H.; Paul, E.L.; Midler, M.; McCauley, J.A. Crystallization of Organic Compounds: An Industrial Perspective; Wiley: Hoboken, NJ, USA, 2024.
7. Nebol’sin, V.A.; Swaikat, N. About Some Fundamental Aspects of the Growth Mechanism Vapor-Liquid-Solid Nanowires. J. Nanotechnol. 2023, 2023, e7906045.
8. Zhang, L.; Yang, M.; Zhang, S.; Niu, H. Unveiling the Crystallization Mechanism of Cadmium Selenide via Molecular Dynamics Simulation with Machine-Learning-Based Deep Potential. J. Mater. Sci. Technol. 2024, 185, 23–31.
9. Ma, Y.; Svärd, M.; Xiao, X.; Gardner, J.M.; Olsson, R.T.; Forsberg, K. Precipitation and Crystallization Used in the Production of Metal Salts for Li-Ion Battery Materials: A Review. Metals 2020, 10, 1609.
10. Lewis, A.; Seckler, M.; Kramer, H.; Van Rosmalen, G. Industrial Crystallization: Fundamentals and Applications; Cambridge University Press: Cambridge, UK, 2015.
11. Qi, A.; Zhang, L. Review of Computer-Aided Methods in Fat Crystallization Studies. J. Am. Oil Chem. Soc. 2024.
12. Karpiński, P.; Bałdyga, J. Batch Crystallization. In Handbook of Industrial Crystallization; Cambridge University Press: Cambridge, UK, 2019; pp. 346–379.
13. Varshosaz, J.; Ghassami, E.; Ahmadipour, S. Crystal Engineering for Enhanced Solubility and Bioavailability of Poorly Soluble Drugs. Curr. Pharm. Des. 2018, 24, 2473–2496.
14. Adapa, S.; Schmidt, K.A.; Jeon, I.J.; Herald, T.J.; Flores, R.A. Mechanisms of Ice Crystallization and Recrystallization in Ice Cream: A Review. Food Rev. Int. 2000, 16, 259–271.
15. Statista. United States: Sugar Production 2023/24. Available online: https://www.statista.com/statistics/249661/us-sugar-production/ (accessed on 11 February 2024).
16. Wu, G.; Yion, W.T.G.; Dang, K.L.N.Q.; Wu, Z. Physics-informed machine learning for MPC: Application to a batch crystallization process. Chem. Eng. Res. Des. 2023, 192, 556–569.
17. Heist, J.A.; Hunt, K.M. Material Recycling and Waste Minimization by Freeze Crystallization. Final Technical Report, August 1993–April 1994 (Technical Report). OSTI.GOV. Available online: https://www.osti.gov/biblio/189745 (accessed on 1 May 1995).
18. Das, P.; Dutta, S.; Singh, K.; Maity, S. Energy saving integrated membrane crystallization: A sustainable technology solution. Sep. Purif. Technol. 2019, 228, 115722.
19. Liu, Q.; Wu, Y. Supervised Learning. In Encyclopedia of the Sciences of Learning; Springer: Boston, MA, USA, 2012.
20. Renuka, D.K.; Hamsapriya, T.; Chakkaravarthi, M.R.; Surya, P.L. Spam Classification Based on Supervised Learning Using Machine Learning Techniques. In Proceedings of the 2011 International Conference on Process Automation, Control and Computing (PACC), Coimbatore, India, 20–22 July 2011; pp. 1–7.
21. Domala, V.; Kim, T.-W. A Univariate and Multivariate Machine Learning Approach for Prediction of Significant Wave Height. In Proceedings of the OCEANS 2022, Hampton Roads, VA, USA, 17–20 October 2022; Available online: https://ieeexplore.ieee.org/document/9977028 (accessed on 11 February 2024).
22. Hodson, T.O.; Over, T.M.; Foks, S.S. Mean Squared Error, Deconstructed. J. Adv. Model. Earth Syst. 2021, 13, e2021MS002681. Available online: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2021MS002681 (accessed on 11 February 2024).
23. Schneider, A.; Hommel, G.; Blettner, M. Linear Regression Analysis. Dtsch. Arztebl. Int. 2010, 107, 776–782.
24. Evgeniou, T.; Pontil, M. Support Vector Machines: Theory and Applications. In Advanced Course on Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2001; pp. 249–257.
25. Song, Y.-Y.; Lu, Y. Decision tree methods: Applications for classification and prediction. Shanghai Arch. Psychiatry 2015, 27, 130–135.
26. Naeem, S.; Ali, A.; Anam, S.; Ahmed, M.M. An Unsupervised Machine Learning Algorithms: Comprehensive Review. Int. J. Comput. Digit. Syst. 2023, 13, 911–921.
27. Caron, M.; Bojanowski, P.; Joulin, A.; Douze, M. Deep Clustering for Unsupervised Learning of Visual Features. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
28. Roman, V. Unsupervised Learning: Dimensionality Reduction. Available online: https://towardsdatascience.com/unsupervised-learning-dimensionality-reduction-ddb4d55e0757 (accessed on 17 April 2021).
29. Sinaga, K.P.; Yang, M.-S. Unsupervised K-Means Clustering Algorithm. IEEE Access 2020, 8, 80716–80727.
30. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2016, 374, 20150202.
31. Reddy, Y.C.A.P.; Viswanath, P.; Reddy, B.E. Semi-supervised learning: A brief review. Int. J. Eng. Technol. 2018, 7, 81–85.
32. Lee, D.-H. Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks. In Workshop: Challenges in Representation Learning (WREPL); ICML: Honolulu, HI, USA, 2013.
33. Iscen, A.; Tolias, G.; Avrithis, Y.; Chum, O. Label Propagation for Deep Semi-Supervised Learning. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 5065–5074.
34. Sivamayil, K.; Rajasekar, E.; Aljafari, B.; Nikolovski, S.; Vairavasundaram, S.; Vairavasundaram, I. A Systematic Study on Reinforcement Learning Based Applications. Energies 2023, 16, 1512. Available online: https://www.mdpi.com/1996-1073/16/3/1512 (accessed on 11 February 2024).
35. Nancy, J. Basics—Reinforcement Learning. Analytics Vidhya (Blog). Available online: https://medium.com/analytics-vidhya/basics-reinforcement-learning-66aae5da4c85 (accessed on 5 May 2020).
36. Watkins, C.J.C.H.; Dayan, P. Q-learning. Mach. Learn. 1992, 8, 279–292.
37. Varga, B.; Kulcsár, B.; Chehreghani, M.H. Deep Q-learning: A robust control approach. Int. J. Robust Nonlinear Control. 2022, 33, 526–544.
38. Dietterich, T.G. Ensemble Methods in Machine Learning. In Multiple Classifier Systems, Proceedings of the First International Workshop on Multiple Classifier Systems, Cagliari, Italy, 9–11 June 2000; Springer: Berlin/Heidelberg, Germany, 2000; pp. 1–15.
39. Freund, Y.; Schapire, R.E. A Short Introduction to Boosting. J. Jpn. Soc. Artif. Intell. 1999, 14, 1612.
40. Behera, A.L.; Su, S.; Sachinkumar, P. Enhancement of Solubility: A Pharmaceutical Overview. Pharm. Lett. 2010, 2, 310–318.
41. Ferraz-Caetano, J.; Teixeira, F.; Cordeiro, M.N.D.S. Explainable Supervised Machine Learning Model To Predict Solvation Gibbs Energy. J. Chem. Inf. Model. 2023, 64, 2250–2262.
42. Alibakhshi, A.; Hartke, B. Improved prediction of solvation free energies by machine-learning polarizable continuum solvation model. Nat. Commun. 2021, 12, 3584.
43. Karunanithi, A.; Luke, A. Chapter 4 Solvent Design for Crystallization of Pharmaceutical Products. Comput. Aided Chem. Eng. 2007, 23, 115–147.
44. Khamar, D. Investigating the Role of Solvent–Solute Interaction in Crystal Nucleation of Salicylic Acid from Organic Solvents. J. Am. Chem. Soc. 2014, 136, 11664–11673.
45. Keifer, D. Enthalpy and the Second Law of Thermodynamics. J. Chem. Educ. 2019, 96, 1407–1411.
46. Brissaud, J.-B. The meanings of entropy. Entropy 2005, 7, 68–96.
47. Chen, L.-Q. Chemical potential and Gibbs free energy. MRS Bull. 2019, 44, 520–523.
48. Chung, Y.; Vermeire, F.H.; Wu, H.; Walker, P.J.; Abraham, M.H.; Green, W.H. Group Contribution and Machine Learning Approaches to Predict Abraham Solute Parameters, Solvation Free Energy, and Solvation Enthalpy. J. Chem. Inf. Model. 2022, 62, 433–446.
49. Li, X.; Wang, N.; Huang, Y.; Xing, J.; Huang, X.; Ferguson, S.; Wang, T.; Zhou, L.; Hao, H. The role of solute conformation, solvent–solute and solute–solute interactions in crystal nucleation. AIChE J. 2023, 69, e18144.
50. Bund, R.K.; Hartel, R.W. Blends of delactosed permeate and pro-cream in ice cream: Effects on physical, textural and sensory attributes. Int. Dairy J. 2013, 31, 132–138.
51. Patel, D.G.D.; Benedict, J.B. Crystals in Materials Science. In Recent Advances in Crystallography; IntechOpen: London, UK, 2012.
52. Xiouras, C.; Cameli, F.; Quilló, G.L.; Kavousanakis, M.E.; Vlachos, D.G.; Stefanidis, G.D. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem. Rev. 2022, 122, 13006–13042.
53. Kovács, E.A.; Szilágyi, B. A synthetic machine learning framework for complex crystallization processes: The case study of the second-order asymmetric transformation of enantiomers. Chem. Eng. J. 2023, 465, 142800.
54. Kirman, J.; Johnston, A.; Kuntz, D.A.; Askerka, M.; Gao, Y.; Todorović, P.; Ma, D.; Privé, G.G.; Sargent, E.H. Machine-Learning-Accelerated Perovskite Crystallization. Matter 2020, 2, 938–947.
55. Meyer, C.; Arora, A.; Scholl, S. A method for the rapid creation of AI driven crystallization process controllers. Comput. Chem. Eng. 2024, 186, 108680.
56. Kolluri, S. Machine Learning and Artificial Intelligence in Pharmaceutical Research and Development: A Review-PMC. AAPS J. 2022, 24, 19. Available online: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8726514/ (accessed on 11 February 2024).
57. Zhong, X. Explainable Machine Learning in Materials Science. npj Comput. Mater. 2022, 8, 204. Available online: https://www.nature.com/articles/s41524-022-00884-7 (accessed on 11 February 2024).
58. Dobbelaere, M. Machine Learning in Chemical Engineering: Strengths, Weaknesses, Opportunities, and Threats. Engineering 2021, 7, 1201–1211. Available online: https://www.sciencedirect.com/science/article/pii/S2095809921002010 (accessed on 29 July 2021).
59. Zhu, J.-J.; Yang, M.; Ren, Z.J. Machine Learning in Environmental Research: Common Pitfalls and Best Practices. Environ. Sci. Technol. 2023, 57, 17671–17689.
60. Kwon, H.; Ali, Z.A.; Wong, B.M. Harnessing Semi-Supervised Machine Learning to Automatically Predict Bioactivities of Per- and Polyfluoroalkyl Substances (PFASs). Environ. Sci. Technol. Lett. 2022, 10, 1017–1022.
61. Cai, H.; Zhang, X.; Liu, X. Semi-Supervised End-To-End Contrastive Learning For Time Series Classification. arXiv 2023.
62. Ibarz, J.; Tan, J.; Finn, C.; Kalakrishnan, M.; Pastor, P.; Levine, S. How to train your robot with deep reinforcement learning: Lessons we have learned. Int. J. Robot. Res. 2021, 40, 698–721.
63. Nakabi, T.A.; Toivanen, P. Deep reinforcement learning for energy management in a microgrid with flexible demand. Sustain. Energy Grids Netw. 2020, 25, 100413.
64. Wu, H.; Levinson, D. The ensemble approach to forecasting: A review and synthesis. Transp. Res. Part C Emerg. Technol. 2021, 132, 103357.
65. Gneiting, T.; Raftery, A.E. Weather Forecasting with Ensemble Methods. Science 2005, 310, 248–249.
Type of Machine Learning | Overview | Past Uses | Applications |
---|---|---|---|
Supervised learning | In supervised learning, the algorithm is trained on a labeled dataset, which means each training example is paired with an output label. For predicting solute–solvent interactions, a dataset could be created where the features include various properties of the solute and solvent (e.g., polarity and molecular weight), and the label could be the type or strength of the interaction (e.g., hydrogen bonding or van der Waals forces). Algorithms like linear regression or support vector machines (SVMs) could be employed to predict these interactions. | Pharmaceuticals: supervised learning has been extensively used in drug discovery and development. Algorithms like SVMs have helped in predicting the bioactivity of compounds based on their chemical properties, which is analogous to predicting solute–solvent interactions [56]. Materials science: linear regression and other supervised techniques have been used to predict material properties, such as tensile strength or thermal conductivity, based on their molecular composition [57]. | Once the model is trained to predict solute–solvent interactions, it can be applied to optimize crystallization processes in various industries. For example, in pharmaceuticals, knowing how a particular solute interacts with potential solvents can help in creating crystals with desired properties like solubility and bioavailability. |
Unsupervised learning | Unsupervised learning algorithms work with datasets without labeled responses. Clustering techniques like K-means could be used to group different types of solute–solvent interactions based on their properties. | Chemical engineering: unsupervised learning, especially clustering, has been employed to understand complex chemical processes. For example, clustering has been used to categorize different catalysts based on their performance characteristics [58]. Environmental science: clustering techniques have been used to analyze and group pollutants in water sources, providing insights similar to those for solute–solvent interactions [59]. | The clusters can reveal hidden patterns in how different solutes and solvents interact, which can be invaluable for improving crystallization methods. For instance, solutes prone to forming undesired crystal structures might be clustered together, allowing for targeted optimization. |
Semi-supervised learning | In semi-supervised learning, the algorithm is trained on a dataset that contains both labeled and unlabeled data. This approach is particularly useful when acquiring a fully labeled dataset is expensive or time-consuming. | Biotechnology: semi-supervised learning has been used in genomic sequencing, where only a part of the genetic data is labeled. This parallels predicting solute–solvent interactions in cases where only partial information is available [60]. Sensor data analysis: in manufacturing, semi-supervised learning has been applied to sensor data where only some data points are labeled, aiding in predictive maintenance [61]. | A semi-supervised model could be trained on a partially labeled dataset to predict solute–solvent interactions. The model could then be used to predict the interactions in a crystallization process, allowing for adjustments in real time to achieve the desired crystal properties. |
Reinforcement learning | Reinforcement learning involves an agent that takes actions in an environment to achieve a goal. In the context of solute–solvent interactions, the agent could be programmed to find the optimal conditions for a desired type of interaction, receiving rewards based on the quality of the crystals formed. | Robotics: reinforcement learning has been pivotal in robotics for tasks like object manipulation and navigation, which require adaptive learning similar to optimizing crystallization conditions [62]. Energy management: in smart grids, reinforcement learning has been used to optimize energy distribution and consumption, a concept that can be applied to managing crystallization processes [63]. | Reinforcement learning could be used to dynamically adjust conditions like temperature, pressure, and concentration during the crystallization process to achieve optimal crystal properties, thereby saving costs and improving efficiency. |
Ensemble methods | Ensemble methods combine multiple algorithms to improve performance. For example, a random forest algorithm could be used to predict solute–solvent interactions by leveraging the strengths of multiple decision trees. | Financial forecasting: ensemble methods, particularly random forests, have been used in stock market prediction, dealing with complex patterns much like those in predicting solute–solvent interactions [64]. Weather prediction: the use of ensemble methods in weather forecasting models demonstrates their effectiveness in handling complex systems with many variables, akin to crystallization processes [65]. | Ensemble methods can provide more robust and accurate predictions, which is crucial for processes like crystallization, where small changes can have significant impacts. By accurately predicting solute–solvent interactions, the crystallization process can be optimized for better yield and quality. |
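
The supervised and unsupervised rows of the table can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' method: the feature values, the linear rule used to generate the labels, and the names PAIRS, STRENGTHS, fit_linear, predict, and kmeans are all hypothetical. The first part fits a linear regression from (solute polarity, solvent polarity) to an interaction strength; the second groups the same pairs with K-means, using caller-supplied starting centroids so the result is reproducible.

```python
# Toy sketch only: every number below is invented for illustration.
# Each point is (solute polarity, solvent polarity) on an arbitrary 0-1 scale.
PAIRS = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
# Synthetic "interaction strength" labels from an assumed rule y = 1 + 2*x1 + 3*x2,
# chosen so the fitted weights can be checked against known values.
STRENGTHS = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in PAIRS]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_linear(features, labels):
    """Supervised learning: ordinary least squares via the normal equations.

    Returns [intercept, w1, w2, ...].
    """
    rows = [[1.0, *f] for f in features]  # prepend a constant column for the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    """Predicted label for one feature tuple."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])))


def kmeans(points, seeds, iters=10):
    """Unsupervised learning: Lloyd's K-means, started from the given seed centroids.

    Returns the centroids and the cluster index assigned to each point
    in the final pass.
    """
    centroids = [list(s) for s in seeds]
    assignments = []
    for _ in range(iters):
        assignments = [nearest(p, centroids) for p in points]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


if __name__ == "__main__":
    weights = fit_linear(PAIRS, STRENGTHS)
    print("fitted weights:", [round(w, 3) for w in weights])
    centroids, groups = kmeans(PAIRS, seeds=[PAIRS[0], PAIRS[3]])
    print("cluster assignments:", groups)
```

With real data, the features would be measured or computed descriptors and the labels measured interaction strengths; a library implementation (for example, an SVM or random forest, as named in the table) would normally replace these hand-written routines.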
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kandaswamy, A.; Schwaminger, S.P. Machine Learning Methods to Improve Crystallization through the Prediction of Solute–Solvent Interactions. Crystals 2024, 14, 501. https://doi.org/10.3390/cryst14060501