Use of Machine Learning to Predict the Glycemic Status of Patients with Diabetes †
Abstract
1. Introduction
2. Machine Learning and Predictions
- Artificial Neural Network (ANN) with a payoff equal to 7;
- Polynomial Regression with a payoff equal to 7;
- Gradient Boosted Trees Regression with a payoff equal to 14;
- Random Forest Regression with a payoff equal to 16;
- Simple Regression Tree with a payoff equal to 18;
- Tree Ensemble Regression with a payoff equal to 23;
- Linear Regression with a payoff equal to 27;
- Probabilistic Neural Network (PNN) with a payoff equal to 32.
- Probabilistic Neural Network (PNN) with a payoff equal to 6;
- Simple Regression Tree with a payoff equal to 13;
- Gradient Boosted Trees Regression and Random Forest Regression with a payoff equal to 19;
- Linear Regression with a payoff equal to 27;
- Tree Ensemble Regression and Artificial Neural Network (ANN) with a payoff equal to 28;
- Polynomial Regression with a payoff equal to 40.
- Data: a single KNIME node that reads the data supplied in Excel format;
- Preprocessing: a group of three KNIME nodes. The first, “Column Filter”, selects the columns of interest for the prediction task. The second, “Normalizer”, compresses the data into the range from 0 to 1. The third, “Partitioning”, splits the data into two groups: 70% is used to train the neural network, while the remaining 30% is used for the actual prediction;
- Machine Learning and predictions: the central part of the data analysis process, consisting of two nodes. The first, “RProp MLP Learner”, trains the neural network; its hyperparameters can be modified according to analytical needs and, in the case analyzed, were left at their default values. The second KNIME node, “Multilayer Perceptron Predictor”, performs the actual prediction;
- Score: the final phase consists of the “Numeric Scorer” node, which evaluates the predictive efficiency of the neural network through both the R-squared and the statistical errors.
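The four-stage KNIME workflow above can be sketched in Python with scikit-learn. This is only an illustrative approximation under stated assumptions: the synthetic data stands in for the Excel sheet read by the “Data” node, the feature and target definitions are placeholders, and scikit-learn's `MLPRegressor` trains with Adam rather than RProp, but each step plays the same role as the corresponding node.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score, mean_squared_error

# "Data" node: synthetic stand-in for the Excel sheet (placeholder columns).
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 3))            # columns kept by "Column Filter"
y = X @ np.array([0.5, 0.3, 0.2])         # placeholder target variable

# "Normalizer" node: compress the data into the range from 0 to 1.
X = MinMaxScaler().fit_transform(X)

# "Partitioning" node: 70% for training, 30% for the actual prediction.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=0)

# "RProp MLP Learner" / "Multilayer Perceptron Predictor" nodes
# (MLPRegressor uses a different optimizer than RProp).
mlp = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
mlp.fit(X_tr, y_tr)
y_pred = mlp.predict(X_te)

# "Numeric Scorer" node: R-squared and a statistical error.
r2 = r2_score(y_te, y_pred)
mse = mean_squared_error(y_te, y_pred)
```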
3. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Massaro, A.; Maritati, V.; Savino, N.; Galiano, A.; Convertini, D.; De Fonte, E.; Di Muro, M. A Study of a health resources management platform integrating neural networks and DSS telemedicine for homecare assistance. Information 2018, 7, 176. [Google Scholar] [CrossRef]
- Massaro, A.; Maritati, V.; Savino, N.; Galiano, A. Neural networks for automated smart health platforms oriented on heart predictive diagnostic big data systems. In Proceedings of the 2018 AEIT International Annual Conference, Bari, Italy, 3–5 October 2018; pp. 1–5. [Google Scholar] [CrossRef]
- Massaro, A.; Maritati, V.; Giannone, D.; Convertini, D.; Galiano, A. LSTM DSS automatism and dataset optimization for diabetes prediction. Appl. Sci. 2019, 9, 3532. [Google Scholar] [CrossRef]
- Massaro, A. Electronics in Advanced Research Industries: Industry 4.0 to Industry 5.0 Advances; John Wiley & Sons: Hoboken, NJ, USA, 2021. [Google Scholar] [CrossRef]
- Denormalization of Predicted Data in Neural Networks. Available online: https://stackoverflow.com/questions/32888108/denormalization-of-predicted-data-in-neural-networks (accessed on 7 January 2022).
Algorithm | Mean Absolute Error (MAE) | Mean Squared Error (MSE) | Root Mean Squared Error (RMSE) | Mean Signed Difference (MSD) |
---|---|---|---|---|
Artificial Neural Network (ANN) | 0.199918 | 0.057661 | 0.240128 | 0.054425 |
Probabilistic Neural Network (PNN) | 0.275533 | 0.105013 | 0.324056 | 0.164210 |
Gradient Boosted Trees Regression | 0.238739 | 0.077760 | 0.278855 | 0.024218 |
Simple Regression Tree | 0.238739 | 0.077760 | 0.278855 | 0.024218 |
Random Forest Regression | 0.235623 | 0.076095 | 0.275853 | 0.073448 |
Tree Ensemble Regression | 0.241468 | 0.081222 | 0.284995 | 0.057196 |
Linear Regression | 0.242458 | 0.079036 | 0.281134 | 0.058953 |
Polynomial Regression | 0.211184 | 0.067966 | 0.260702 | 0.009944 |
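The error statistics reported in the table above can be reproduced from any vector of true values and predictions. The following is a minimal sketch assuming the usual definitions, with MSD taken as the mean of the signed differences (its sign reveals systematic over- or under-prediction); the sample values are purely illustrative.

```python
import numpy as np

def error_stats(y_true, y_pred):
    """MAE, MSE, RMSE, and Mean Signed Difference (MSD)."""
    d = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mae = np.mean(np.abs(d))   # Mean Absolute Error
    mse = np.mean(d ** 2)      # Mean Squared Error
    rmse = np.sqrt(mse)        # Root Mean Squared Error
    msd = np.mean(d)           # Mean Signed Difference
    return mae, mse, rmse, msd

# Illustrative values only:
mae, mse, rmse, msd = error_stats([1.0, 2.0, 3.0], [1.1, 1.8, 3.3])
```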
Algorithm | R² | Mean Absolute Error (MAE) | Mean Squared Error (MSE) | Root Mean Squared Error (RMSE) | Mean Signed Difference (MSD) |
---|---|---|---|---|---|
Artificial Neural Network (ANN) | 0.03793089 | 0.02464794 | 0.00382001 | 0.06180622 | 0.001 |
Probabilistic Neural Network (PNN) | 0.96996654 | 0.00368226 | 0 | 0.01103748 | 0.001 |
Gradient Boosted Trees Regression | 0.71567727 | 0.02176984 | 0.00127028 | 0.03564093 | 0.01547137 |
Simple Regression Tree | 0.91064585 | 0.01378341 | 0 | 0.1895266 | 0.01228716 |
Random Forest Regression | 0.33102591 | 0.02416602 | 0.00273247 | 0.05227308 | 0.00120760 |
Tree Ensemble Regression | 0.29050064 | 0.03015130 | 0.00258986 | 0.05089066 | 0.01356692 |
Linear Regression | 0.04864026 | 0.02449140 | 0.00338018 | 0.05813930 | 0.00309530 |
Polynomial Regression | 0.01977070 | 0.03305683 | 0.00417221 | 0.06459261 | 0.01651951 |
Parameter | A | B |
---|---|---|
Mean | 126.71 | 147.27 |
Median | 128 | 146 |
Minimum | 90 | 140 |
Maximum | 132 | 233 |
Standard deviation | 8.1069 | 6.5074 |
Asymmetry | −2.9518 | 8.6207 |
Kurtosis | 8.2884 | 95 |
5th percentile | 102 | 143 |
95th percentile | 132 | 153 |
Interquartile range | 6.0 | 3.0000 |
Missing observations | 0 | 0 |
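The descriptive statistics in the table above (mean, median, extremes, dispersion, shape, and percentiles) can be computed with NumPy alone. This is a generic sketch, not the authors' exact procedure: the moment-based skewness and excess-kurtosis formulas and the sample values below are assumptions for illustration.

```python
import numpy as np

def describe(values):
    """Descriptive statistics matching the rows of the summary table."""
    x = np.asarray(values, dtype=float)
    m, s = x.mean(), x.std()                  # population std for moment ratios
    return {
        "mean": m,
        "median": np.median(x),
        "min": x.min(),
        "max": x.max(),
        "std": x.std(ddof=1),                 # sample standard deviation
        "skewness": np.mean((x - m) ** 3) / s ** 3,
        "kurtosis": np.mean((x - m) ** 4) / s ** 4 - 3,   # excess kurtosis
        "p5": np.percentile(x, 5),
        "p95": np.percentile(x, 95),
        "iqr": np.percentile(x, 75) - np.percentile(x, 25),
    }

# Illustrative glycemia-like values, not the study's data:
d = describe([90, 102, 126, 128, 132, 140])
```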
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Massaro, A.; Magaletti, N.; Cosoli, G.; Leogrande, A.; Cannone, F. Use of Machine Learning to Predict the Glycemic Status of Patients with Diabetes. Med. Sci. Forum 2022, 10, 11. https://doi.org/10.3390/IECH2022-12293