1. Introduction
Diabetes mellitus, a common chronic metabolic disorder characterized by elevated blood glucose levels, presents significant health challenges globally [1]. Its considerable impact on healthcare systems highlights the urgent need for early diagnosis and effective management to prevent complications and improve patient outcomes [2]. Among the complications associated with diabetes, diabetic foot problems are particularly severe, with the potential to lead to amputation in the most extreme cases [3]. Traditionally, diabetes diagnosis has relied on clinical assessments, blood tests, and self-reported symptoms. However, recent advances in technology have paved the way for innovative approaches to predicting and managing diabetes more efficiently [4]. Further complications, such as neuropathy, which causes nerve damage and loss of sensation, can lead to a diabetic foot, where minor injuries can go unnoticed and progress to severe infections or ulcers due to poor blood circulation and reduced healing capacity.
Accurate and timely prediction is essential for early intervention and effective disease management [5]. In recent years, wearable sensor technologies have gained popularity as tools for monitoring physiological data, showing promise in identifying new indicators to predict diabetes [6,7]. Notably, the use of thermal and pressure data has shown potential in improving the accuracy of diabetes diagnosis and treatment. For instance, measuring the plantar temperature can reveal differences in the soles of diabetic feet, with the ability to detect ulcers and necrosis with accuracies of 90% and 88%, respectively [8]. Infrared thermography has also proven to be effective in detecting temperature variability in the feet of diabetic patients, aiding in the early diagnosis and prevention of lesions in affected areas [9]. Furthermore, regression analysis has shown the potential to accurately predict the maximum plantar pressure in patients with type 2 diabetes mellitus, which is crucial for the prevention and early detection of diabetic foot complications [10]. However, studies that combine temperature and pressure data are limited. For example, Yavuz et al. [11] found no significant correlation between plantar temperatures and triaxial plantar stresses in individuals with diabetes. This lack of correlation suggests that these variables can be effectively combined as independent attributes in machine learning models for the prediction of diabetes, potentially leading to more robust predictive models, as is also discussed in this paper.
Wearable sensor technologies, such as those integrated into shoes or insoles, have emerged as promising tools for the continuous monitoring of physiological parameters relevant to the management of diabetes [12,13,14,15]. These non-invasive technologies enable continuous data collection, facilitating the early detection of potential problems, including those related to diabetic foot complications [16]. Of particular interest for the prediction of diabetes are the thermal patterns and pressure distribution measured by these devices [17].
Thermal imaging techniques are used to visualize the distribution of skin temperature, which can indicate underlying metabolic processes and pathological conditions. This non-invasive diagnostic method uses the principles of heat transfer and physiological responses of the body to detect temperature variations that can signal health problems [18]. Meanwhile, several studies have indicated that insole systems that measure plantar pressure can be beneficial in managing diabetic foot health by reducing ulcer recurrence, lowering plantar stress, helping to detect early complications, and improving gait and weight distribution [19,20,21].
The relationship between plantar pressure, temperature, and diabetic foot complications is an emerging area of research. Diabetic neuropathy often leads to foot ulceration due to a combination of elevated temperatures, loss of sensation, and abnormal pressure distribution. Understanding these factors is essential for the prevention and management of diabetic foot complications. Diabetic neuropathy is associated with higher plantar foot temperatures, which can be measured non-invasively using infrared thermal imaging, indicating its potential as a tool for evaluating high-risk diabetic feet [22]. Furthermore, plantar pressure measurements are increasingly integrated into clinical practice, with evidence supporting their role in ulcer prevention and the importance of long-term monitoring to provide feedback on concerning pressure levels [23]. Sawacha et al. [24] emphasized that the simultaneous assessment of kinematics, kinetics, and plantar pressure can more accurately characterize the biomechanics of the diabetic foot, potentially helping to prevent foot ulcerations. The classification of plantar pressure distributions has proven useful in identifying diabetic patients at risk of foot ulceration and guiding the provision of preventive interventions, such as therapeutic footwear [25]. Changes in these parameters have been linked to diabetes-related foot complications, such as neuropathy and an increased risk of ulceration, ultimately contributing to the development of a diabetic foot [23]. However, there is a scarcity of studies that have investigated the potential of these measures to diagnose diabetes.
This study introduces a novel approach to predicting diabetes by integrating temperature and plantar pressure data, a combination not extensively explored in the previous research. While the existing studies typically focus on either temperature or pressure independently, our work leverages both modalities to enhance the predictive accuracy. This multimodal approach provides deeper insights into foot health and potential complications. This work contributes to the field by not only demonstrating the limitations of single-modality analysis but also by showing how integrating multiple data sources can yield more robust machine learning models for clinical prediction tasks.
Recently, the application of machine learning techniques in the prediction of diabetes has gained significant traction [26]. Machine learning uses computational algorithms to analyze large datasets, identify complex patterns, and make accurate predictions [27,28].
Machine learning approaches to diabetes prediction are inherently data-driven, relying on diverse datasets that include a wide range of patient information such as clinical data, genetic markers, lifestyle factors, and physiological measurements. The integration of wearable sensor technologies has further enriched these datasets by providing real-time, continuous monitoring of the parameters relevant to diabetes [29,30]. In the context of diabetes, the features of interest include blood glucose levels, insulin sensitivity, physical activity, dietary habits, and, as explored in this paper, thermal and pressure data from the feet. Machine learning models for diabetes prediction encompass a wide range of algorithms, such as Decision Trees, Support Vector Machines, Random Forests, and neural networks, among others. Each algorithm offers unique advantages and may be suited to different aspects of diabetes prediction. For example, deep learning models can effectively capture complex patterns in large datasets [31], while Decision Trees can provide more interpretable insights into risk factors [32,33]. Few studies have examined the application of several available algorithms in tandem for prediction [34,35]. Evaluating the performance of machine learning models is a crucial step in the process. Metrics such as accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC-ROC) are commonly used to assess the predictive capacity of these models. Despite this, there is a scarcity of previous studies that have investigated the potential of these measures in tandem to diagnose diabetes.
The primary objective of this study is to investigate the feasibility of using thermal data, pressure data, or a combination of both variables to predict the presence of diabetes. By analyzing the time series data of thermal and pressure measurements from a diverse group of individuals—including both diabetic and non-diabetic subjects—we aim to explore the potential predictive capabilities of these variables. The study employs machine learning algorithms to assess whether plantar pressure, plantar temperature, or a combination of both can effectively predict diabetes. For our analysis, we used a consolidated thermal time series that includes data from five anatomical points at the feet, combined with pressure data.
In our study, the initial experiments aimed to investigate whether significant correlations could be established between thermal data and plantar pressure estimates using a variety of machine learning models for regression. Despite the use of sophisticated regression techniques, these models demonstrated sub-optimal performance in predicting pressure values from thermal data alone, as indicated by the low accuracy metrics and high error rates. The lack of a correlation suggests that the physiological processes reflected by the thermal measurements may not directly translate to the biomechanical indicators captured by the pressure data.
Recognizing the limitations of this approach, we shifted our focus towards aggregating both data types—temperature and pressure—for the prediction of diabetes. This multimodal approach yielded significantly more encouraging results, with a notable improvement in predictive accuracy. The combination of these independent datasets allowed machine learning models to leverage the complementary nature of thermal and pressure data, offering a more comprehensive understanding of the physiological changes associated with diabetes. By integrating these variables, we were able to achieve high levels of prediction accuracy, underscoring the importance of multimodal data fusion in medical diagnostics.
We hypothesize that integrating these two physiological datasets will result in more accurate predictions. For example, when used independently, classifiers such as the Logistic Regression model have shown moderate accuracy with pressure data (around 68.75%), while temperature data can yield higher accuracy, with the Naive Bayes model reaching up to 87.5%. However, it is expected that the combination of pressure and temperature data will lead to significantly improved performance, with classifiers like the Extra Trees Classifier demonstrating stronger metrics, including precision, recall, and F1 scores. This study will also explore the importance of algorithm selection in optimizing prediction accuracy and the need to tailor models to different participant groups.
By examining the performance of multiple machine learning models with both individual and combined data, this research seeks to highlight the potential of multimodal data integration in enhancing the accuracy of diabetes prediction.
The rest of the paper is organized as follows: Section 2 outlines the methods used in our study, detailing the composition of the dataset, the experimental setup and protocol, and the data acquisition procedures. In Section 3, we present the results of the predictive models using machine learning techniques. Section 4 delves into the discussion and conclusions, summarizing the insights gained and their implications for diabetes prediction and management.
2. Methods
This section describes the methodologies used in our investigation of the feasibility of predicting diabetes using thermal and pressure data. The following subsections provide a detailed discussion of the composition and characteristics of the dataset, the specific experimental procedures and protocols used to gather the data, and the techniques and instruments utilized for collecting plantar pressure and thermal measurements from the participants.
2.1. Dataset Description
The study involved a total of 26 participants, including 13 individuals at various stages of diabetes and 13 healthy controls. The group consisted of 18 women and 8 men, with ages ranging from 40 to 73 years. The weights of the participants ranged from 42 to 110 kilograms (kg), and their heights ranged between 1.42 and 1.90 meters (m).
2.2. Experimental Setup and Protocol
To ensure consistency and minimize confounding factors, an experimental area suitable for walking was carefully selected. The designated area was chosen to avoid “quick” twisting movements that could potentially increase friction inside the shoe, thereby impacting the measurements. A 25 m ∞-shaped walkway was established for data collection, as illustrated in Figure 1.
Simultaneously, a thermal camera and a chronometer were arranged to record the necessary temperature measurements before and after the walks. The thermal camera was placed at a distance of 1 to 1.5 m from the feet of the participants to capture accurate thermal images, as shown in Figure 2.
2.3. Data Acquisition Procedure
The participant was first informed about the study. Initially, plantar pressure measurements were taken in the shoe while walking. Participants were asked to walk on an ∞-shaped path in two familiarization trials, followed by three trials for pressure data collection. Twelve steps were analyzed to calculate the average peak plantar pressure during the stance phase of walking for each foot. The shoe pressure sensor (Parologg Pressure Measurement System, Paromed, Neu-Beuern, Germany) provided data that were processed to obtain the maximum plantar pressure, the average maximum plantar pressure, and the average plantar pressure at each anatomical point.
After completion of the plantar pressure measurements, participants followed a procedure to measure foot temperature using a Flir One Pro thermal imaging camera. The temperature was recorded immediately before and immediately after walking with the insoles, following these steps:
The participant was asked to be barefoot and lie supine on a flat couch, with a cushion placed under the head for comfort. Specific regions of interest on the feet were marked for thermal measurements. These regions included the hallux, first metatarsus, third metatarsus, fifth metatarsus, midfoot (proximal fifth metatarsal head), medial arch (proximal first metatarsal head), and heel, as shown in Figure 3.
After a 15 min acclimatization period on the couch, baseline foot temperature was recorded using the thermal imaging camera.
The participant then wore shoes and walked at a natural pace along the designated pathway.
Immediately after completing the walk, the shoes were removed and the participant was asked to lie on the couch.
Temperature measurements were taken at specific intervals: 30 s, 90 s, 120 s, and 180 s after walking. For each measurement, the participant remained on the couch with footwear removed while thermal images were captured.
These temperature readings, along with the baseline values recorded after acclimatization, were used to calculate temperature changes.
This systematic approach ensures that each participant is evaluated under consistent starting conditions, thereby improving the reliability and validity of the study results. The data collection process resulted in a comprehensive dataset that includes thermal and pressure measurements from each participant. Specifically, each dataset comprises baseline foot temperature, thermal images taken at various time points after walking, and corresponding average pressure metrics at specific anatomical points.
2.4. Feature Description and Data Structure
For the first set of experiments, which focused on predicting plantar pressure metrics using temperature data, each data point consisted of temperature measurements recorded at five specific time points: immediately after walking (0 s) and at 30 s, 90 s, 120 s, and 180 s. The data were collected from eight anatomical points on each foot, including the hallux, first metatarsus, third metatarsus, fifth metatarsus, heel, lateral midfoot (LatMF), and medial midfoot (MedMF). At each anatomical point, three summary statistics (mean, maximum, and minimum) were computed for each of the five time points. This resulted in 15 temperature features per anatomical point (5 time points × 3 summary statistics). Given that temperature data were collected from 8 anatomical points, each foot provided a total of 120 temperature features (15 features × 8 anatomical points).
These 120 temperature features were used as input to the machine learning regression models, with the target being one of the three plantar pressure metrics: peak pressure (PPP), average peak pressure (APPP), or average pressure (APP) for each anatomical point. The models tested included Extra Trees Regressor, K Neighbors Regressor, Dummy Regressor, Light Gradient Boosting Machine, Bayesian Ridge, Random Forest Regressor, Gradient Boosting Regressor, AdaBoost Regressor, Extreme Gradient Boosting, Orthogonal Matching Pursuit, Elastic Net, Lasso Least Angle Regression, Lasso Regression, Ridge Regression, Decision Tree Regressor, Huber Regressor, Linear Regression, Passive Aggressive Regressor, and Least Angle Regression. The goal of these regression models was to assess whether the temperature data alone could be used to predict the corresponding plantar pressure values. However, as detailed in the results, the models faced difficulties due to the low or insignificant correlations between temperature and pressure variables.
In the subsequent experiments, a multimodal approach was employed for the classification task of predicting diabetic status. In this setup, both temperature and pressure data were combined into the feature set. Pressure data were summarized into 3 metrics for each of the seven anatomical points (first metatarsus, third metatarsus, fifth metatarsus, hallux, heel, lateral midfoot, and medial midfoot), resulting in 21 pressure-related features (3 metrics × 7 points). When combined with the 120 temperature features, the complete feature vector used for classification consisted of 141 attributes per foot (120 temperature features + 21 pressure features). These feature vectors were used in machine learning classification models aimed at distinguishing between diabetic and non-diabetic subjects. The models tested in this task included Extra Trees Classifier, Random Forest Classifier, Extreme Gradient Boosting, Ada Boost Classifier, Gradient Boosting Classifier, Naive Bayes, Logistic Regression, Decision Tree Classifier, Linear Discriminant Analysis, Ridge Classifier, Quadratic Discriminant Analysis, K Neighbors Classifier, Support Vector Machine (SVM) with a linear kernel, Light Gradient Boosting Machine, and Dummy Classifier.
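The feature counts above can be checked with a short sketch. It assumes that the 15 temperature features per anatomical point are the three summary statistics taken at each of the five time points, and it uses the abbreviations from the text where they exist; the feature-name format, the generic `point1` to `point8` labels, and the `3rdM` abbreviation are hypothetical and serve only to make the arithmetic concrete.

```python
# Counts are taken from the text; all feature names are illustrative placeholders.
TIME_POINTS_S = [0, 30, 90, 120, 180]        # seconds after walking
SUMMARY_STATS = ["mean", "max", "min"]
N_TEMPERATURE_POINTS = 8                     # number of points as stated in the text
PRESSURE_POINTS = ["1stM", "3rdM", "5thM", "hallux", "heel", "LatMF", "MedMF"]
PRESSURE_METRICS = ["PPP", "APPP", "APP"]

# Assumption: one feature per (point, time, statistic) combination.
temperature_features = [
    f"temp_point{point}_{t}s_{stat}"
    for point in range(1, N_TEMPERATURE_POINTS + 1)
    for t in TIME_POINTS_S
    for stat in SUMMARY_STATS
]

# One feature per (anatomical point, pressure metric) combination.
pressure_features = [
    f"{metric}_{point}" for point in PRESSURE_POINTS for metric in PRESSURE_METRICS
]

# Full per-foot feature vector used for classification.
feature_vector = temperature_features + pressure_features

print(len(temperature_features), len(pressure_features), len(feature_vector))
# prints: 120 21 141
```

Under these assumptions the counts reproduce the 120 temperature features, 21 pressure features, and 141 attributes per foot reported above.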
The models were implemented and evaluated using the PyCaret library [36], which automates the training and evaluation of machine learning algorithms for both regression and classification tasks. A 5-fold cross-validation strategy was employed to evaluate model performance. The dataset comprised 26 participants, with each foot treated as an independent instance, resulting in 52 instances in total. The cross-validation framework ensured that each participant’s data were used both in training and validation across different folds, mitigating overfitting and providing a robust estimate of model performance.
Analysis of Correlation Between Individuals’ Feet
To explore the validity of treating each foot as an independent instance in the dataset, a correlation analysis was performed across both feet of each individual and between feet of different individuals. The correlation index was calculated for each pair of feet, and the results are displayed in Figure 4, which shows the correlation matrix of pressure and temperature data.
The correlation matrix reveals that, on average, the correlation between feet from different individuals is relatively high, with a mean value of approximately 0.86. This suggests that, while there are similarities between individuals, there is still sufficient variability across the dataset to justify treating each foot as an independent instance. Such variability is important for machine learning models to generalize effectively, even when the left and right feet of the same individual are included in both the training and test sets across different folds.
Although the data demonstrate strong similarities between feet, particularly in healthy individuals, the subtle differences that remain are likely to capture clinically meaningful variations, especially in diabetic patients. In this population, asymmetry between feet may reflect complications such as neuropathy, making it important for the model to learn from these discrepancies. This approach enables the model to better generalize across a wider range of physiological conditions, improving its ability to detect early signs of complications.
By treating each foot as independent, we effectively increase the dataset size, which is particularly important in studies with small datasets. Given the high correlation values, the model benefits from additional data points while still capturing enough variation to avoid overfitting. Moreover, the use of 5-fold cross-validation ensures that the model is evaluated on a wide range of data splits, further reducing the risk of overfitting and ensuring robustness.
While potential bias due to symmetry in healthy individuals is acknowledged, the correlation analysis suggests that this bias is minimal. From a machine learning perspective, the variability present in the dataset justifies the approach, particularly given the asymmetry often observed in diabetic patients. As a result, treating each foot as an independent instance not only increases the dataset’s robustness but also enhances the generalization of the machine learning models by exposing them to a broader range of conditions.
Rationale for Independent Foot Analysis: While treating each foot as an independent data point increases the dataset size and helps to capture key asymmetries in diabetic individuals, we acknowledge that this approach may introduce bias, particularly in healthy individuals where greater symmetry between left and right feet is typically observed. Inclusion of both feet in both the training and testing sets could lead to inflated model performance as the symmetry between feet in healthy individuals might not reflect true independent variability. Results should be interpreted with caution, especially in cases where foot symmetry is expected.
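One way to probe the concern raised above is to assign whole participants, rather than single feet, to folds, so that the two feet of one person never appear on both sides of a split. The sketch below is not the procedure used in this study (which treats each foot independently); it is a minimal, unstratified illustration in which the participant identifiers and the helper name `participant_folds` are hypothetical.

```python
import random


def participant_folds(participant_ids, n_folds=5, seed=0):
    """Assign every instance to a fold so that all instances from the
    same participant share the same fold (simple round-robin, unstratified)."""
    unique_ids = sorted(set(participant_ids))
    random.Random(seed).shuffle(unique_ids)
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_ids)}
    return [fold_of[pid] for pid in participant_ids]


# 26 participants with two feet each give 52 instances (IDs are hypothetical).
participant_ids = [pid for pid in range(26) for _foot in ("left", "right")]
folds = participant_folds(participant_ids)

for k in range(5):
    train = {p for p, f in zip(participant_ids, folds) if f != k}
    valid = {p for p, f in zip(participant_ids, folds) if f == k}
    # No participant contributes feet to both training and validation.
    assert train.isdisjoint(valid)

fold_sizes = sorted(folds.count(k) for k in range(5))
print(fold_sizes)
```

Comparing results from such a participant-level split with those from the foot-level split would indicate how much of the reported performance depends on left and right feet of the same person being separated across folds.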
The essential variations in patients likely include differences in temperature distribution, pressure patterns, and structural abnormalities between the left and right feet. These variations are particularly important in diabetic patients, where asymmetries can indicate complications like ulcers, neuropathy, or other foot pathologies. From a machine learning standpoint, capturing these discrepancies improves the model’s ability to detect early signs of these complications and enhances the generalization of predictions by training on a wider range of physiological conditions. The model, thus, becomes better at identifying both subtle and more pronounced differences in foot health.
Furthermore, cross-validation was applied during model evaluation to mitigate overfitting and ensure that the models were tested across various data splits, reducing the risk of performance overestimation [37,38]. This method is commonly used to assess machine learning models’ effectiveness in scenarios where data symmetry, such as in gait analysis, is a concern [37]. Moreover, cross-validation is particularly effective in controlling overfitting when dealing with uncorrelated errors, as observed in machine learning models used for prediction tasks. Although k-fold cross-validation is not entirely immune to bias in small sample sizes, it offered a rigorous evaluation of the model’s performance across different subsets of the data [38].
During each iteration or “fold” within the cross-validation protocol, the training subset was split into subgroups. One subgroup served as the training data, facilitating the model’s learning process, while the other subgroup acted as the validation data, against which the model’s performance was assessed. This partitioning adhered to the established 5-fold methodology, ensuring a comprehensive and evenly distributed assessment across the entire dataset. Following this procedure, each instance within the dataset participated in both the training and validation phases across five distinct folds. This approach mitigates any bias that could arise from a single partitioning of the data, leading to a thorough evaluation of the model’s effectiveness in handling diverse scenarios and potential variations within the dataset.
Upon completion of the cross-validation process, performance metrics such as accuracy, recall, precision, F1 score, and kappa were systematically compiled for each fold and averaged to provide an estimate of the model’s overall performance.
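The per-fold compilation and averaging of metrics can be illustrated with a small hand-written sketch. It is not PyCaret's internal code; the function names and the toy labels are hypothetical, and the positive class (1) is assumed to denote a diabetic foot.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and Cohen's kappa for 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Agreement expected by chance, from the marginal class frequencies.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


def average_over_folds(fold_results):
    """Average each metric across the folds."""
    return {key: sum(r[key] for r in fold_results) / len(fold_results)
            for key in fold_results[0]}


# Toy example with two folds (labels are made up for illustration).
fold_1 = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
fold_2 = binary_metrics([1, 1, 0, 0], [1, 1, 0, 0])
summary = average_over_folds([fold_1, fold_2])
print(summary)
```

In the study itself the same averaging would run over the five validation folds rather than the two toy folds shown here.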
2.5. Machine Learning Algorithms
2.5.1. Regression Models
For the regression tasks aimed at predicting plantar pressure metrics from temperature data, the following algorithms were utilized:
Extra Trees Regressor: An ensemble learning method that aggregates results from multiple randomized Decision Trees to improve prediction accuracy.
K Neighbors Regressor: A non-parametric method that predicts the output based on the average value of the k-nearest neighbors in the feature space.
Dummy Regressor: A simple baseline model that makes predictions using basic strategies such as the mean or median of the target values.
Light Gradient Boosting Machine (LightGBM): A gradient boosting framework that uses tree-based learning algorithms, optimized for efficiency and performance.
Bayesian Ridge: A linear regression model that uses Bayesian inference to estimate the regression coefficients.
Random Forest Regressor: A tree-based ensemble model that builds multiple Decision Trees and averages their outputs to enhance predictive accuracy.
Gradient Boosting Regressor: An ensemble technique that builds models sequentially, optimizing the prediction by minimizing the error of previous models.
AdaBoost Regressor: A boosting method that combines weak regressors to produce a strong predictive model by focusing on the most difficult-to-predict instances.
Extreme Gradient Boosting (XGBoost): A highly efficient and flexible boosting algorithm that improves performance by reducing overfitting and increasing accuracy.
Orthogonal Matching Pursuit: A greedy algorithm for linear regression that selects the most correlated features in each iteration.
Elastic Net: A regularized regression model that linearly combines L1 and L2 penalties of the lasso and ridge methods to improve prediction and feature selection.
Lasso Least Angle Regression (LassoLARS): A variant of linear regression that automatically selects the most relevant features by shrinking the less important ones to zero.
Lasso Regression: A regression method that performs both variable selection and regularization to enhance prediction accuracy.
Ridge Regression: A technique used when multicollinearity exists, adding a degree of bias to the regression estimates.
Decision Tree Regressor: A non-linear regression model that splits the dataset into subsets based on the feature values to make predictions.
Huber Regressor: A robust regression technique that is less sensitive to outliers in the data than least squares regression.
Linear Regression: A basic regression model that assumes a linear relationship between the input features and the target values.
Passive Aggressive Regressor: An online learning algorithm that updates the model in response to each individual sample.
Least Angle Regression (LARS): A regression algorithm particularly suited for high-dimensional data, similar to forward stepwise regression.
2.5.2. Classification Models
For the classification task of predicting diabetes status based on combined pressure and temperature features, the following models were employed:
Extra Trees Classifier: An ensemble learning method that aggregates the results of multiple randomized Decision Trees to make predictions.
Random Forest Classifier: A tree-based ensemble method that creates multiple Decision Trees for classification and averages their outputs.
Extreme Gradient Boosting (XGBoost): A highly efficient boosting algorithm used for classification tasks, known for its high performance in structured data.
AdaBoost Classifier: A boosting algorithm that improves classification by combining weak classifiers to form a stronger overall classifier.
Gradient Boosting Classifier: An iterative boosting method that combines weak classifiers to produce a strong predictive model by sequentially reducing the classification error.
Naive Bayes: A probabilistic classifier based on Bayes’ theorem, assuming independence between the features.
Logistic Regression: A simple linear classifier used to predict the probability of a binary outcome (diabetes or non-diabetes).
Decision Tree Classifier: A non-linear model that classifies instances by recursively partitioning the feature space based on feature values.
Linear Discriminant Analysis (LDA): A classification algorithm that models the differences between multiple classes by assuming normally distributed features.
Ridge Classifier: A linear classifier that encodes the class labels as ±1 and fits a ridge (L2-regularized least squares) regression, using the regularization to handle collinearity and improve classification.
Quadratic Discriminant Analysis (QDA): A classifier that assumes each class is normally distributed but with different covariance matrices.
K Neighbors Classifier: A non-parametric method that classifies instances based on the majority class of the k-nearest neighbors in the feature space.
Support Vector Machine (SVM) with a linear kernel: A classification algorithm that creates a linear boundary between classes to maximize the margin between them.
Light Gradient Boosting Machine (LightGBM): A highly efficient gradient boosting method optimized for classification tasks on large datasets.
Dummy Classifier: A simple baseline model that makes predictions using basic strategies such as stratified or most frequent class predictions.
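To put the classifier results in context, the most-frequent-class strategy of the Dummy Classifier can be sketched in a few lines. The sketch assumes the balanced design described in the dataset section (13 diabetic and 13 healthy participants, two feet each, so 26 instances per class); the helper name is hypothetical, and the baseline is scored on the same labels it was built from purely for illustration.

```python
from collections import Counter


def most_frequent_baseline(train_labels):
    """Return a predictor that ignores its input and always outputs
    the majority class seen in the training labels."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: majority


# 26 diabetic feet (1) and 26 healthy feet (0), following the study design.
labels = [1] * 26 + [0] * 26
predict = most_frequent_baseline(labels)

accuracy = sum(predict(None) == y for y in labels) / len(labels)
print(accuracy)  # prints: 0.5
```

With balanced classes this baseline sits at 50% accuracy, which is the reference level against which the accuracies of the other classifiers can be read.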
2.5.3. Handling of Correlated Features in Machine Learning Models
The machine learning algorithms used in this study, including Extra Trees, Random Forest, Gradient Boosting, and Support Vector Machines (SVMs), are well-suited to handle potential correlations between features, such as the left and right foot data [39]. These algorithms are based on ensemble techniques, Decision Trees, or linear models, which inherently manage redundancy and correlation in input features [40]. For instance, Random Forest is effective in handling feature selection, even with a high number of variables, which helps in improving model accuracy and performance by eliminating unimportant variables [41]. Correlations between input features do not necessarily degrade the performance of these algorithms because they assess the contribution of each feature in relation to the target outcome, even when multiple features carry similar information (as could be the case with left and right foot data).
Moreover, the use of regularization techniques in models like SVMs [42] or Logistic Regression helps to control for overfitting, which can occur in the presence of correlated data [43].
Given the relatively high correlations observed in the dataset, these algorithms are capable of identifying meaningful patterns without being adversely affected by the similarity between left and right foot data. This approach ensures that the models generalize well even in the presence of correlated measurements.
2.5.4. Hyperparameter Tuning
In the process of model selection and training, default hyperparameters were used for each of the machine learning models as hyperparameter tuning was not performed in the initial comparison phase. PyCaret automatically trains a range of models using their standard settings, enabling quick evaluation of model performance. Once the best-performing model is identified, further optimization can be performed through hyperparameter tuning to improve the results.
Table 1 is a summary of the machine learning models used in our study, separated into classification and regression models, together with their respective default parameter values.
3. Results
3.1. Correlation Between Plantar Pressure and Temperature Data
This section analyzes the relationship between plantar pressure and temperature data by calculating the correlation coefficients for various anatomical points. The aim of this analysis is to explore how changes in plantar pressure at specific regions of the foot correlate with variations in temperature. Understanding these correlations can provide insights into the biomechanical and thermodynamic responses of the foot under varying load conditions and diabetes conditions.
3.1.1. Calculation of Correlation Coefficients
The relationship between pressure and temperature was assessed by computing both Pearson and Spearman correlation coefficients for each combination of temperature and plantar pressure features. The Pearson correlation coefficient measures the linear relationship between two continuous variables, while the Spearman correlation coefficient captures potential non-linear relationships by assessing the monotonic association between ranked data.
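The formulas used for these calculations are as follows:
Pearson correlation:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ represent the data points for temperature and pressure, respectively, and $\bar{x}$ and $\bar{y}$ are their respective means.
Spearman correlation:
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
where $d_i$ is the difference between the ranks of the corresponding values and $n$ is the number of observations.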
Both correlation coefficients were computed for each anatomical point, such as the first metatarsal (1stM), fifth metatarsal (5thM), hallux, heel, and the lateral and medial midfoot (LatM and MedMF), in combination with various temperature measurements.
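As an illustration of the two coefficients described above, the following minimal sketch computes both for a single temperature and pressure feature pair. The readings are hypothetical values chosen for demonstration, not data from the study, and the function names are illustrative. The rank-difference form of the Spearman coefficient assumes no tied values; the rank-based Pearson form also handles ties.

```python
import math


def pearson(x, y):
    """Pearson r: co-deviation normalised by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks from 1 to n; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho as Pearson r on the ranks (valid with ties)."""
    return pearson(ranks(x), ranks(y))


def spearman_rank_difference(x, y):
    """Spearman rho via 1 - 6*sum(d_i^2) / (n(n^2 - 1)); assumes no ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical readings (not from the study): monotonic but not linear.
temperature = [29.1, 30.4, 31.0, 31.2, 33.5]
pressure = [110.0, 150.0, 180.0, 185.0, 400.0]

r = pearson(temperature, pressure)
rho = spearman(temperature, pressure)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because the hypothetical pressure values increase monotonically but not linearly with temperature, the Spearman coefficient reaches 1 while the Pearson coefficient stays below 1, which is the distinction the two measures are meant to capture.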
3.1.2. Anatomical Points and Correlation with Temperature
Due to the large number of data points across different anatomical regions, it is challenging to present all the correlation results in a single table. Comprehensive results are available in the Supplementary Materials at the end of the paper. The key findings from each anatomical point are therefore summarized below.
Hallux
The hallux consistently showed positive correlations with temperature, particularly in relation to the MedMF temperature features. The Pearson correlations between the MedMF temperature features and hallux pressure ranged from to , indicating a moderate positive relationship. This suggests that, as the plantar pressure at the hallux increases, the temperature in the midfoot rises, reflecting the transfer of the mechanical load and thermal response in this region.
Heel
The correlations between the temperature and pressure in the heel were relatively weak compared to those in the hallux and metatarsals. Moderate positive correlations were observed in some cases, particularly in relation to MedMF_Min and MedMF_Max, suggesting that temperature increases in the heel are weakly associated with pressure increases.
3.1.3. Possible Impact of Correlation on Machine Learning Performance
The results of the correlation analysis between the temperature and plantar pressure data highlight the potential limitations of using temperature as a standalone feature for pressure prediction. The generally weak and inconsistent correlations across most anatomical points suggest that the underlying relationship between these variables is not strong enough to support accurate pressure predictions using only temperature data. This lack of correlation may serve as an early indicator of poor performance in machine learning models designed to predict plantar pressure from temperature alone. Given the weak association, it is hypothesized that these models will struggle to capture the necessary patterns for reliable predictions.
However, this condition also presents an opportunity for exploring multimodal approaches, where temperature data are combined with other biomechanical features such as the pressure readings to improve the prediction accuracy. In the following sections, a series of machine learning experiments are conducted to test the predictive capability of temperature alone, and then in combination with other relevant features.
3.2. Machine Learning Analysis for Pressure Estimation
In this section, the machine learning analysis conducted to estimate pressure values based on thermal time series data is presented. The experiments were designed to explore the correlation between the thermal data and three pressure metrics: peak pressure, average peak pressure, and average pressure at various anatomical points on the feet.
3.2.1. Pressure Estimation at Individual Anatomical Points
As part of the first set of experiments, a dataset was constructed where the target regression values were each of the three pressure metrics. The features were derived from the thermal time series data. The temperature measurements, representing a time series, were recorded at specific intervals—30 s, 90 s, 120 s, and 180 s—at various anatomical points on the feet. The pressure values were consolidated into three metrics: peak pressure, average peak pressure, and average pressure for each anatomical point.
The analysis of the machine learning models revealed no significant relationship between the thermal time series data and the pressure values. This outcome was consistent across all the anatomical points and pressure metrics: the regression models exhibited high errors in MAE, MSE, RMSE, RMSLE, and MAPE together with negative R² values, suggesting that they struggled to estimate the pressure values from thermal data alone. These findings indicate that thermal data, when used in isolation, may not provide a reliable basis for pressure estimation.
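For reference, the error metrics named above can be computed as in the following minimal sketch. The pressure values are hypothetical and the function name is illustrative. MAPE is expressed here as a fraction rather than a percentage, which is an assumption about the reporting convention, and RMSLE is only defined for non-negative values.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MAE, MSE, RMSE, RMSLE, and MAPE for paired values.

    Assumes y_true has no zeros (needed for MAPE) and that both lists
    are non-negative (needed for the logarithm in RMSLE).
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)) / n
    )
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "RMSLE": rmsle, "MAPE": mape}


# Hypothetical measured and predicted pressures (not from the study).
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

metrics = regression_errors(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MAE and RMSE are in the units of the target, so their magnitude must be read against the typical pressure range at each anatomical point; MAPE and RMSLE are relative measures and are easier to compare across points.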
Due to the extensive volume of results, only the outcomes for three anatomical points are presented as examples.
Table 2 provides the metrics for a regressor estimating the average peak pressure at the hallux, Table 3 presents the metrics for the average peak plantar pressure estimation at the third metatarsus, and Table 4 displays the metrics for peak plantar pressure estimation at the heel. As shown in these tables, large errors and negative R² values were obtained across most of the models, indicating the difficulty of predicting pressure values solely from thermal time series data. These results raise questions about the suitability of using thermal data alone for pressure estimation. Further investigation is necessary to identify additional features or data sources that may enhance the accuracy of pressure prediction models.
For the hallux, as shown in Table 2, the Extra Trees Regressor had the best performance among the models, but even this model yielded a negative R² value (−0.4872) and substantial errors (MAE = 23.7315; RMSE = 27.1971), indicating poor predictive accuracy. Other models, such as the K Neighbors Regressor and Random Forest Regressor, demonstrated even higher errors and more negative R² values, further underscoring the difficulty of estimating pressure from thermal data in this region.
For the third metatarsus (Table 3), the models performed similarly poorly. The Light Gradient Boosting Machine and Dummy Regressor produced the same results, with a negative R² value of −0.7524 and considerable errors (MAE = 59.1336; RMSE = 69.5438). The Least Angle Regression model performed particularly poorly, with extreme errors (MAE = 1358.865; RMSE = 1583.866) and a highly negative R² value of −1376.8, suggesting that this model is entirely unsuitable for this task.
For the heel (Table 4), the Extra Trees Regressor again performed the best among the models but with a negative R² value (−0.8203) and substantial errors (MAE = 42.6667; RMSE = 48.1738). The Linear Regression and Least Angle Regression models were particularly ineffective, with extremely high error metrics and highly negative R² values, indicating a complete failure to predict the pressure values accurately in this region.
Overall, the results across all the anatomical points suggest that the regression models struggle to accurately predict plantar pressure based solely on thermal data. The consistently high errors and negative R² values across the models raise questions about the feasibility of using thermal time series data as a standalone predictor for plantar pressure. Further research is needed to explore alternative features or combinations of data that may improve the predictive accuracy of these models.
3.2.2. Analysis of Correlation Between Consolidated Temperature and Pressure Prediction
An additional analysis was conducted to explore the potential correlation between consolidated temperature data from the entire foot, measured at the five anatomical points, and the pressure metrics at a single point. To test this hypothesis, the dataset was augmented to include the consolidated thermal time series data from these five anatomical points. These combined temperature data were then used to predict the average pressure at each of the anatomical points, focusing on the three pressure metrics. The objective of this analysis was to determine whether temperature information from multiple locations on the foot could enhance the prediction of pressure at a specific site.
Given the extensive results collected from five anatomical points and three pressure metrics, Table 5, Table 6 and Table 7 provide a representative sample of the findings. These tables focus on the first metatarsus targeting average peak pressure, the fifth metatarsus targeting average peak plantar pressure, and the lateral midfoot targeting peak plantar pressure. As observed, using the consolidated temperature to predict the pressure metrics did not produce promising outcomes. The analysis shows negative R² values and high errors across various evaluation metrics, suggesting that the consolidated temperature data are not sufficient for accurately predicting the pressure values at these specific anatomical points.
The regression models employed in this analysis were consistent with those used in the previous sections, and similar metrics were utilized to evaluate the performance of these models. The objective was to determine whether consolidated temperature data from multiple anatomical points could improve the prediction of the pressure metrics at specific locations on the foot. However, the analysis reveals a weak relationship between the consolidated temperature data and the pressure metrics, as indicated by consistently negative R² values and high error metrics across all the models.
In Table 5, which presents the results for the first metatarsus targeting average peak pressure, all the regression models demonstrate poor performance. The Light Gradient Boosting Machine and Dummy Regressor, which produced identical results, recorded a mean absolute error (MAE) of 24.3803 and a root mean square error (RMSE) of 29.9025, with a negative R² value of −0.6947. This suggests that these models, like the others, failed to capture any meaningful relationship between the consolidated temperature data and the pressure values at the first metatarsus. The errors were substantial across the board, with the Least Angle Regression model showing the worst performance, recording an MAE of 5.46 × 10³⁵ and infinite values for both MSE and RMSE, reflecting the model’s complete inability to make accurate predictions.
Similarly, in Table 6, which focuses on the fifth metatarsus targeting average peak plantar pressure, the results were equally discouraging. The Light Gradient Boosting Machine and Dummy Regressor again demonstrated a poor performance, with an MAE of 37.8672 and an RMSE of 46.2579, coupled with a negative R² value of −0.7488. Even the more advanced models, such as Extreme Gradient Boosting and Elastic Net, yielded high error metrics (e.g., RMSE values of 57.8543 and 55.4159, respectively) and negative R² values (−1.9084 and −1.7926), further underscoring the inadequacy of using consolidated temperature data to predict the pressure metrics at the fifth metatarsus. The Least Angle Regression model once again produced extreme results, with errors of the same magnitude as those observed in the previous table, highlighting its unsuitability for this task.
Overall, these results suggest that the consolidated temperature data from multiple anatomical points are not sufficient to predict the pressure metrics accurately at specific locations on the foot. The negative R² values across all the models indicate that the regression models failed to capture any meaningful relationship between the temperature data and the pressure metrics. The consistently high errors in metrics such as MAE, MSE, RMSE, RMSLE, and MAPE further reinforce this conclusion, suggesting that alternative data sources or additional features may be required to enhance the accuracy of the pressure prediction models.
The results of this analysis suggest that consolidating thermal data from multiple anatomical points did not improve the accuracy of the pressure predictions at individual anatomical points, as demonstrated by the regression metrics for the lateral midfoot targeting peak plantar pressure (Table 7). The weak relationship between the consolidated temperature data and the pressure metrics is evident from the consistently negative R² values and the high errors observed across the various models.
For example, the Extra Trees Regressor, which is generally a strong performer in regression tasks, yielded an MAE of 21.6342 and an RMSE of 24.3457, accompanied by a strongly negative R² value of −4.1466. Similarly, the Elastic Net and Extreme Gradient Boosting models also performed poorly, with negative R² values of −4.509 and −4.8219, respectively, and relatively high error metrics (e.g., RMSE values of 26.9775 and 26.2291, respectively). These results indicate that the models were unable to effectively capture the relationship between the consolidated temperature data and the pressure at the lateral midfoot, leading to inaccurate predictions.
Additionally, the Passive Aggressive Regressor, which had one of the lowest RMSE values at 20.8999, still exhibited a negative R² value of −5.027, further confirming the lack of a meaningful relationship between the temperature data and the pressure metrics. The consistently high MAE, MSE, RMSE, and RMSLE values across all the models, coupled with the negative R² values, suggest that the temperature data from different anatomical points may not be directly related to the pressure at a specific location, at least with the models and features used in this study. The results for other models, such as the Decision Tree Regressor and Linear Regression, are even more striking, with highly negative R² values (−8.7018 and −16.3879, respectively) and large errors (e.g., RMSE values of 31.3636 and 33.3176, respectively). These metrics highlight the complexity of accurately predicting the pressure distribution in the feet based on thermal data alone.
In all the models tested, negative R² values were consistently observed, as shown in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6 and Table 7. A negative R² value indicates that the model’s predictive power is worse than simply predicting the mean of the data. This suggests that the temperature data alone do not contribute effectively to predicting the plantar pressure at the anatomical points analyzed. These results corroborate the findings of the correlation analysis, where weak or no significant correlations were detected between the temperature and pressure data. Therefore, the poor predictive accuracy of the models, as indicated by the negative R² values, is consistent with the inherent lack of a strong relationship between the two modalities.
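The interpretation of a negative coefficient of determination can be made concrete with a short sketch. It assumes the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the evaluation data. The pressure values are hypothetical. The example also shows that a constant predictor fitted to the mean of a training fold can score below zero on held-out data, which is one possible reason, not confirmed by the study, why a baseline such as the Dummy Regressor can report a negative value.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical training and held-out pressure values (not from the study).
train_pressure = [100.0, 120.0, 140.0]
test_pressure = [150.0, 170.0, 190.0]

# Predicting the mean of the evaluation data itself scores exactly 0.
test_mean = sum(test_pressure) / len(test_pressure)
r2_test_mean = r_squared(test_pressure, [test_mean] * len(test_pressure))

# Predicting the training mean on held-out data can score below 0.
train_mean = sum(train_pressure) / len(train_pressure)
r2_train_mean = r_squared(test_pressure, [train_mean] * len(test_pressure))

print(f"R² using the test mean:     {r2_test_mean:.4f}")
print(f"R² using the training mean: {r2_train_mean:.4f}")
```

Any prediction whose squared error exceeds the spread of the evaluation data around its own mean yields a negative score, so the value has no lower bound.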
3.2.3. Implications of Correlations for Machine Learning Model Performance
The observed correlations between plantar pressure and temperature, described in Section 3.1, provide insights that may explain the challenges faced in the first set of experiments, where temperature data were used to predict the plantar pressure. The weak and inconsistent correlations, especially in regions like the 1stM and 3rdM, suggest that temperature alone may not be a strong predictor of pressure at certain anatomical points. The lack of substantial linear or monotonic relationships indicates that the temperature data do not capture the complexities and variations in the pressure distribution across the foot. These findings imply that machine learning models trained solely on temperature features are likely to perform poorly when predicting plantar pressure as the underlying relationship between these variables is weak. The moderate to strong correlations in the midfoot regions (LatM and MedMF) may offer some predictive potential, but, overall, the weak correlations in other areas suggest that additional features or more complex data representations will be required to enhance the performance of the predictive models. This emphasizes the importance of incorporating more diverse and relevant biomechanical features when developing machine learning algorithms for plantar pressure prediction.
3.3. Diabetes Prediction from Temperature and Pressure Data
The lack of a significant correlation between the consolidated thermal data and pressure metrics, as observed in the previous analysis, suggests that these two variables may not be directly related or reflect a causal relationship. However, this apparent disconnect between the thermal data and pressure values does not diminish their individual predictive potential. On the contrary, it opens the door to treating temperature and pressure as independent features in the context of diabetes prediction.
By considering thermal and pressure data as separate, uncorrelated inputs, it becomes possible to harness the unique predictive capabilities of each variable. Temperature data may capture specific physiological responses, such as inflammation or altered blood flow, that are indicative of diabetic conditions, while pressure data could reflect biomechanical abnormalities associated with diabetes, such as altered gait or foot structure. Together, these independent features have the potential to provide a more comprehensive and accurate prediction model for diabetes.
In this section, we explore the application of these variables in diabetes prediction. By leveraging their individual strengths as separate features within machine learning models, we aim to enhance the accuracy and reliability of diabetes diagnosis, as demonstrated in the first study. The primary objective of this study is to assess the combined predictive potential of temperature and pressure data to diagnose diabetes. To achieve this, a set of machine learning models, facilitated by the PyCaret library, was employed to predict the diabetes status for each instance in the dataset.
Table 8 provides an overview of the results obtained from the diabetes prediction task across the tested machine learning models. The table is organized to reflect a performance hierarchy, listing the models in descending order according to their predictive effectiveness. This arrangement begins with the highest-performing model and progresses to those with comparatively lower predictive capabilities.
Upon reviewing the performance metrics across the various machine learning models, several key insights emerge. The Extra Trees Classifier stands out as the top performer, demonstrating the highest accuracy, AUC, recall, precision, F1 score, and kappa values, which collectively indicate its strong and consistent performance across multiple metrics. In contrast, models such as the Random Forest Classifier, Extreme Gradient Boosting, and Naive Bayes show variability in their performance across the different metrics, suggesting differences in how these models address various aspects of the prediction task.
Certain models, such as Quadratic Discriminant Analysis and K Neighbors Classifier, exhibit lower accuracy, recall, precision, and F1 scores, which may reflect limitations in their predictive capabilities for this particular dataset. Additionally, models like the SVM with a Linear Kernel, Light Gradient Boosting Machine, and Dummy Classifier display zero performance across all the metrics, indicating that they may not be suitable for this specific prediction task. In general, the data highlight a diverse range of model performances, shedding light on potential candidates that excel in predicting diabetes based on the combined temperature and pressure data.
Table 9 and Table 10 summarize the results of two distinct experiments, each focusing on predicting diabetes using temperature or pressure as independent variables. Although several classifiers achieve respectable metrics when applied to individual variables, their performance does not match the superior precision observed with the combined data prediction presented in Table 8. In the experiment utilizing temperature data, the classifiers exhibit varying degrees of accuracy, with the Naive Bayes model achieving the highest accuracy at 0.875. Similarly, in the pressure-only experiment, the Logistic Regression model attains an accuracy of 0.6875, marking the best performance among the classifiers for this experiment.
Furthermore, it is important to observe that the F1 score, recall, and precision metrics are closely aligned in the table for diabetes prediction using both the combined pressure and temperature features, as well as in the individual predictions. This consistency across the metrics indicates that the model provides a balanced response between both classes. In particular, the similarity of these metrics suggests that the model is not biased toward one class over the other, effectively managing the trade-off between false positives and false negatives. This balance is particularly important in clinical prediction tasks, where misclassifications can have significant consequences. The fact that these performance metrics remain comparable across different feature sets (pressure, temperature, and combined) further supports the robustness of the model and highlights that both modalities contribute meaningful information for predicting diabetic conditions.
The findings from this experiment highlight the challenges associated with predicting diabetes based solely on temperature or pressure data. While each of these modalities provides valuable insights into the state of the foot, neither offers sufficient discriminatory power on its own to reliably identify diabetic conditions. The low predictive accuracy observed in both the temperature-only and pressure-only models underscores the limitations of single-modality approaches. These results strongly suggest that the integration of multiple data sources, such as combining temperature and pressure data, is necessary to improve the predictive performance. Additionally, incorporating other physiological or biomechanical features could further enhance the accuracy of machine learning models in clinical diagnostics.
It is noteworthy that the Extra Trees Classifier, which performs exceptionally well with the combined data according to Table 8, only achieves a modest accuracy of 0.375 in the pressure-only experiment. The key insight arises when the single-modality results are compared with the experiment in which the models used both temperature and pressure data in tandem for diabetes prediction. Although some classifiers display competitive metrics in the individual-variable experiments, their predictive power falls short of the combined data prediction. The Extra Trees Classifier stands out as the most effective model in the combined experiment, achieving an accuracy of 0.9375 and a perfect AUC of 1.0.
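To show how the classification metrics relate to one another, the sketch below derives accuracy, precision, recall, F1 score, and Cohen's kappa from a confusion matrix. The counts are hypothetical: a 16-instance evaluation set with 8 instances per class is an assumption, chosen only because it reproduces the values this paper reports for the Extra Trees Classifier on the combined data (accuracy 0.9375, recall 0.875, precision 1, F1 score 0.9333, kappa 0.875). The counts are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive common binary classification metrics from confusion counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: agreement beyond what class frequencies alone would give.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts (assumption): 8 diabetic and 8 non-diabetic instances,
# with one diabetic instance missed and no false alarms.
scores = classification_metrics(tp=7, fp=0, fn=1, tn=8)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, a single misclassified instance changes accuracy by 0.0625, which indicates how sensitive the reported figures would be to individual predictions in an evaluation set of this size.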
Figure 5 provides a comparative analysis of the Extra Trees Classifier and Random Forest Classifier, focusing on their performance in predicting diabetes using both thermal and pressure data.
Figure 5a shows the feature importance for the Extra Trees Classifier, where certain features, primarily related to thermal data such as “Hallux_Ref_Max” and “Hallux_Ref_Min”, are highlighted as the most significant contributors to the model’s predictions.
Figure 5c illustrates the feature importance for the Random Forest Classifier. Unlike the Extra Trees Classifier, the Random Forest Classifier shows a more balanced distribution of importance across both the temperature and pressure features. This indicates that the Random Forest model considers a mix of both types of data—temperature (e.g., “Hallux_Min_30s”) and pressure (e.g., “1stM_APP”)—to be equally important in making accurate predictions. This balanced approach suggests that integrating both temperature and pressure data enhances the model’s ability to predict diabetes effectively.
Figure 5b,d present the decision boundaries for the Extra Trees model and Random Forest Classifier, respectively. These boundaries visually demonstrate how each model differentiates between diabetic (1) and non-diabetic (0) cases based on the input features. The Random Forest Classifier’s reliance on a combination of temperature and pressure data is reflected in the complexity and distribution of its decision boundary, showing a nuanced understanding of the data compared to the Extra Trees model.
4. Conclusions
This study presents a comprehensive investigation into the potential use of thermal and pressure data for the prediction of diabetes, spanning two sets of experiments that explored the relationships between these variables and their combined predictive power. The findings present promising avenues for the development of innovative strategies to enable early intervention and ultimately improve patient outcomes.
The first set of experiments revealed the challenges in directly correlating thermal data with plantar pressure metrics. The weak correlations observed in the regression models suggest that thermal data, when used in isolation, may not be sufficient for accurately predicting the pressure values at specific anatomical points. However, this finding presented new opportunities, allowing us to treat temperature and pressure as independent features in the subsequent diabetes prediction models. By doing so, we leveraged the unique strengths of each variable, leading to more robust and accurate predictive models.
In the second set of experiments, the integration of thermal and pressure data into machine learning models significantly enhanced the predictive capabilities for diabetes. The combined analysis demonstrated that using both temperature and pressure variables together provides a more comprehensive understanding of the physiological responses associated with diabetes. This approach paved the way for more accurate risk assessments and personalized diabetes management strategies. The integration of temperature and pressure data is expected to improve diabetes prediction because these two modalities provide complementary physiological insights. Temperature data can reveal early signs of inflammation, vascular issues, or tissue damage, which are common precursors to complications in diabetic patients, such as ulcers. On the other hand, plantar pressure measurements offer information about structural changes, foot deformities, and abnormal pressure distribution, which are also characteristic aspects of diabetic foot conditions. The previous approaches have primarily focused on one modality, limiting their ability to capture the complex interactions between these physiological factors. By combining both temperature and pressure data, the model can identify a broader range of diabetes-related abnormalities, improving the overall predictive accuracy and offering a more comprehensive assessment of foot health in diabetic patients.
The study results indicate that, while the classifiers performed respectably with the standalone variables, significantly better results were achieved when the temperature and pressure data were combined. Specifically, the Logistic Regression model achieved the best performance among the classifiers, with an accuracy of 68.75% when using only plantar pressure data. In contrast, using only temperature data, the classifiers exhibited varying degrees of accuracy, with the Naive Bayes model achieving the highest at 87.5%. Furthermore, when the models used both temperature and pressure data in tandem for the prediction of diabetes, the Extra Trees Classifier emerged as the most effective, achieving an accuracy of 93.75% and a perfect AUC score of 1. This model demonstrated strong performance on multiple metrics, including recall (0.875), precision (1), F1 score (0.9333), and kappa (0.875) values. Other models, such as the Random Forest Classifier and Extreme Gradient Boosting, showed varied performances across these metrics, highlighting differences regarding how they handled the prediction task. These results highlight the importance of integrating temperature with plantar pressure measurements in monitoring the activities of daily living regarding diabetes. The findings also emphasize the need to evaluate multiple classification algorithms to determine which is the most accurate for predicting diabetes in different participant clusters.
While this study treats each foot’s data as an input to the model, we acknowledge that this approach could introduce potential bias, especially in healthy individuals, where symmetry between the left and right feet is generally observed. However, asymmetries in the foot data between individuals with diabetes exist, which can provide valuable diagnostic information. To further investigate this, we calculated the correlation between feet across different individuals, revealing that, while there is a high correlation between the feet of the same person, subtle variations exist across individuals. These findings support the use of cross-validation with independent foot data as the dataset still exhibits enough variability to provide meaningful insights. However, this may reduce the variability in healthy individuals and could potentially overestimate the model’s performance in specific cases.
Furthermore, the study underscored the complexity of pressure distribution in feet and highlighted the importance of comprehensive data integration and advanced feature engineering. The inability of regression models to accurately predict pressure from consolidated temperature data emphasized the need for a more nuanced approach that considers the independent contributions of each variable to diabetes prediction. The introduction of additional modalities has the potential to overcome the limitations observed in this correlation analysis, paving the way for more robust and accurate pressure prediction models.
Limitations and Future Work: One limitation of this study is the potential bias introduced by treating each foot from the same individual as an input to the model. In healthy individuals, where foot symmetry is generally present, this may lead to an overestimation of the model performance. Future studies should examine whether treating both feet as correlated data is a more appropriate method when symmetry is expected. Furthermore, exploring whether foot symmetry or asymmetry persists across different demographic factors such as gender, ethnicity, and foot dominance would add depth to the analysis. Additionally, future research should explore and contrast the condition of treating both feet from the same individual as independent instances. This approach should ensure that the data from the same individual are not included in both the training and testing datasets, which would enhance the robustness and generalizability of the model by avoiding potential bias introduced by symmetry or shared characteristics between the feet.
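The recommendation that data from the same individual should not appear in both the training and testing datasets can be sketched as a group-aware split. This is a minimal illustration, not the procedure used in the study. The participant identifiers, the function name, and the 25% test fraction are hypothetical.

```python
import random


def group_train_test_split(groups, test_fraction=0.25, seed=0):
    """Split row indices so that no group appears on both sides.

    groups: one group label per row (here, one participant ID per foot).
    Returns (train_indices, test_indices).
    """
    unique_groups = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique_groups)
    n_test = max(1, round(len(unique_groups) * test_fraction))
    test_groups = set(unique_groups[:n_test])
    train_indices = [i for i, g in enumerate(groups) if g not in test_groups]
    test_indices = [i for i, g in enumerate(groups) if g in test_groups]
    return train_indices, test_indices


# Hypothetical dataset: four participants, two feet (rows) each.
participant_ids = ["P1", "P1", "P2", "P2", "P3", "P3", "P4", "P4"]

train_idx, test_idx = group_train_test_split(
    participant_ids, test_fraction=0.25, seed=42
)
train_people = {participant_ids[i] for i in train_idx}
test_people = {participant_ids[i] for i in test_idx}
print("train participants:", sorted(train_people))
print("test participants: ", sorted(test_people))
```

Splitting by participant rather than by foot keeps both feet of each person on the same side, so any left and right symmetry cannot leak from training into testing. Where scikit-learn is available, its `GroupKFold` and `GroupShuffleSplit` classes provide the same guarantee for cross-validation.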
Looking forward, future research should also focus on refining the feature engineering techniques and optimizing the model selection to fully harness the predictive potential of these combined variables. Additionally, further studies are essential to explore how insoles and other biomechanical factors impact prediction models, which could lead to new therapeutic interventions and enhance the accuracy of diabetes management tools. By continuing to build on the insights gained from this study, there is considerable potential to advance the field of diabetes prediction and improve the quality of life for individuals affected by this condition.