1. Introduction
Multiple sclerosis (MS) is a disease of the central nervous system that affects quality of life and is frequently seen in young and middle-aged people [1]. The clinical complexity of MS causes a wide range of symptoms in persons with MS (PwMS) [1,2]. In addition to clinical studies, many tools have been developed to define the severity of MS [3]. One of the most widely used tools is the Expanded Disability Status Scale (EDSS), which is used to determine the progress of disease in PwMS and to evaluate the effectiveness of clinical interventions. The EDSS ranges from 0 to 10 in 0.5-point steps and is based on eight separate functional system (FS) scales. Lower EDSS scores depend more on the physical examination, while higher EDSS scores (EDSS > 6) depend more on ambulation. Although the EDSS contains many uncertainties, it is still accepted as the standard for measuring disability in PwMS [4,5].
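The 0 to 10 range in 0.5-point steps described above can be sketched in a few lines. This is only an illustration of the scale's grid: the function name `snap_to_edss_grid` is hypothetical, and nothing here implies that the study rounded its model outputs in this way.

```python
def snap_to_edss_grid(value):
    """Clamp a continuous score to [0, 10] and round it to the nearest 0.5 step.

    Hypothetical helper for illustration; not part of the study's pipeline.
    Python's round() rounds exact ties to the nearest even integer,
    so a value such as 3.25 maps to 3.0 rather than 3.5.
    """
    clamped = min(10.0, max(0.0, value))
    return round(clamped * 2) / 2


# All values on the grid as described in the text: 0.0, 0.5, ..., 10.0
valid_scores = [step / 2 for step in range(21)]
```

A continuous regression output such as 3.3 would map to 3.5 on this grid, and out-of-range values are clamped to the ends of the scale.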
Although the EDSS has been widely used for nearly 30 years, the reliability and validity of the scale are strongly dependent on the neurologist performing the neurological examination. One of the main problems with the EDSS is its lack of reliability, as different neurologists may produce different results for the same patients in a series of neurological examinations [5]. Although an expert system has been developed to evaluate the EDSS semi-automatically, functional system scores assessed by neurologists are still required as input parameters for this expert system [6]. Therefore, a more objective measure of disability in MS is required to overcome the limitations of the EDSS.
EDSS scores increase in parallel with the decrease in the functional capacity of the individual, and a decrease in aerobic capacity is expected in individuals with reduced functional capacity [7]. Previous studies in PwMS reported a weak to moderate correlation (range −0.25 to −0.58) between aerobic capacity and EDSS [2]. Evaluation of an individual’s functional capacity with the EDSS may vary from physician to physician, whereas evaluating aerobic capacity as an objective parameter reflecting functional capacity can remove this physician-to-physician variability. It has been shown that aerobic capacity is associated with functional neuronal plasticity in PwMS [8]. Neuroplasticity can be beneficial in overcoming relapses and resisting MS progression. It has been confirmed that physical exercise supports neuroplasticity, so it is not surprising that aerobic capacity is associated with functional performance, strength, fatigue, and cognitive functions in PwMS [9,10]. Increased levels of physical activity have been shown to modulate microglial activation in the central nervous system (CNS) and increase cortical thickness in MS [11,12]. For these reasons, and because exercise or physical activity, which affects physical capacity, has neuroprotective properties, aerobic capacity can be an important indicator of disability caused by neurodegeneration in the CNS in PwMS.
Artificial intelligence and machine learning-based studies that help diagnose MS or predict its progression are becoming increasingly important. Machine learning in MS has so far mainly been used to classify participants according to different disease stages [13,14]. In one study, decision tree (DT)-based algorithms were compared with logistic regression and support vector machines for predicting disability progression in secondary progressive multiple sclerosis (SPMS). The variables included EDSS, Multiple Sclerosis Functional Composite scores, T2 lesion volume, brain parenchymal fraction, disease duration, age, and gender. In their current form, the models developed in that study were not found to be clinically useful in predicting an individual’s disease course; however, SPMS disability progression was best predicted by non-parametric machine learning [15]. In another study, which used machine learning to identify clinically isolated syndrome patients who converted to MS, the focus was placed on lesion features in magnetic resonance imaging (MRI), particularly features that describe shape and brightness. Conversion or non-conversion was predicted correctly in 71 patients based on shape features derived from computer-assisted manual segmentation masks (84.5% accuracy) [16]. There are few studies in the literature on EDSS score estimation using machine learning methods [17,18]. This study focused on generalizable and high-accuracy EDSS score prediction with machine learning methods, using data from PwMS with EDSS scores between 0 and 5. The aim of the study was to provide a tool that can be used in the clinical management of MS patients by providing a more accurate and reliable prediction of EDSS scores. To this end, several machine learning algorithms were tested, with a focus on the generalizability of the results. The models are intended to provide reliable results on real-world data.
Contribution and Paper Organization
This study proposes a machine learning approach to estimate EDSS scores using aerobic capacity data of PwMS. The contributions of the study to the literature can be summarized as follows.
- A machine learning approach to predict EDSS score using the aerobic capacity data of multiple sclerosis (MS) patients.
- Use of an automatic decision support system in EDSS evaluation.
- Use of data based not only on classical neurological symptoms but also on physiological measurements in EDSS evaluation.
Regarding the organization of this study, Section 1 presents the purpose and scope of the study and a review of the related literature. Section 2 details the dataset used in the study, the steps followed for the development of the proposed machine learning models, and the performance evaluation criteria. Section 3 presents the experimental results obtained from the machine learning algorithms, and Section 4 provides a comprehensive analysis of the potential of the findings. Finally, Section 5 summarizes the study, states its contributions, and makes recommendations for future work.
3. Experimental Results
This study was based on accurately estimating the EDSS score with machine learning methods using aerobic capacity data from patients with MS. The estimates were evaluated using the MAE, MAPE, MSE, RMSE, and R² metrics. The experimental results of the study are given in Table 5, with the evaluation metrics for each fold of the k-fold cross-validation.
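For reference, the five metrics named above can be computed as in the following minimal sketch. It uses the standard textbook definitions and is not the study's own code. The function name `regression_metrics`, the choice to skip samples whose true value is 0 when computing MAPE, and the example values are assumptions made for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (in %), MSE, RMSE, and R^2 for paired true/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # Assumption: MAPE is undefined for a true value of 0 (e.g. EDSS 0),
    # so those samples are skipped here.
    nonzero = [(t, e) for t, e in zip(y_true, errors) if t != 0]
    if nonzero:
        mape = 100.0 * sum(abs(e / t) for t, e in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical EDSS values for illustration only (not data from the study).
y_true = [1.0, 2.0, 3.5, 4.0, 5.0]
y_pred = [1.5, 2.0, 3.0, 4.5, 4.5]
metrics = regression_metrics(y_true, y_pred)
```

MAE, MSE, and RMSE are in EDSS points (or squared points), MAPE is a percentage, and R² is unitless, which is why the metrics are interpreted against different reference ranges.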
When Table 5 is analyzed, it can be seen that the algorithms used in this study produced meaningful results. For the MSE, RMSE, and MAE metrics, the lowest errors, which were close to 0, were achieved by the XGBoost and CatBoost algorithms. The mean values of the results obtained across the k folds for each algorithm are given in Table 6.
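The relationship between per-fold results and their mean can be sketched as follows. This is an illustration under stated assumptions: the number of folds (`k=5`), the shuffling seed, the placeholder mean-predicting model, and the data are all hypothetical and are not taken from the study, which used tree-based and boosting algorithms.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate_mae(x, y, fit, predict, k=5, seed=0):
    """Return the MAE of each held-out fold and the mean over folds."""
    fold_mae = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # Fit on the training folds only, then score on the held-out fold.
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        abs_errors = [abs(y[i] - predict(model, x[i])) for i in test_idx]
        fold_mae.append(statistics.fmean(abs_errors))
    return fold_mae, statistics.fmean(fold_mae)


# Placeholder model (assumption): always predicts the mean of the training targets.
def fit_mean(x_train, y_train):
    return statistics.fmean(y_train)


def predict_mean(model, x_row):
    return model


# Hypothetical data for illustration only (not from the study).
x = [[v] for v in range(10)]
y = [1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
per_fold, mean_mae = cross_validate_mae(x, y, fit_mean, predict_mean, k=5)
```

Here `per_fold` corresponds to one row per fold, as in a per-fold results table, and `mean_mae` to the averaged value reported for each algorithm.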
When Table 6 is analyzed, the MAE of 0.26 obtained by the extreme gradient boosting (XGBoost) and decision tree (DT) algorithms was close to 0, indicating a low error. The gradient boosting (GBM) and CatBoost algorithms were less successful than the other algorithms on the MAE metric. A MAPE value between 10% and 20% is commonly interpreted as indicating a good prediction model.
When the MAPE values obtained in the study are analyzed, all algorithms fell within this range. The XGBoost algorithm gave the best result, with a MAPE of 16%, consistent with its MAE. The MAPE was 17.2% for the GBM algorithm and 18% for the DT algorithm, so the GBM algorithm performed better than the DT algorithm on this metric. The MSE, RMSE, and R² metrics gave similar results to the other metrics, and according to these results the most successful algorithm was XGBoost, followed by GBM, with DT ranking third. The CatBoost algorithm performed less well than the other algorithms on the metrics obtained. Overall, Table 6 indicates that each metric fell within its expected value range and that the prediction models were highly accurate.
In line with the results obtained from the performance metrics, the test data values and the predicted values were compared. Figure 3 shows a line graph of the test data given to the model and the outputs predicted for these inputs.
The prediction values given in Figure 3 support the results in Table 6. In Figure 3, the predictions produced by the XGBoost and GBM models were closer to the actual values than those of the other models. The XGBoost model made predictions close to the true value at most of the sample data points, deviating only at some points. Similarly, the GBM model performed well and followed the general trend. The DT and CatBoost models showed more deviations at some extreme data points than the other two models, and at some data points these two algorithms produced predictions that differed from those of the others.
4. Discussion
It is recommended that the actual walking distance be measured when evaluating the EDSS [4]. However, due to time and/or logistical constraints, the walking distance is often determined from the statements of the patients. In studies of patients with conditions other than MS, the maximum walking distances declared by the patients did not match their actual maximum walking distances. Considering these studies, aerobic capacity data, which are related to walking distance and reflect functional capacity in PwMS, were used in this study [22]. The aim was to eliminate individual differences in determining walking distance. When the estimated values and performance metrics in this study were examined together, results close to the real EDSS scores were obtained, and this aim was achieved.
Cardiopulmonary fitness is considered an important determinant of health and performance and is closely related to physical activity level and sedentary time [23]. VO₂max is a sensitive measure and, intuitively, cardiopulmonary fitness is expected to decrease as disability increases in PwMS [24]. Due to extensive damage to the CNS, the disease is characterized by various sensory, motor, cerebellar, and cognitive dysfunctions. These dysfunctions limit physical activity behavior in PwMS and can subsequently lead to deconditioning [25,26]. Many cross-sectional studies in small samples have reported correlations between disability level on the EDSS and VO₂max in PwMS [27,28,29]. The slope of the relationship between VO₂max and EDSS was such that a one-point increase in EDSS was associated with a decrease in VO₂max of 2.6 mL·kg⁻¹·min⁻¹ [2]. Physical activities that affect aerobic capacity provide neuroprotection in MS, possibly by directly affecting the brain. Several studies in healthy individuals proposed that the main factors responsible for enhancing neuroplasticity associated with improved brain function after aerobic exercise (AE) are transient increases in glutamatergic-mediated intracortical excitation and decreases in γ-aminobutyric acid (GABA)-mediated intracortical inhibition [28,29,30,31]. Aerobic capacity may therefore be an important predictor of disability severity.
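The slope reported in [2] can be read as simple arithmetic, as in the sketch below. This is only a worked illustration of a group-level association: the function name is hypothetical, and applying the slope linearly to any EDSS difference, or to an individual patient, is an assumption rather than a finding of the cited work.

```python
# Slope reported in [2]: change in VO2max (mL·kg^-1·min^-1) per one-point increase in EDSS.
VO2MAX_SLOPE_PER_EDSS_POINT = -2.6


def expected_vo2max_difference(edss_difference):
    """Average VO2max difference associated with a given EDSS difference.

    Assumes the relationship is linear over the range considered.
    """
    return VO2MAX_SLOPE_PER_EDSS_POINT * edss_difference


# Example: an EDSS difference of 2.5 points corresponds, on average,
# to a VO2max that is 6.5 mL·kg^-1·min^-1 lower.
difference = expected_vo2max_difference(2.5)
```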
Furthermore, in PwMS, aerobic capacity has been reported to have a possible protective effect against cardiovascular disease risk and structural decline of brain tissue, and to be associated with better walking performance and improved cognitive processing speed, while impaired aerobic capacity in healthy individuals has been reported to be associated with functional limitations that may hinder independent living [8,11,32,33]. For these reasons, aerobic capacity is considered an important physiological measure in PwMS.
It is widely known that applying the right treatment following an early and accurate diagnosis prevents disability and reduces healthcare costs [34,35]. For these reasons, it is highly advantageous to use advanced technologies to estimate the EDSS from aerobic capacity data. In addition, testing walking distance in clinical practice is time-consuming and difficult to implement, and this distance is often reported by the patient, which increases the need for new technology-based assistive methods. In the literature on predicting EDSS values, Alves et al. aimed to predict the EDSS scores of MS patients with machine learning algorithms using clinical notes provided by neurologists. Data from a total of 13,766 patients were used within the scope of the study, 684 of which had EDSS scores obtained from the OM1 MS Registry. The prediction was performed with an XGBoost estimator. They reported a Spearman R of 0.75, a Pearson R of 0.74, an AUC of 0.91, a positive predictive value of 0.85, and a negative predictive value of 0.85 [17]. Yang et al. aimed to calculate the EDSS score from patients’ electronic health records using natural language processing. They considered 16,441 medical records of 4808 patients who received care at an MS outpatient clinic in Canada. They stated that the combined keyword model achieved results in accordance with their goals and that the approach could be automated to extract clinically relevant information from unstructured notes [18]. The present study differs from these previous ones in that it predicts the EDSS score with machine learning methods using aerobic capacity data. The aim was to obtain the EDSS score objectively by analyzing aerobic capacity data (VO₂max, VePeak, RfPeak, HRPeak, FeO₂Peak, LoadPeak, EEPeak) in PwMS. This approach is intended to remove individual differences in determining walking distance, to increase the reliability of the EDSS, and to avoid differences of opinion among physicians in EDSS score estimation. Many studies in the current literature focused on estimating the EDSS score from clinical data or patient statements. Such estimates may be inconsistent due to individual differences, and these approaches were often based on subjective assessments of patients. Furthermore, because clinical data are often limited and time-dependent, they may be insufficient for long-term and large-scale predictions. In our study, EDSS scores were predicted by machine learning techniques using reproducible and objective biometric measures such as aerobic capacity data. It is thought that the proposed method will enable more consistent and reliable results to be obtained in clinical applications, supporting traditional methods that rely on subjective evaluations.
This study has several limitations. Firstly, since the study included only relapsing-remitting MS (RRMS) or SPMS patients with an EDSS between 1 and 5, the results cannot be generalized to all individuals with MS. The EDSS assesses disease progression based on the physical abilities of MS patients and ranges from 0 (normal examination) to 10 (death). The range used in this study covered the mild and moderate stages of the disease, which restricts the applicability of the results to individuals with advanced disease. To overcome this limitation, future work should include a wider EDSS range and expand the scope of the study.
Secondly, although the results of the study were quite satisfactory, higher accuracy would likely be obtained by including an equal number of PwMS for each EDSS value. A balanced sample would ensure equal representation of each EDSS group and could improve the generalizability and accuracy of the results. Future studies should address this unbalanced sampling so that the findings can be applied to a wider disease population.
Another limitation is that, although other diseases that could affect the aerobic capacity of the participants were excluded, the individual differences and lifestyle patterns of the participants were not taken into account. As this is pioneering research, further studies that consider the individual characteristics of the participants in detail are needed. Lastly, artificial intelligence models require high-quality, accurate, and abundant data. The dataset used in this study was collected by expert physicians in the field and is therefore considered accurate. However, the number of data points could be increased, which is expected to enhance the performance of the prediction model.