Article

Identifying Correlated Functional Brain Network Patterns Associated with Touch Discrimination in Survivors of Stroke Using Automated Machine Learning

1
Occupational Therapy, School of Allied Health, Human Services and Sport, La Trobe University, Melbourne, VIC 3086, Australia
2
Neurorehabilitation and Recovery Group, The Florey, Melbourne, VIC 3084, Australia
3
HitIQ, Melbourne, VIC 3205, Australia
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(8), 3463; https://doi.org/10.3390/app14083463
Submission received: 28 January 2024 / Revised: 10 April 2024 / Accepted: 15 April 2024 / Published: 19 April 2024
(This article belongs to the Special Issue Artificial Intelligence (AI) in Neuroscience)

Abstract

Stroke recovery is multifaceted and complex. Machine learning approaches have the potential to identify patterns of brain activity associated with clinical outcomes, providing new insights into recovery. We aim to use machine learning to characterise the contribution of and potential interaction between resting state functional connectivity networks in predicting touch discrimination outcomes in a well-phenotyped, but small, stroke cohort. We interrogated and compared a suite of automated machine learning approaches to identify patterns of brain activity associated with clinical outcomes. Using feature reduction, the identification of combined ‘golden features’, and five-fold cross-validation, two golden feature patterns emerged. These golden features identified patterns of resting state connectivity involving interactive relationships: (1) the difference between right insula and right superior temporal lobe correlation and left cerebellum and vermis correlation; and (2) the ratio between right inferior temporal lobe and left cerebellum correlation and left frontal inferior operculum and left supplementary motor area correlation. Our findings demonstrate evidence of the potential for automated machine learning to provide new insights into brain network patterns and their interactions associated with the prediction of quantitative touch discrimination outcomes, through the automated identification of robust associations and golden feature brain patterns, even in a small cohort of stroke survivors.

1. Introduction

Stroke recovery and rehabilitation is a critical field of practice in healthcare, aiming to restore and improve brain function, independence, and overall quality of life for individuals affected by stroke [1]. While traditional approaches in rehabilitation have shown efficacy [2], recent advancements in machine learning, particularly artificial neural networks (ANN), open new avenues for personalised and optimised interventions [3,4]. The demand for improved stroke rehabilitation methodologies is underscored by the need for solutions that consider the intricacies of each person’s strengths and impairments and tailor interventions accordingly.
In this paper, we explore the intersection of new insights from machine learning and stroke recovery. In particular, we explore the relationship between advanced brain imaging and clinical outcomes in a well-phenotyped stroke cohort [5]. Evidence suggests that disruptions to network connectivity predict impairment in multiple behavioural domains [6]. However, a common challenge when working with stroke survivors is access to large numbers of survivors, especially when detailed neuroimaging and clinical outcomes are required [7]. In this paper, we address and highlight the challenges and significance of data size, model complexity, and hyperparameter tuning in stroke research.
Artificial intelligence (AI) is a broad concept that refers to the development of computer systems that are capable of simulating human intelligence [8]. Machine learning (ML) is a subset of AI, focused on training algorithms to learn patterns from data and make predictions or decisions without explicit programming [9]. Models are investigated for their predictive power, with this index providing a means of benchmarking the value of the model generated. In essence, ML is a key technology within the field of AI and contains the concept of an error metric that the model can reference to gauge improvement in successive training runs [10].
Automated machine learning (AutoML) refers to the process of automating the end-to-end process of applying ML to real-world problems [11]. It involves automating tasks such as data pre-processing, feature engineering, model selection, hyperparameter tuning, and model evaluation. The goal is to create ML models with minimal manual intervention, making it more accessible for individuals with limited ML expertise and reducing the time taken to arrive at an accurate and reliable model. The final state is a balance between the competing goals of specificity and bias.
Machine learning (ML) and automated machine learning (AutoML) represent transformative approaches in the analysis of complex clinical data and the prediction of patient outcomes. ML and AutoML can be used to identify interactions between several variables and detect useful information in clinical and imaging data [12]. Machine learning models, such as random forest, logistic regression, and deep neural networks, have been increasingly applied in medical research, including stroke rehabilitation [13]. For example, these models are utilized for predicting functional recovery [14], predicting favourable outcomes following an intervention [15], and individualising stroke rehabilitation [16], typically employing a large amount of clinical and imaging data as input [13,17]. Additionally, convolutional neural networks (CNNs), a subset of deep learning models, have been employed for image analysis in stroke research, showcasing the ability to improve accuracy in predicting motor functions and aiding in rehabilitation planning [13,18]. Together, these studies provide evidence of robust brain–behaviour associations and demonstrate the value and potential of using machine learning approaches, particularly when large volumes of data are available. However, the selection and application of ML models are also often reliant on known relationships and prior datasets to train models [13].
In the domain of stroke research, where data are often scarce and highly dimensional, AutoML frameworks present a promising solution. These frameworks automate the process of model selection, feature engineering, and hyperparameter tuning, aiming to develop high-performing predictive models with minimal human intervention [19]. This has particular value when such parameters are not previously known or established for a particular population, such as stroke. By benchmarking various AutoML tools against traditional ML approaches, studies have demonstrated the potential of AutoML in enhancing disease prediction, including stroke outcome prediction, based on clinical and imaging data [19].
In the current study, we investigated different ML approaches to examining brain networks (functional brain imaging data) and their ability to predict clinical outcomes. The stroke cohort investigated is known as the START cohort (STroke imAging pRevention and Treatment) [5]. The cohort includes survivors of stroke over the first year post-stroke, from within hours after their stroke to 3 days, 3 months, and 12 months post-stroke. A subset of the START cohort received advanced imaging at 3 months and 12 months post-stroke. This included structural and functional MRI brain imaging. Here, we focus on functional brain imaging data, specifically resting state functional connectivity, given its potential to predict clinical outcomes [5]. Disruptions to functional resting state network connectivity have predicted impairment across multiple behavioural domains, with both specific and general changes in network patterns [6]. For example, Siegel et al., found an association between specific functional resting state brain networks and attention, visual memory, verbal memory, and language domains in a sample of 132 stroke survivors. They did not, however, investigate a relationship with somatosensation.
We focus on the prediction of quantitative touch discrimination in the hands (contralesional and/or ipsilesional) of stroke survivors as the clinical outcome of interest at 3 months and 12 months post-stroke. The impairment of body sensations (somatosensation), including changes in the ability to discriminate touch sensations, is experienced by one in two survivors of stroke [20]. Evidence-based therapy is available to treat this impairment [21]. However, the potential for new personalised and optimised interventions based on the knowledge of neural networks and interactions between them has not been fully investigated. Evidence of altered functional connectivity has been reported in a stroke cohort within four pre-defined regions of the somatosensory network, i.e., the primary (S1) and secondary (S2) somatosensory cortices in both hemispheres [22]. This cohort of chronic stroke survivors had ongoing, and often marked, impairment of somatosensation, specifically touch discrimination. In comparison, the current stroke cohort has a wide range of impairments, many patients are defined as having mild stroke according to the National Institute of Health Stroke Scale (NIHSS) [23], and touch discrimination was tested at specific times post-stroke (3 and 12 months). We use the tactile discrimination test (TDT) to quantify touch discrimination impairment [24,25]. To explore the complex effects of stroke on brain networks, this study introduces an innovative analytical approach centered on the identification and analysis of ‘golden features’. These features, derived from the mathematical manipulation of brain area correlations, represent a novel method of examining the multi-level interactions within post-stroke brain networks. The concept of golden features represents a pivotal innovation, particularly in the context of stroke rehabilitation research. 
These features are constructed through the mathematical manipulation (addition, subtraction, multiplication, and division) of existing correlations between brain areas. This methodology allows for the extraction of nuanced patterns of connectivity that might be obscured in direct, untransformed correlations [26,27]. Essentially, ‘golden features’ embody the complex interplay between different regions and networks of the brain, offering a more dynamic perspective on how stroke impacts neural networks. By defining and leveraging these ‘golden features’, our research aims not only to provide deeper insights into the impact of stroke on complex and interacting brain networks but also to pave the way for more personalised and effective rehabilitation strategies.
In summary, our overall aim is to characterise the contribution of and potential interaction between functional connectivity brain networks in predicting clinical outcomes after stroke. Our results may lead to the possibility of personalised approaches to achieve enhanced outcomes for stroke survivors. Specifically, we use machine learning to characterise the contribution of functional connectivity networks in predicting touch discrimination outcomes in a well-phenotyped, but small, stroke cohort, the START cohort [5]. ML is selected as the approach given its ability to recognise patterns in complex data. While we recognise that stroke is a multifaceted condition, and a range of factors likely impact clinical outcomes, our focus here is on use of AutoML to recognise robust brain patterns that would otherwise be hidden in the complexity of brain data to maximise prediction and interpretability.
In stroke research, due to the population and heterogeneous impairments experienced following stroke, numbers are often small, and it is difficult to create large datasets for training models. This makes model selection and feature reduction much more critical than in fields where more data are available. Therefore, in this study, we employed the MLjar AutoML package [28], which compares multiple variations of random forest, extra trees, LightGBM, Xgboost, CatBoost, and ANN ML algorithms to a baseline value. Our approach involves four main steps: preprocessing and the selection of initial models correlating brain imaging and clinical data; the identification and selection of features (brain regions that are correlated) that predict clinical outcomes, including new ‘golden features’ based on a combination of selected features; the optimisation of algorithms; and final ensemble or the stacking of multiple models. Features selected will then be interpreted relative to putative brain networks, and implications will be considered. The potential to glean new insights through the identification of ‘golden features’ that are derived from complex brain connectivity data and across multiple levels of relationship is clearly evident.

2. Materials and Methods

2.1. Participants

Participants from the STroke imAging pRevention and Treatment (START) study [5] that had advanced neuroimaging and quantitative clinical outcome data, including the Tactile Discrimination Test (TDT), were included in the study. Participants were recruited consecutively from metropolitan hospitals in Melbourne that had specialised stroke units. Eligibility criteria required participants to be diagnosed with acute ischaemic stroke, aged 18 years or older, English-speaking, and with no significant premorbid disability as determined by a modified Rankin Scale (mRS) score 2 . All participants who had advanced brain imaging scans at 3 months and 12 months post-stroke (n = 60) were included in the sample. Demographic and clinical information about the study participants can be seen in Table 1.

2.2. Tactile Discrimination

The tactile discrimination test (TDT) is a quantitative measure of touch or texture discrimination designed for use with survivors of stroke [25]. The test involves assessing an individual’s ability to perceive and distinguish between finely graded plastic texture grids using a three-alternative forced choice design [25]. The stimulus is a texture grating marked by ridges at set spatial intervals. Five different sets range from small to large differences, with each presented 5 times in a random order. The participant is required to tactually explore the sets of triplet texture grids with their preferred finger (index or middle) and indicate the one that is different. Testing typically takes 10–20 min per hand depending on sensorimotor control and rests required. This task evaluates the extent of somatosensory impairment in terms of the precision and accuracy with which individuals can perceive and discriminate touch stimuli. The TDT is scored as percent correct response using the area under the curve (AUC), while accounting for chance response [29]. The probability of correct response is mapped for each of the stimulus triplet sets. The standard grid for each stimulus set is 1500 µm, and the five comparison stimuli are 1550, 1700, 2100, 2600, and 3000 µm. Scores above chance range from 0 to 100, and 66.1 AUC is the defined criterion of abnormality, with lower scores indicating poorer performance [22,25,29]. The TDT has high test–retest reliability, normative standards, and excellent discriminative properties [25]. The normative standards and scoring were recently updated, including for the 25 stimuli test version used in this study [24]. The association between touch discrimination capacity, using the TDT, and brain regions and networks has been previously established in neuroimaging studies with older healthy controls [30] and in a different cohort of stroke survivors [16,22,31,32].

2.3. Design

Each participant was scanned at two time points, 3 months (90 ± 7 days) and 12 months (365 ± 7 days) post-stroke [5], and the TDT was administered at the time of the scan or within 48 h. The TDT was administered to the left and right hand, typically with the ipsilesional hand tested first. TDT scores of the first tested hand were used in the current analysis. This included scores ranging from 7.69 AUC to 100 AUC, with the right hand tested first for 36 participants and the left for 24 participants.

2.4. Resting State Preprocessing and Analysis

A customised data cleaning pipeline optimised for the preprocessing of stroke data was constructed [22]. The pipeline used functions from DCMstack https://github.com/moloney/dcmstack (accessed on 6 April 2017), Analysis of Functional NeuroImages (AFNI) [33], SPM12 v6685 http://www.fil.ion.ucl.ac.uk/spm/software/spm12/ (accessed on 6 April 2017), Advanced Normalization Tools (ANTs) [34], Numpy [35], Scipy [36], and Nibabel https://github.com/nipy/nibabel (accessed on 6 April 2017), combined under the NiPype framework [37]. Anatomical image preprocessing consisted of segmentation using the new segmentation method and coregistration to the mean EPI image [22]. White matter and cerebrospinal fluid (CSF) masks were created by thresholding the segmented white matter and CSF images at 0.99 and eroding two times using a 3 × 3 × 3 mm structure element to minimise partial volume effects. Normalisation to Montreal Neurological Institute (MNI) space was achieved by transforming an MNI space 3 × 3 × 3 mm template image to subject space and then using the inverse transformation matrix to warp the T1 image from subject space to MNI space. Stroke participants had their FLAIR and lesion mask included in the pipeline, which were coregistered to the T1 image and coregistered to the EPI image [22].
Prior to preprocessing, we conducted a systematic, visual quality inspection of each participant’s resting state data. Participants were excluded if their data were shown to have consistent, excessive motion or noticeable distortions. No participants were excluded on this basis. The preprocessing of EPI data included despiking, slice timing correction to the central slice, and realignment to the first volume. Motion- and physiological-related artefacts were regressed from the data using the Friston 24 parameter model [38] and aCompCor [39], taking the top five components each for white matter and CSF mask extracted signals. The global signal from within the brain mask was also regressed. This can help attenuate residual motion and physiological effects not removed by prior cleaning [40].

2.5. Identification and Definition of Brain Regions

The Automated Anatomical Labelling (AAL) Atlas [41] is a widely used brain atlas in neuroimaging research. It provides a predefined set of 116 anatomical regions, each associated with specific brain structures or functional areas. Researchers use the AAL Atlas to partition the brain into distinct regions, facilitating the analysis and interpretation of neuroimaging data, particularly when using techniques such as functional magnetic resonance imaging (fMRI). Functional correlation matrices were generated for each participant and timepoint using the preprocessed functional data.
Correlation matrices constructed from fMRI scans represent the functional connectivity between different brain regions. In the context of fMRI, these matrices capture the degree to which the blood oxygen level-dependent (BOLD) signal fluctuations in one brain region correlate with those in another. By examining these correlation patterns, researchers gain insights into the synchronised activity and communication between different brain areas, aiding in the understanding of functional networks and neural processes.
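As a minimal sketch of how such a matrix can be formed, the example below computes pairwise correlations between regional time series. The use of Pearson correlation, the function names, and the toy values are assumptions made for illustration; the text does not state which correlation measure or implementation was used in the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(series):
    """series: one list of signal values per brain region."""
    n = len(series)
    return [[pearson(series[i], series[j]) for j in range(n)] for i in range(n)]

# Toy data: 3 hypothetical regions, 5 time points each (not real BOLD data).
toy_series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with region 0
    [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as region 0 rises
]
matrix = correlation_matrix(toy_series)
```

With 116 AAL regions, the same procedure would give a symmetric 116 × 116 matrix per participant and time point, with ones on the diagonal.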

2.6. Correlation Matrices to ML Dataset Preprocessing

The correlation matrices for each of the 60 participants were initially flattened, and repeated correlations (i.e., the upper triangle) and the central diagonal were removed. Across the two time points, this gave 120 examples of scans, each with a corresponding TDT score from the affected hand. Thus, the total data for training were 120 rows × 6670 columns, and the targets for prediction and scoring were 120 corresponding TDT scores. A 75%/25% train/test split was performed with sklearn: model selection, hyperparameter tuning, and the creation of additional features used 75% of the data, and the remaining 25% was held aside as testing data. The data were stratified by time point to ensure a similar mix of 3 month and 12 month data in both training and testing data. The information about the data collection time point was not included in the training data.
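The flattening and splitting steps can be sketched as follows. This is an illustration under stated assumptions rather than the study code: the function names, the toy matrix, and the time point labels are hypothetical, and a small standard-library routine stands in for the sklearn split mentioned in the text.

```python
import random

def flatten_lower_triangle(matrix):
    """Keep each region pair once: entries strictly below the diagonal."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]

N_REGIONS = 116  # AAL atlas regions, as stated in the text
n_features = N_REGIONS * (N_REGIONS - 1) // 2  # unique region pairs

# Toy symmetric 4 x 4 matrix (illustrative values only).
toy = [
    [1.0, 0.1, 0.2, 0.3],
    [0.1, 1.0, 0.4, 0.5],
    [0.2, 0.4, 1.0, 0.6],
    [0.3, 0.5, 0.6, 1.0],
]
row = flatten_lower_triangle(toy)

def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split row indices so each label keeps the same train/test proportion."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(set(labels)):
        idx = [i for i, value in enumerate(labels) if value == label]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train += idx[:cut]
        test += idx[cut:]
    return sorted(train), sorted(test)

# 60 scans at each time point; labels are used only for stratification.
timepoints = ["3m"] * 60 + ["12m"] * 60
train_idx, test_idx = stratified_split(timepoints)
```

The pair count for 116 regions works out to 116 × 115 / 2 = 6670, matching the number of columns reported above, and the split leaves 90 rows for training and 30 for testing.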

2.7. Auto ML Approach

The AutoML research model as described in the MLjar-supervised package is depicted in Figure 1. The figure depicts the four main process steps involved: preprocessing and initial models; feature engineering; optimization; and ensembling.
The MLjar-supervised package is a wrapper for Auto-sklearn, which is an AutoML module from the widely used Scikit-Learn package and was chosen for its thorough management of ML operations, i.e., ML-Ops ability [42]. It has a multi-tiered approach that is thorough and efficient. The first step is explain, which identifies suitable models from baseline, linear, decision tree, random forest, Xgboost, ‘neural network’ algorithms, and ensemble and establishes a baseline accuracy. The second step is perform, which uses five-fold cross-validation and includes learning curves and importance plots in reports.
A key step in processing small data sets is feature reduction [43]. The software package selected involves both feature reduction and the identification of ‘golden features’, defined as a combination of emergent features that improve the model [44]. The main steps involved in the process of selecting the best algorithms are ‘explain’ and ‘perform’. Explain performs an initial quick exploratory data analysis with various models and compares them to a baseline value. This identifies which models are likely to perform well. Perform (which was run with a time budget of 24 h) carries out data preprocessing, the identification of ‘golden features’, the selection of high-performing features, and parameter tuning. A further value of using this approach is that it permits automated comparison across algorithmic approaches that are filtered and tested and, as purported on the website, may be used and interpreted by ‘machine learning non-experts’ (https://mljar.com/automl/, accessed on 5 August 2023).

2.8. Golden Features

Golden features is the name given to high-performing derived features by the MLjar-supervised package. They are combinations and permutations of pairs of features from the dataset. Pairs of original features are added, subtracted, divided, and multiplied to create new features. These new features are then assessed by decision tree analysis and those that perform well are added to the data.
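The idea can be illustrated with a short sketch. It is not the MLjar-supervised implementation: the function names, column names, and values are hypothetical, and candidates are ranked here by absolute Pearson correlation with the target as a simple stand-in for the decision tree assessment described above.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def candidate_features(columns):
    """Combine every pair of columns by sum, difference, product, and ratio."""
    derived = {}
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        derived[f"{name_a} + {name_b}"] = [x + y for x, y in zip(a, b)]
        derived[f"{name_a} - {name_b}"] = [x - y for x, y in zip(a, b)]
        derived[f"{name_a} * {name_b}"] = [x * y for x, y in zip(a, b)]
        if all(y != 0 for y in b):  # skip ratios that would divide by zero
            derived[f"{name_a} / {name_b}"] = [x / y for x, y in zip(a, b)]
    return derived

# Hypothetical connectivity values for five scans and a toy outcome
# that depends on the difference between the two connections.
columns = {
    "conn_1": [0.2, 0.5, 0.1, 0.7, 0.4],
    "conn_2": [0.1, 0.1, 0.3, 0.2, 0.6],
}
target = [1.0, 4.0, -2.0, 5.0, -2.0]

derived = candidate_features(columns)
scores = {name: abs(pearson(values, target)) for name, values in derived.items()}
best = max(scores, key=scores.get)
```

In this toy case the difference feature ranks first, because the outcome was constructed from the difference; neither original column alone carries that relationship as directly.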

2.9. Feature Selection

Feature selection was then performed on the original features and the created golden features using permutation-based feature importance, a robust statistical method that assesses the importance of individual features within predictive models [45,46]. The feature ‘importance’ score is calculated by assessing the contribution of a feature to the prediction error, measured as the Root Mean Squared Error (RMSE) of the model. Permutation testing involves systematically shuffling the values of each feature across the dataset to break the original association between the features and the target outcomes. The performance of the model is then re-evaluated with the permuted data, and the change in model accuracy is measured. This process is repeated multiple times to obtain a distribution of performance scores for each feature, thereby estimating the significance of each feature’s contribution to the model’s predictive power. The importance of permutation testing lies in its non-parametric nature, offering a model-agnostic and reliable measure of feature relevance without assuming an underlying distribution. This method is crucial for our analysis as it helps in identifying ‘golden features’ that hold the most predictive value for understanding touch discrimination outcomes post-stroke, ensuring that our findings are not artefacts of random chance but are truly indicative of underlying neural processes [47]. By employing permutation testing, we ensure the robustness and reliability of the features identified by our AutoML approach, providing a solid foundation for the subsequent interpretation of these features within the context of stroke rehabilitation.
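A minimal sketch of permutation-based importance is given below, assuming importance is taken as the mean increase in RMSE after shuffling one feature. The function names, the fixed toy model, and the data are hypothetical and are not drawn from the study or from the MLjar-supervised code.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)  # break the link between this feature and y
            X_perm = [row[:col] + [value] + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            increases.append(rmse(y, [predict(row) for row in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy model that uses feature 0 only; feature 1 is irrelevant noise.
X = [[float(i), float((i * 7) % 5)] for i in range(10)]
y = [2.0 * row[0] for row in X]

def predict(row):
    return 2.0 * row[0]

importances = permutation_importance(predict, X, y)
```

Shuffling the feature the model relies on raises the error, while shuffling the unused feature leaves it unchanged, so the first importance is positive and the second is zero.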

2.10. Hyperparameter Tuning

The goal of hyperparameter tuning is to find the combination of parameters that results in the most effective and accurate model for a given task or dataset. Hyperparameter tuning involves optimising the configuration settings, known as hyperparameters, of a machine learning model to achieve better performance of the predictive model [48]. These hyperparameters are not learned from the data but are set prior to the training process. Achieving this goal is important as correct hyperparameter tuning is at least as important as model architecture selection [49]. Finally, five-fold cross-validation was used in the training runs, and RMSE was used as the error metric, calculated as the square root of the mean squared difference between the model’s predicted TDT score and the actual TDT score.
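For reference, RMSE is conventionally defined as below. The symbols are introduced here for illustration and do not appear in the original text: n is the number of cases, y_i is the observed TDT score for case i, and ŷ_i is the model's predicted TDT score.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}}
```

Because the differences are squared before averaging, large errors are penalised more heavily than small ones, and the result is expressed in the same units as the TDT score.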

2.11. Cross-Validation

To ensure the reliability and generalisability of our machine learning models, we employed five-fold cross-validation, a widely recognised method for evaluating the performance of predictive models [50,51]. This technique involves partitioning the original dataset into five equal or nearly equal subsets. In each iteration of the process, four subsets are combined and used to train the model, while the remaining subset is used as a test set to evaluate model performance. This procedure is repeated five times, with each subset serving as the test set exactly once. The advantage of five-fold cross-validation lies in its ability to provide a more accurate estimate of model performance across different subsets of the data, minimising the potential bias that could result from a single train test split. Furthermore, this approach maximises both the training and testing data’s utilisation, essential in contexts like stroke research where datasets may be limited in size. By averaging the performance metrics across the five iterations, we obtain a comprehensive overview of the model’s predictive performance and robustness, ensuring that our findings are not merely a result of particular data partitioning but are reflective of the model’s true capability to generalise across unseen data [52]. The application of five-fold cross-validation in our analysis supports the reliability of the ‘golden features’ identified by the AutoML framework, reinforcing the validity of our conclusions regarding their significance for post-stroke touch discrimination outcomes.
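The partitioning described above can be sketched as follows. The function name and the random seed are hypothetical, and the figure of 90 rows assumes that cross-validation is applied to the 75% training portion of the 120 scans; the fold assignment used in the study is not described in the text.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal subsets
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# 90 training rows split into five folds of 18 validation rows each.
splits = list(k_fold_indices(90, k=5))
```

Each row is used for validation exactly once and for training four times; averaging the validation RMSE over the five folds gives the cross-validated score.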

3. Results

The selection of models for inclusion in the AutoML approach is first determined relative to a baseline RMSE score. While this baseline model is not expected to perform well, it is performed to provide an error score that the score of more complex models can be compared to. A baseline score of RMSE 20.22 was established by calculating the error when the mean TDT score is used as a prediction for every case; this score is represented by a red line in Figure 2.
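A mean-prediction baseline of this kind can be sketched in a few lines. The function names and the toy scores are hypothetical; the values are not study data and do not reproduce the reported baseline of 20.22.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def baseline_rmse(y_train, y_eval):
    """Error obtained when every case is predicted as the training mean."""
    mean_score = sum(y_train) / len(y_train)
    return rmse(y_eval, [mean_score] * len(y_eval))

# Hypothetical TDT-like scores, used for both fitting and evaluation here.
toy_scores = [20.0, 40.0, 60.0, 80.0]
score = baseline_rmse(toy_scores, toy_scores)
```

Any trained model is only informative if its RMSE falls below this value, since the baseline uses no imaging information at all.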
LightGBM, Xgboost, CatBoost, neural network, and random forest models were trained and evaluated. No linear models passed the requirements of the exploratory stage of MLjar, so none were included in the models trained in the ‘perform’ stage.
A comparison between the RMSE score achieved, grouped by model type, can be seen in Figure 2. These results show that ANN performed poorly, or at least inconsistently, compared to other model architectures and often performed worse than baseline (RMSE 20.22). This is not surprising given the limited amount of data for training, as neural networks are known to have poor performance without a large amount of training data. Light gradient boosting machine (LightGBM) performed consistently well, to the point where an attempt at creating an ensemble, by combining high-performing trained models, did no better than a single LightGBM model with feature selection and golden features achieved on its own.
The four most important features across the five folds for the best-performing LightGBM model with golden features, selected features, and hill climbing steps applied (23_LightGBM_GF_SF_HC1) can be seen in Figure 3. The two features with the highest mean importance scores were the derived ‘golden features’. These relationships appeared across all of the folds of the cross-validation, providing evidence of the robustness of the relationship between these correlations and the TDT score. Thus, the process of creating derived or golden features found combinations that performed well and warrant further exploration for their efficacy in directing rehabilitation efforts and the prediction of likely recovery outcomes. The golden feature with the highest mean feature importance (0.36) involved a difference in correlation between the right insula and right superior temporal region (Insula_R <=> Temporal_Sup_R) and correlation between area 3 of the left cerebellum and vermis (Cerebelum_3_L <=> Vermis_1_2). The golden feature with the second highest mean feature importance (0.21) was based on the ratio of the correlation between the right inferior temporal lobe and left cerebellum (Temporal_Inf_R <=> Cerebelum_7b_L) and the left frontal inferior operculum and left supplementary motor area (Frontal_Inf_Oper_L <=> Supp_Motor_Area_L). These golden features contain information that is a valuable predictor of the achieved TDT score across the majority of the group. These interactions between correlations were more valuable predictors of TDT scores than any single correlation.
A table of all of the models trained, training steps involved, and RMSE achieved can be seen in Table 2. Table 2 compares various machine learning models based on their RMSE scores. Models are organised by type and the optimisation strategies applied, showcasing their performance across different enhancement steps. An asterisk (*) indicates the model with the lowest RMSE in each step. Numbers at the start of the configuration name are used to differentiate new models of the same type from previous models that performed well and have extra steps applied, e.g., 1_Default_LightGBM and 1_Default_LightGBM_GF_SF (the same model with golden features and selected features).
The specific hyperparameters for each model can be seen in Appendix A.
A plot of predicted values vs. residuals for the best-performing LightGBM model (23_LightGBM_GF_SF_HC1) can be seen in Figure 4, and no obvious pattern in the error is present.

4. Discussion

ML approaches are providing new insights into complex systems and neuroscience [53]. Here, we interrogate and demonstrate the ability of a suite of machine learning approaches to identify patterns of brain activity associated with clinical outcomes in a relatively small cohort of stroke survivors. Using AutoML (https://mljar.com/automl/, accessed on 5 August 2023), we compared six candidate approaches and applied algorithms. Both feature reduction and the identification of ‘golden features’ were involved, together with five-fold cross-validation. LightGBM, which utilises a gradient boosting framework [54], emerged as the best candidate for this stroke cohort proof-of-concept data set.
In merging machine learning (ML) with stroke research, ‘prediction’ serves a distinct role. Unlike clinical predictions focused on prognoses, ML predictions evaluate how well models capture underlying data patterns—specifically, the relationship between brain network features and TDT scores post-stroke. The predictive performance of our models is not aimed at forecasting individual outcomes but at assessing the model’s accuracy in reflecting complex data relationships, such as with resting state connectivity data. This is crucial for confidence in identifying significant ‘golden features’ in brain networks and understanding recovery mechanisms, which can inform targeted rehabilitation strategies. The goal is to harness these models as analytical tools, providing insights into stroke recovery’s neural underpinnings. Such insights pave the way for interventions based on a nuanced understanding of post-stroke neural recovery, bridging the gap between ML and clinical applications. Thus, our findings, which focus on predictive accuracy, are a means to deepen our understanding of stroke recovery, guiding future rehabilitation efforts with data-driven precision. This marks a step towards innovative progress in stroke recovery research, leveraging ML to illuminate the complexities of neural impairment and recovery.
The derivation of ‘golden features’ from basic correlation metrics mirrors the inherent complexity of brain networks themselves. Such networks are not merely linear or additive in nature but encompass interactions that might be compensatory, synergistic, or inhibitory [55,56]. For example, a diminished functional connectivity between two brain regions post-stroke might be offset by enhanced connectivity in another part of the network, a dynamic interplay that ‘golden features’ are uniquely positioned to capture [57,58]. Moreover, the value of ‘golden features’ extends beyond their descriptive power; they also hold prognostic significance. By identifying specific patterns of network disruption or reorganization, these features can predict clinical outcomes with greater accuracy than traditional measures [59]. This predictive capability not only informs the development of targeted therapeutic interventions but also facilitates a more personalised approach to stroke rehabilitation, tailoring treatments to the individual’s unique neural landscape [60].
Two ‘golden feature’ patterns emerged based on the highest mean feature importance across the five folds of the cross-validation (0.36 and 0.21) for the LightGBM model. The first was the difference between the right insula and right superior temporal region correlation and the correlation between area 3 of the left cerebellum and the vermis (0.36 importance); the second was the ratio between the right inferior temporal lobe and left cerebellum correlation and the left frontal inferior operculum and left supplementary motor area correlation (0.21 importance). Interestingly, the first pattern was based on the difference between the strengths of the two correlations. The two remaining features had mean feature importance across the five folds of the cross-validation of 0.13 and 0.10, respectively, and involved relationships between the right frontal inferior orbital region and right insula (0.13 importance) and between the left hippocampus and right cerebellum (0.10 importance). These ‘golden features’ and correlations, which are based on patterns of resting state brain activity across neural networks, provide new insights into predicting touch discrimination in a cohort of stroke survivors in the first year post-stroke, as discussed below, and demonstrate the potential value of this machine learning approach.
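The idea of ranking features by their mean importance across folds, and of checking that a feature contributes in every fold, can be sketched as follows. The feature names and per-fold values are hypothetical placeholders and do not reproduce the importances obtained in this study.

```python
from statistics import fmean
from typing import Dict, List

# Hypothetical per-fold importances for three features (illustrative only).
fold_importances: Dict[str, List[float]] = {
    "feature_a": [0.5, 0.3, 0.4, 0.2, 0.6],
    "feature_b": [0.2, 0.1, 0.3, 0.2, 0.2],
    "feature_c": [0.0, 0.3, 0.1, 0.0, 0.1],
}


def summarise_importances(per_fold: Dict[str, List[float]]) -> List[dict]:
    """Return features sorted by mean importance, flagging those present in all folds."""
    summary = []
    for name, values in per_fold.items():
        summary.append({
            "feature": name,
            "mean_importance": fmean(values),
            # A feature "appears" in a fold if its importance there is above zero.
            "in_all_folds": all(value > 0 for value in values),
        })
    return sorted(summary, key=lambda item: item["mean_importance"], reverse=True)


ranking = summarise_importances(fold_importances)
```

In this toy example, feature_c has a non-zero mean importance but is absent from two folds, which is the kind of instability that a per-fold check is intended to expose.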
Although ANNs have shown great promise in many areas of ML, they typically require larger amounts of data than classical ML models [61]. This is because ANNs, especially deep learning models, have a high number of parameters that need to be optimised, and a large dataset helps prevent overfitting and allows the model to generalise well to unseen data [61]. In comparison, classical ML models, such as decision trees or linear regression, may perform reasonably well with smaller datasets, as they have fewer parameters and dependencies to learn. The best-performing model type, LightGBM, comes from a machine learning library that provides algorithms built on a gradient boosting framework. With gradient-based one-side sampling (GOSS), the algorithm retains the data instances with large gradients and randomly samples from those with small gradients, as instances with larger gradients play a more important role in the computation of information gain. GOSS can therefore obtain quite an accurate estimation of the information gain with a much smaller data size. This approach is particularly useful in situations such as stroke studies, where the amount of data is likely to be restricted.
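The sampling idea behind GOSS can be sketched in a few lines of Python. This is a simplified illustration of the published algorithm, not LightGBM's internal code; the function name goss_sample, the parameter names top_rate and other_rate, the default rates, and the gradient values are assumptions chosen for the example.

```python
import random
from typing import Dict, Sequence


def goss_sample(gradients: Sequence[float], top_rate: float = 0.2,
                other_rate: float = 0.1, seed: int = 0) -> Dict[int, float]:
    """Return {instance index: weight} for the instances kept by a GOSS-style sample.

    Instances with the largest absolute gradients are always kept (weight 1.0).
    A random subset of the remaining instances is kept and up-weighted by
    (1 - top_rate) / other_rate to compensate for the discarded instances.
    """
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    sampled = random.Random(seed).sample(rest, min(n_other, len(rest)))
    amplify = (1.0 - top_rate) / other_rate
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return weights


# Hypothetical gradients for 20 instances (illustrative only).
example_gradients = [0.01 * i for i in range(20)]
example_weights = goss_sample(example_gradients)
```

With these example rates, only 6 of the 20 instances are used to evaluate candidate splits, while the up-weighting keeps the contribution of the small-gradient instances approximately representative.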
The AutoML approach allows a thoroughness that would be difficult and time-consuming to achieve manually. This thorough and systematic testing of models and parameters, along with the creation of new features and dimensionality reduction, has resulted in a high-performing model that may not have been arrived at without using the AutoML approach. In addition to the discovery of this particular combination of model architecture and parameters, the time taken to arrive at a model was also significantly reduced compared to manually running each step of the analysis.
The first ‘golden feature’ grouping pattern, with the highest mean feature importance (0.36), involved the difference between the strength of correlation between the right insula and right superior temporal lobe relative to the left cerebellum and vermis correlation. The insula and the medial temporal lobe are directly connected through white matter fibres, which connect the entire insular cortex with the temporal pole and the amygdaloid complex [62]. The large-scale connectivity of the insular cortex positions it to play an important role in processing and integrating internal and external multisensory stimuli. Further, distinct insula subregions are associated with particular neural networks (e.g., attentional and sensorimotor networks) [63], consistent with the importance of this connected region in the tactile discrimination of textured surfaces. The involvement of the right hemisphere is also consistent with the hypothesis that the right hemisphere plays a dominant role in tactile discrimination function and suggests the need for further systematic investigation. The left cerebellum and vermis (median portion of the cerebellum) were also associated and identified within this ‘golden feature’ combination. The cerebellum is involved in sensorimotor operations, cognitive tasks and affective processes [64]. In an earlier study of a different cohort of stroke survivors with somatosensory impairment, we found that clinical improvement in touch discrimination was associated with stronger correlations at 6 months between the contralesional thalamus and cerebellum [65]. Moreover, evidence of remote tract-specific reductions in axonal connectivity indicated by diffusion imaging measures suggests a model of losing connecting fibres in the cerebellum and interhemispheric sensorimotor areas in the somatosensory network after a stroke [31].
The second ‘golden feature’ grouping, with the second highest mean importance value (0.21), was the ratio involving the correlation between the right inferior temporal lobe and left cerebellum and the correlation between the left frontal inferior operculum and supplementary motor area (SMA). This grouping is a valuable predictor of the achieved TDT score across the majority of the group. The involvement of these connected regions is consistent with cerebro–cerebellar interactions involved in perceptual and motor aspects of temporal processing and the simulation of timing information through feed-forward computation in the cerebellum [66]. The second grouping also involved an association between the left inferior frontal operculum and SMA. The SMA is connected to the frontal lobe and opercular region via the superior longitudinal fasciculus [67]. The touch function of the ipsilesional hand has been associated with the superior longitudinal fasciculus after stroke [31], and the pre-SMA has been shown to have somatosensory organisation [68].
Other regions that were identified as being associated with TDT outcomes were the right frontal inferior orbital region and right insula (0.13 importance) and left hippocampus and right cerebellum (0.10 importance). The right orbitofrontal cortex has extensive connections with sensory areas, as well as limbic system structures, and may have a specific role in attending to tactile stimuli [69]. Differential links with pain and pain-related areas are also reported [70]. Finally, the hippocampus and cerebellum are functionally connected in a bidirectional manner such that the cerebellum can influence hippocampal activity and vice versa [71]. Functional connectivity between the cerebellum and somatosensory area has also been associated with the attenuation of self-generated touch [72], and the hippocampus has been associated with the ownership of one’s limb [73]. Together, our findings are consistent with a distributed model of somatosensory processing, wherein multiple networks are involved in separate subfunctions [73].
Finally, using automated ML, we have demonstrated new insights into functional connectivity networks associated with clinical outcomes, specifically touch discrimination, after stroke. This approach has identified brain regions and patterns of connectivity, i.e., combined ‘golden features’, that have importance in predicting clinical outcomes, beyond the predictions that are possible via manual human investigation. The next step, as highlighted in this Special Issue on the applications of AI in neuroscience, is to investigate the following question: how might we apply these new insights in clinical practice to personalise stroke recovery and rehabilitation? For example, the right insula and its correlated connections were identified as having an important role in two of the ‘golden feature’ groupings for stroke survivors with either right or left hemisphere lesions. This connected region plays an important role in multisensory integration. If this region is infarcted by the stroke in an individual, it might directly impact touch discrimination function. Alternatively, touch discrimination might also be indirectly impacted via interruption to correlated parts of the network. Conversely, if not impacted, this residual strength in the network could be manipulated in therapy, via its known behavioural function, to enhance cross-modal calibration and multisensory integration. Thus, the knowledge of the importance of these connected regions and networks has the potential to not only predict clinical outcome for an individual but also personalise treatment.
The predictive analysis between brain networks and clinical outcomes reported was conducted using functional resting state connectivity data. A high-quality dataset is a prerequisite for obtaining robust analysis results when using image data and ML [13,74]. While the limitations of resting state functional connectivity as a method are well known [74], using this method in stroke adds another layer of complexity to the analysis [75]. While no method is able to completely remove the associated artefacts, the pipeline used in this study [22] was developed specifically to minimise structural and functional changes that may be present as a result of stroke, as well as general sources of noise present in resting state fMRI. This, coupled with the quality of the imaging data, helps support the robust nature of our findings.

5. Conclusions

Automated machine learning can identify patterns of brain network activity associated with quantitative touch discrimination in survivors of stroke. A comparison across six candidate AutoML approaches revealed the best-performing algorithm for this stroke cohort (LightGBM), with the subsequent identification of two ‘golden feature’ patterns involving brain regions and networks putatively associated with touch discrimination. The potential to use this automated ML approach to predict quantitative clinical outcomes and identify associated brain networks and interacting brain networks, even with a small sample of stroke survivors, is demonstrated. The value of this approach lies in the following: the automation of model selection; the identification of correlated activity between brain imaging and clinical data filtered for robust associations; feature selection with the naming of brain regions; and the identification of ‘golden features’ that represent derived combined brain patterns based on decision tree analysis. Thus, the process is rigorous, provides an outcome based on fused data, and is able to be applied in small cohorts, as was the case in the current analysis. This approach has the potential to provide new insights into stroke recovery, as demonstrated in the identification of brain networks with correlated brain activity associated with clinical outcomes at 3 months and 12 months post-stroke.

Author Contributions

Conceptualization and methodology, A.W., P.G. and L.M.C.; formal analysis, A.W. and P.G.; data curation, A.W., P.G. and L.M.C.; data interpretation, A.W., P.G. and L.M.C.; writing—original draft preparation, A.W., P.G. and L.M.C.; writing—review and editing, A.W., P.G. and L.M.C.; data visualization, A.W.; supervision, P.G. and L.M.C.; funding acquisition, L.M.C.; software, A.W. All authors have read and agreed to the published version of the manuscript.

Funding

We acknowledge support from the Commonwealth Scientific Industrial Research Organisation (CSIRO) of the Australia Preventative Health Flagship grant (START cohort); the National Health and Medical Research Council (NHMRC) of Australia Partnership grant (GNT 1134495); the NHMRC Project grant (GNT 1022694); and the NHMRC Ideas grant (GNT 2004443) awarded to L.M.C.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki and approved by the Human Ethics Committee of Austin Health (HREC/17/Austin/281, approved 4 September 2017) and the La Trobe University Human Ethics Committee (HREC/17/Austin/281, externally approved project, 14 March 2018), Melbourne, Victoria, Australia for studies involving humans. Central ethical approval for original data collection for the START_PrePARE cohort study was obtained from the Melbourne Health Human Research Ethics Committee (2009.079, approved 13 January 2010) and the Austin Health Human Research Ethics Committee (H2010/03588, approved 6 January 2010).

Informed Consent Statement

Informed consent was obtained from all participants involved in the study.

Data Availability Statement

Due to the personal nature of the data and original ethics approval, the data will not be made available broadly. De-identified data may be made available for related research and analysis by the research group and collaborators with additional ethics approval.

Acknowledgments

We would like to thank the stroke survivors who participated in the original studies and the members of the START research group and Neurorehabilitation and Recovery research team who contributed to data collection.

Conflicts of Interest

All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANN: in this paper, referred to as artificial neural networks to differentiate them from physical brain networks
ML: machine learning
AI: artificial intelligence
AutoML: automated machine learning
NIHSS: National Institute of Health Stroke Scale
TDT: Tactile Discrimination Test
CSF: cerebrospinal fluid
BOLD: blood oxygen level-dependent
SMA: supplementary motor area
GOSS: gradient-based one-side sampling

Appendix A. Model Hyperparameters

Appendix A.1. Summary of 1_Default_LightGBM

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.05
-feature_fraction: 0.9
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.2. Summary of 2_Default_Xgboost

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 6
-min_child_weight: 1
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.3. Summary of 3_Default_CatBoost

CatBoost
-n_jobs: −1
-learning_rate: 0.1
-depth: 6
-rsm: 1
-loss_function: RMSE
-eval_metric: RMSE
-explain_level: 1

Appendix A.4. Summary of 4_Default_NeuralNetwork

Neural Network
-n_jobs: −1
-dense_1_size: 32
-dense_2_size: 16
-learning_rate: 0.05
-explain_level: 1

Appendix A.5. Summary of 5_Default_RandomForest

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.9
-min_samples_split: 30
-max_depth: 4
-eval_metric_name: rmse
-explain_level: 1

Appendix A.6. Summary of 10_LightGBM

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 15
-learning_rate: 0.05
-feature_fraction: 0.8
-bagging_fraction: 0.5
-min_data_in_leaf: 50
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.7. Summary of 6_Xgboost

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 8
-min_child_weight: 5
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.8. Summary of 14_CatBoost

CatBoost
-n_jobs: −1
-learning_rate: 0.05
-depth: 8
-rsm: 0.8
-loss_function: RMSE
-eval_metric: RMSE
-explain_level: 1

Appendix A.9. Summary of 18_RandomForest

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.5
-min_samples_split: 20
-max_depth: 4
-eval_metric_name: rmse
-explain_level: 1

Appendix A.10. Summary of 22_NeuralNetwork

Neural Network
-n_jobs: −1
-dense_1_size: 32
-dense_2_size: 4
-learning_rate: 0.05
-explain_level: 1

Appendix A.11. Summary of 11_LightGBM

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.2
-feature_fraction: 0.5
-bagging_fraction: 1.0
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.12. Summary of 7_Xgboost

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.1
-max_depth: 8
-min_child_weight: 1
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.13. Summary of 15_CatBoost

CatBoost
-n_jobs: −1
-learning_rate: 0.1
-depth: 8
-rsm: 1.0
-loss_function: MAE
-eval_metric: RMSE
-explain_level: 1

Appendix A.14. Summary of 19_RandomForest

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.7
-min_samples_split: 50
-max_depth: 3
-eval_metric_name: rmse
-explain_level: 1

Appendix A.15. Summary of 19_RandomForest_GoldenFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.7
-min_samples_split: 50
-max_depth: 3
-eval_metric_name: rmse
-explain_level: 1

Appendix A.16. Summary of 1_Default_LightGBM_GoldenFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.05
-feature_fraction: 0.9
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.17. Summary of 11_LightGBM_GoldenFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.2
-feature_fraction: 0.5
-bagging_fraction: 1.0
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.18. Summary of 1_Default_LightGBM_GoldenFeatures_RandomFeature

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.05
-feature_fraction: 0.9
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.19. Summary of 1_Default_LightGBM_GoldenFeatures_SelectedFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.05
-feature_fraction: 0.9
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.20. Summary of 19_RandomForest_GoldenFeatures_SelectedFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.7
-min_samples_split: 50
-max_depth: 3
-eval_metric_name: rmse
-explain_level: 1

Appendix A.21. Summary of 6_Xgboost_SelectedFeatures

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 8
-min_child_weight: 5
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.22. Summary of 4_Default_NeuralNetwork_SelectedFeatures

Neural Network
-n_jobs: −1
-dense_1_size: 32
-dense_2_size: 16
-learning_rate: 0.05
-explain_level: 1

Appendix A.23. Summary of 23_LightGBM_GoldenFeatures_SelectedFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.1
-feature_fraction: 0.9
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.24. Summary of 24_LightGBM_GoldenFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.1
-feature_fraction: 0.9
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.25. Summary of 25_RandomForest_GoldenFeatures_SelectedFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.7
-min_samples_split: 50
-max_depth: 4
-eval_metric_name: rmse
-explain_level: 1

Appendix A.26. Summary of 26_RandomForest_GoldenFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.7
-min_samples_split: 50
-max_depth: 4
-eval_metric_name: rmse
-explain_level: 1

Appendix A.27. Summary of 27_Xgboost_SelectedFeatures

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 7
-min_child_weight: 5
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.28. Summary of 28_Xgboost_SelectedFeatures

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 9
-min_child_weight: 5
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.29. Summary of 29_Xgboost

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 7
-min_child_weight: 5
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.30. Summary of 30_Xgboost

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 9
-min_child_weight: 5
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.31. Summary of 31_CatBoost

CatBoost
-n_jobs: −1
-learning_rate: 0.1
-depth: 6
-rsm: 1
-loss_function: MAE
-eval_metric: RMSE
-explain_level: 1

Appendix A.32. Summary of 32_LightGBM_GoldenFeatures_SelectedFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.1
-feature_fraction: 0.8
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.33. Summary of 33_LightGBM_GoldenFeatures_SelectedFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.1
-feature_fraction: 1.0
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.34. Summary of 34_LightGBM_GoldenFeatures_SelectedFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.05
-feature_fraction: 0.8
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.35. Summary of 35_LightGBM_GoldenFeatures_SelectedFeatures

LightGBM
-n_jobs: −1
-objective: regression
-num_leaves: 63
-learning_rate: 0.05
-feature_fraction: 1.0
-bagging_fraction: 0.9
-min_data_in_leaf: 10
-metric: rmse
-custom_eval_metric_name: None
-explain_level: 1

Appendix A.36. Summary of 36_RandomForest_GoldenFeatures_SelectedFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.6
-min_samples_split: 50
-max_depth: 4
-eval_metric_name: rmse
-explain_level: 1

Appendix A.37. Summary of 37_RandomForest_GoldenFeatures_SelectedFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.8
-min_samples_split: 50
-max_depth: 4
-eval_metric_name: rmse
-explain_level: 1

Appendix A.38. Summary of 38_RandomForest_GoldenFeatures_SelectedFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.6
-min_samples_split: 50
-max_depth: 3
-eval_metric_name: rmse
-explain_level: 1

Appendix A.39. Summary of 39_RandomForest_GoldenFeatures_SelectedFeatures

Random Forest
-n_jobs: −1
-criterion: squared_error
-max_features: 0.8
-min_samples_split: 50
-max_depth: 3
-eval_metric_name: rmse
-explain_level: 1

Appendix A.40. Summary of 40_Xgboost_SelectedFeatures

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 9
-min_child_weight: 1
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.41. Summary of 41_Xgboost_SelectedFeatures

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 9
-min_child_weight: 10
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.42. Summary of 42_Xgboost_SelectedFeatures

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 7
-min_child_weight: 1
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.43. Summary of 43_Xgboost_SelectedFeatures

Extreme Gradient Boosting (Xgboost)
-n_jobs: −1
-objective: reg:squarederror
-eta: 0.075
-max_depth: 7
-min_child_weight: 10
-subsample: 1.0
-colsample_bytree: 1.0
-eval_metric: rmse
-explain_level: 1

Appendix A.44. Summary of 48_NeuralNetwork_SelectedFeatures

Neural Network
-n_jobs: −1
-dense_1_size: 32
-dense_2_size: 8
-learning_rate: 0.05
-explain_level: 1

Appendix A.45. Summary of 49_NeuralNetwork_SelectedFeatures

Neural Network
-n_jobs: −1
-dense_1_size: 32
-dense_2_size: 32
-learning_rate: 0.05
-explain_level: 1

Appendix A.46. Summary of 50_NeuralNetwork

Neural Network
-n_jobs: −1
-dense_1_size: 32
-dense_2_size: 8
-learning_rate: 0.05
-explain_level: 1

Appendix A.47. Summary of 51_NeuralNetwork

Neural Network
-n_jobs: −1
-dense_1_size: 32
-dense_2_size: 32
-learning_rate: 0.05
-explain_level: 1

References

  1. Grefkes, C.; Fink, G.R. Recovery from stroke: Current concepts and future perspectives. Neurol. Res. Pract. 2020, 2, 17. [Google Scholar] [CrossRef] [PubMed]
  2. Dobkin, B.H.; Dorsch, A. New Evidence for Therapies in Stroke Rehabilitation. Curr. Atheroscler. Rep. 2013, 15, 331. [Google Scholar] [CrossRef] [PubMed]
  3. Shameer, K.; Johnson, K.W.; Glicksberg, B.S.; Dudley, J.T.; Sengupta, P.P. Machine learning in cardiovascular medicine: Are we there yet? Heart 2018, 104, 1156–1164. [Google Scholar] [CrossRef] [PubMed]
  4. Lin, E.; Tsai, S.J. Machine Learning in Neural Networks. In Frontiers in Psychiatry: Artificial Intelligence, Precision Medicine, and Other Paradigm Shifts; Kim, Y.K., Ed.; Springer: Singapore, 2019; pp. 127–137. [Google Scholar] [CrossRef]
  5. Carey, L.M.; Crewther, S.; Salvado, O.; Lindén, T.; Connelly, A.; Wilson, W.; Howells, D.W.; Churilov, L.; Ma, H.; Tse, T.; et al. STroke imAging pRevention and Treatment (START): A Longitudinal Stroke Cohort Study: Clinical Trials Protocol. Int. J. Stroke 2013, 10, 636–644. [Google Scholar] [CrossRef] [PubMed]
  6. Siegel, J.S.; Ramsey, L.E.; Snyder, A.Z.; Metcalf, N.V.; Chacko, R.V.; Weinberger, K.; Baldassarre, A.; Hacker, C.D.; Shulman, G.L.; Corbetta, M. Disruptions of network connectivity predict impairment in multiple behavioral domains after stroke. Proc. Natl. Acad. Sci. USA 2016, 113, E4367–E4376. [Google Scholar] [CrossRef] [PubMed]
  7. Carey, L.M.; Seitz, R.J.; Parsons, M.; Levi, C.; Farquharson, S.; Tournier, J.D.; Palmer, S.; Connelly, A. Beyond the lesion: Neuroimaging foundations for post-stroke recovery. Future Neurol. 2013, 8, 507–527. [Google Scholar] [CrossRef]
  8. Russell, S.J.; Norvig, P.; Chang, M.W.; Devlin, J.; Dragan, A.; Forsyth, D.; Goodfellow, I.; Malik, J.; Mansinghka, V.; Pearl, J.; et al. Artificial Intelligence: A Modern Approach, 4th ed.; Pearson series in artificial intelligence; Pearson: Harlow, UK, 2022; 1166p. [Google Scholar]
  9. Koza, J.R.; Bennett, F.H.; Andre, D.; Keane, M.A. Automated Design of Both the Topology and Sizing of Analog Electrical Circuits Using Genetic Programming. In Artificial Intelligence in Design ’96; Gero, J.S., Sudweeks, F., Eds.; Springer: Dordrecht, The Netherlands, 1996; pp. 151–170. [Google Scholar] [CrossRef]
  10. Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160. [Google Scholar] [CrossRef] [PubMed]
  11. Feurer, M.; Klein, A.; Eggensperger, K.; Springenberg, J.T.; Blum, M.; Hutter, F. Auto-sklearn: Efficient and Robust Automated Machine Learning. In Automated Machine Learning: Methods, Systems, Challenges; The Springer Series on Challenges in Machine Learning; Springer: Berlin/Heidelberg, Germany, 2019; p. 113. [Google Scholar] [CrossRef]
  12. Rajula, H.S.R.; Verlato, G.; Manchia, M.; Antonucci, N.; Fanos, V. Comparison of Conventional Statistical Methods with Machine Learning in Medicine: Diagnosis, Drug Development, and Treatment. Medicina 2020, 56, 455. [Google Scholar] [CrossRef] [PubMed]
  13. Choo, Y.J.; Chang, M.C. Use of Machine Learning in Stroke Rehabilitation: A Narrative Review. Brain Neurorehabil. 2022, 15, e26. [Google Scholar] [CrossRef] [PubMed]
  14. Lin, W.Y.; Chen, C.H.; Tseng, Y.J.; Tsai, Y.T.; Chang, C.Y.; Wang, H.Y.; Chen, C.K. Predicting post-stroke activities of daily living through a machine learning-based approach on initiating rehabilitation. Int. J. Med. Inform. 2018, 111, 159–164. [Google Scholar] [CrossRef] [PubMed]
  15. Mutke, M.A.; Madai, V.I.; Hilbert, A.; Zihni, E.; Potreck, A.; Weyland, C.S.; Möhlenbruch, M.A.; Heiland, S.; Ringleb, P.A.; Nagel, S.; et al. Comparing Poor and Favorable Outcome Prediction With Machine Learning After Mechanical Thrombectomy in Acute Ischemic Stroke. Front. Neurol. 2022, 13, 737667. [Google Scholar] [CrossRef] [PubMed]
  16. Liang, X.; Koh, C.L.; Yeh, C.H.; Goodin, P.; Lamp, G.; Connelly, A.; Carey, L.M. Predicting Post-Stroke Somatosensory Function from Resting-State Functional Connectivity: A Feasibility Study. Brain Sci. 2021, 11, 1388. [Google Scholar] [CrossRef] [PubMed]
  17. Senadheera, I.; Larssen, B.C.; Mak-Yuen, Y.Y.K.; Steinfort, S.; Carey, L.M.; Alahakoon, D. Profiling Somatosensory Impairment after Stroke: Characterizing Common ‘Fingerprints’ of Impairment Using Unsupervised Machine Learning-Based Cluster Analysis of Quantitative Measures of the Upper Limb. Brain Sci. 2023, 13, 1253. [Google Scholar] [CrossRef]
  18. Shin, H.; Kim, J.K.; Choo, Y.J.; Choi, G.S.; Chang, M.C. Prediction of Motor Outcome of Stroke Patients Using a Deep Learning Algorithm with Brain MRI as Input Data. Eur. Neurol. 2022, 85, 460–466. [Google Scholar] [CrossRef] [PubMed]
  19. A Romero, R.A.; Y Deypalan, M.N.; Mehrotra, S.; Jungao, J.T.; Sheils, N.E.; Manduchi, E.; Moore, J.H. Benchmarking AutoML frameworks for disease prediction using medical claims. BioData Min. 2022, 15, 15. [Google Scholar] [CrossRef] [PubMed]
  20. Carey, L. Review on Somatosensory Loss after Stroke. Crit. Rev. Phys. Rehabil. Med. 2017, 29, 1–41. [Google Scholar] [CrossRef]
  21. Carey, L.; Macdonell, R.; Matyas, T.A. SENSe: Study of the Effectiveness of Neurorehabilitation on Sensation: A randomized controlled trial. Neurorehabil. Neural Repair 2011, 25, 304–313. [Google Scholar] [CrossRef]
  22. Goodin, P.; Lamp, G.; Vidyasagar, R.; McArdle, D.; Seitz, R.J.; Carey, L.M. Altered functional connectivity differs in stroke survivors with impaired touch sensation following left and right hemisphere lesions. Neuroimage Clin. 2018, 18, 342–355. [Google Scholar] [CrossRef] [PubMed]
  23. Spilker, J.; Kongable, G.; Barch, C.; Braimah, J.; Brattina, P.; Daley, S.; Donnarumma, R.; Rapp, K.; Sailor, S. Using the NIH Stroke Scale to assess stroke patients. The NINDS rt-PA Stroke Study Group. J. Neurosci. Nurs. 1997, 29, 384–392. [Google Scholar] [CrossRef] [PubMed]
  24. Mak-Yuen, Y.Y.K.; Matyas, T.A.; Carey, L.M. Characterizing Touch Discrimination Impairment from Pooled Stroke Samples Using the Tactile Discrimination Test: Updated Criteria for Interpretation and Brief Test Version for Use in Clinical Practice Settings. Brain Sci. 2023, 13, 533. [Google Scholar] [CrossRef] [PubMed]
  25. Carey, L.M.; Oke, L.E.; Matyas, T.A. Impaired Touch Discrimination After Stroke: A Quantitative Test. J. Neurol. Rehabil. 1997, 11, 219–232. [Google Scholar] [CrossRef]
  26. Smith, S.M.; Vidaurre, D.; Beckmann, C.F.; Glasser, M.F.; Jenkinson, M.; Miller, K.L.; Nichols, T.E.; Robinson, E.C.; Salimi-Khorshidi, G.; Woolrich, M.W.; et al. Functional connectomics from resting state fMRI. Trends Cogn. Sci. 2013, 17, 666–682. [Google Scholar] [CrossRef]
  27. Varoquaux, G.; Craddock, R.C. Learning and comparing functional connectomes across subjects. Neuroimage 2013, 80, 405–415. [Google Scholar] [CrossRef] [PubMed]
  28. Płońska, A.; Płoński, P. MLJAR: State-of-the-Art Automated Machine Learning Framework for Tabular Data. Version 0.10.3, 2021.
  29. Matyas, T.A.; Mak-Yuen, Y.Y.K.; Boelsen-Robinson, T.P.; Carey, L.M. Calibration of Impairment Severity to Enable Comparison across Somatosensory Domains. Brain Sci. 2023, 13, 654. [Google Scholar] [CrossRef] [PubMed]
  30. Carey, L.M.; Abbott, D.F.; Egan, G.F.; Donnan, G.A. Reproducible activation in BA2, 1 and 3b associated with texture discrimination in healthy volunteers over time. Neuroimage 2008, 39, 40–51. [Google Scholar] [CrossRef] [PubMed]
  31. Koh, C.L.; Yeh, C.H.; Liang, X.; Vidyasagar, R.; Seitz, R.J.; Nilsson, M.; Connelly, A.; Carey, L.M. Structural Connectivity Remote From Lesions Correlates With Somatosensory Outcome Poststroke. Stroke 2021, 52, 2910–2920. [Google Scholar] [CrossRef] [PubMed]
  32. Carey, L.M.; Abbott, D.F.; Harvey, M.R.; Puce, A.; Seitz, R.J.; Donnan, G.A. Relationship between touch impairment and brain activation after lesions of subcortical and cortical somatosensory regions. Neurorehabil. Neural Repair 2011, 25, 443–457. [Google Scholar] [CrossRef] [PubMed]
  33. Saito, R.; Fujihara, K.; Kasagi, M.; Motegi, T.; Suzuki, Y.; Narita, K.; Ujita, K.; Fukuda, M. Can We Find Any Sustained Neurofunctional Alteration in Remitted Depressive Patients with a History of Modified Electroconvulsive Therapy? Open J. Depress. 2017, 6, 89–99. [Google Scholar] [CrossRef]
  34. Avants, B.B.; Tustison, N.J.; Song, G.; Cook, P.A.; Klein, A.; Gee, J.C. A reproducible evaluation of ANTs similarity metric performance in brain image registration. Neuroimage 2011, 54, 2033–2044. [Google Scholar] [CrossRef]
  35. van der Walt, S.; Colbert, S.C.; Varoquaux, G. The NumPy Array: A Structure for Efficient Numerical Computation. Comput. Sci. Eng. 2011, 13, 22–30. [Google Scholar] [CrossRef]
  36. Oliphant, T.E. Python for Scientific Computing. Comput. Sci. Eng. 2007, 9, 10–20. [Google Scholar] [CrossRef]
  37. Gorgolewski, K.; Burns, C.D.; Madison, C.; Clark, D.; Halchenko, Y.O.; Waskom, M.L.; Ghosh, S.S. Nipype: A flexible, lightweight and extensible neuroimaging data processing framework in python. Front. Neuroinform. 2011, 5, 13. [Google Scholar] [CrossRef] [PubMed]
  38. Friston, K.J.; Williams, S.; Howard, R.; Frackowiak, R.S.; Turner, R. Movement-related effects in fMRI time-series. Magn. Reson. Med. 1996, 35, 346–355. [Google Scholar] [CrossRef] [PubMed]
  39. Behzadi, Y.; Restom, K.; Liau, J.; Liu, T.T. A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. Neuroimage 2007, 37, 90–101. [Google Scholar] [CrossRef] [PubMed]
  40. Yan, C.G.; Cheung, B.; Kelly, C.; Colcombe, S.; Craddock, R.C.; Di Martino, A.; Li, Q.; Zuo, X.N.; Castellanos, F.X.; Milham, M.P. A comprehensive assessment of regional variation in the impact of head micromovements on functional connectomics. Neuroimage 2013, 76, 183–201. [Google Scholar] [CrossRef] [PubMed]
  41. Rolls, E.T.; Huang, C.C.; Lin, C.P.; Feng, J.; Joliot, M. Automated anatomical labelling atlas 3. NeuroImage 2020, 206, 116189. [Google Scholar] [CrossRef] [PubMed]
  42. Kreuzberger, D.; Kuhl, N.; Hirschl, S. Machine Learning Operations (MLOps): Overview, Definition, and Architecture. IEEE Access 2023, 11, 31866–31879. [Google Scholar] [CrossRef]
  43. Jia, W.; Sun, M.; Lian, J.; Hou, S. Feature dimensionality reduction: A review. Complex Intell. Syst. 2022, 8, 2663–2693. [Google Scholar] [CrossRef]
  44. Piramuthu, S.; Sikora, R.T. Iterative feature construction for improving inductive learning algorithms. Expert Syst. Appl. 2009, 36, 3401–3406. [Google Scholar] [CrossRef]
  45. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  46. Altmann, A.; Toloşi, L.; Sander, O.; Lengauer, T. Permutation importance: A corrected feature importance measure. Bioinformatics 2010, 26, 1340–1347. [Google Scholar] [CrossRef] [PubMed]
  47. Ojala, M.; Garriga, G.C. Permutation Tests for Studying Classifier Performance. J. Mach. Learn. Res. 2010, 11, 1833–1863. [Google Scholar]
  48. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
  49. Machlanski, D.; Samothrakis, S.; Clarke, P. Hyperparameter Tuning and Model Evaluation in Causal Effect Estimation. arXiv 2023, arXiv:2303.01412. Available online: http://xxx.lanl.gov/abs/2303.01412 (accessed on 5 August 2023).
  50. Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence—Volume 2, IJCAI’95, San Francisco, CA, USA, 20–25 August 1995; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1995; pp. 1137–1143. [Google Scholar]
  51. Refaeilzadeh, P.; Tang, L.; Liu, H. Cross-Validation. In Encyclopedia of Database Systems; Liu, L., Özsu, M.T., Eds.; Springer: Boston, MA, USA, 2009; pp. 532–538. [Google Scholar] [CrossRef]
  52. Arlot, S.; Celisse, A. A survey of cross-validation procedures for model selection. Stat. Surv. 2010, 4, 40–79. [Google Scholar] [CrossRef]
  53. Badrulhisham, F.; Pogatzki-Zahn, E.; Segelcke, D.; Spisak, T.; Vollert, J. Machine learning and artificial intelligence in neuroscience: A primer for researchers. Brain Behav. Immun. 2024, 115, 470–479. [Google Scholar] [CrossRef] [PubMed]
  54. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Glasgow, UK, 2017; Volume 30. [Google Scholar]
  55. Bullmore, E.; Sporns, O. Complex brain networks: Graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 2009, 10, 186–198. [Google Scholar] [CrossRef] [PubMed]
  56. Crossley, N.A.; Mechelli, A.; Scott, J.; Carletti, F.; Fox, P.T.; McGuire, P.; Bullmore, E.T. The hubs of the human connectome are generally implicated in the anatomy of brain disorders. Brain 2014, 137, 2382–2395. [Google Scholar] [CrossRef] [PubMed]
  57. Park, H.J.; Friston, K. Structural and functional brain networks: From connections to cognition. Science 2013, 342, 1238411. [Google Scholar] [CrossRef] [PubMed]
  58. Friston, K.J. Functional and effective connectivity: A review. Brain Connect. 2011, 1, 13–36. [Google Scholar] [CrossRef] [PubMed]
  59. Grefkes, C.; Fink, G.R. Connectivity-based approaches in stroke and recovery of function. Lancet Neurol. 2014, 13, 206–216. [Google Scholar] [CrossRef] [PubMed]
  60. Rehme, A.K.; Grefkes, C. Cerebral network disorders after stroke: Evidence from imaging-based connectivity analyses of active and resting brain states in humans. J. Physiol. 2013, 591, 17–31. [Google Scholar] [CrossRef] [PubMed]
  61. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 53. [Google Scholar] [CrossRef] [PubMed]
  62. Nachtergaele, P.; Radwan, A.; Swinnen, S.; Decramer, T.; Uytterhoeven, M.; Sunaert, S.; van Loon, J.; Theys, T. The temporoinsular projection system: An anatomical study. J. Neurosurg. 2020, 132, 615–623. [Google Scholar] [CrossRef] [PubMed]
  63. Gong, D.; He, H.; Liu, D.; Ma, W.; Dong, L.; Luo, C.; Yao, D. Enhanced functional connectivity and increased gray matter volume of insula related to action video game playing. Sci. Rep. 2015, 5, 9763. [Google Scholar] [CrossRef] [PubMed]
  64. Bodranghien, F.; Bastian, A.; Casali, C.; Hallett, M.; Louis, E.D.; Manto, M.; Mariën, P.; Nowak, D.A.; Schmahmann, J.D.; Serrao, M.; et al. Consensus Paper: Revisiting the Symptoms and Signs of Cerebellar Syndrome. Cerebellum 2016, 15, 369–391. [Google Scholar] [CrossRef] [PubMed]
  65. Bannister, L.C.; Crewther, S.G.; Gavrilescu, M.; Carey, L.M. Improvement in Touch Sensation after Stroke is Associated with Resting Functional Connectivity Changes. Front. Neurol. 2015, 6, 165. [Google Scholar] [CrossRef] [PubMed]
  66. Aso, K.; Hanakawa, T.; Aso, T.; Fukuyama, H. Cerebro-cerebellar Interactions Underlying Temporal Information Processing. J. Cogn. Neurosci. 2010, 22, 2913–2925. [Google Scholar] [CrossRef] [PubMed]
  67. Bozkurt, B.; Yagmurlu, K.; Middlebrooks, E.H.; Cayci, Z.; Cevik, O.M.; Karadag, A.; Moen, S.; Tanriover, N.; Grande, A.W. Fiber Connections of the Supplementary Motor Area Revisited: Methodology of Fiber Dissection, DTI, and Three Dimensional Documentation. JoVE 2017, 123, e55681. [Google Scholar] [CrossRef]
  68. Lehéricy, S.; Ducros, M.; Krainik, A.; Francois, C.; Van de Moortele, P.F.; Ugurbil, K.; Kim, D.S. 3-D diffusion tensor axonal tracking shows distinct SMA and pre-SMA projections to the human striatum. Cereb. Cortex 2004, 14, 1302–1309. [Google Scholar] [CrossRef] [PubMed]
  69. Hagen, M.C.; Zald, D.H.; Thornton, T.A.; Pardo, J. Somatosensory Processing in the Human Inferior Prefrontal Cortex. J. Neurophysiol. 2002, 88, 1400–1406. [Google Scholar] [CrossRef] [PubMed]
  70. Wiech, K.; Jbabdi, S.; Lin, C.S.; Andersson, J.; Tracey, I. Differential structural and resting state connectivity between insular subdivisions and other pain-related brain regions. Pain 2014, 155, 2047–2055. [Google Scholar] [CrossRef] [PubMed]
  71. Yu, W.; Krook-Magnuson, E. Cognitive Collaborations: Bidirectional Functional Connectivity Between the Cerebellum and the Hippocampus. Front. Syst. Neurosci. 2015, 9, 177. [Google Scholar] [CrossRef] [PubMed]
  72. Kilteni, K.; Ehrsson, H.H. Functional Connectivity between the Cerebellum and Somatosensory Areas Implements the Attenuation of Self-Generated Touch. J. Neurosci. 2020, 40, 894–906. [Google Scholar] [CrossRef] [PubMed]
  73. de Haan, E.H.F.; Dijkerman, H.C. Somatosensation in the Brain: A Theoretical Re-evaluation and a New Model. Trends Cogn. Sci. 2020, 24, 529–541. [Google Scholar] [CrossRef] [PubMed]
  74. Cole, D.M.; Smith, S.M.; Beckmann, C.F. Advances and pitfalls in the analysis and interpretation of resting state FMRI data. Front. Syst. Neurosci. 2010, 4, 8. [Google Scholar] [CrossRef] [PubMed]
  75. Siegel, J.S.; Shulman, G.L.; Corbetta, M. Measuring functional connectivity in stroke: Approaches and considerations. J. Cereb. Blood Flow Metab. 2017, 37, 2665–2678. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Flow diagram of AutoML research model and steps involved.
Figure 2. Model performance comparison.
Figure 3. Feature importance across the five learners, which correspond to the five folds of the cross-validation, in the best-performing LightGBM model with golden features, selected features, and hill climbing steps applied (23_LightGBM_GF_SF_HC1).
Figure 4. Predicted vs. residuals for the best-performing LightGBM model with golden features, selected features, and hill climbing steps applied (23_LightGBM_GF_SF_HC1).
Table 1. Demographic and clinical data regarding study participants.
| Characteristic | Measure | Value |
| --- | --- | --- |
| Age in years | Mean (SD) | 62.71 (13.23) |
| Sex | M/F | 18/42 |
| Lesion location | left/right/bilateral/unknown | 21/31/1/7 |
| TDT score (AUC, first tested hand) | Mean (SD) | 58.14 (20.33) |
Abbreviations: TDT = tactile discrimination test. The score is the percentage-correct area under the curve (AUC); the criterion of normality is 66.1 AUC, with lower scores indicating poorer performance [22].
Table 2. Comparative performance of machine learning models.
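| Strategy | Configuration | Type | RMSE |
| --- | --- | --- | --- |
| Default | 1_Default_LightGBM | LightGBM | 20.88 * |
| Default | 2_Default_Xgboost | Xgboost | 21.05 |
| Default | 3_Default_CatBoost | CatBoost | 21.16 |
| Default | 4_Default_NeuralNetwork | Neural Network | 45.64 |
| Default | 5_Default_RandomForest | Random Forest | 21.56 |
| Not_So_Random | 6_Xgboost_NSR | Xgboost | 21.04 |
| Not_So_Random | 7_Xgboost_NSR | Xgboost | 21.23 |
| Not_So_Random | 10_LightGBM_NSR | LightGBM | 21.29 |
| Not_So_Random | 11_LightGBM_NSR | LightGBM | 20.96 |
| Not_So_Random | 14_CatBoost_NSR | CatBoost | 21.19 |
| Not_So_Random | 15_CatBoost_NSR | CatBoost | 21.37 |
| Not_So_Random | 18_RandomForest_NSR | Random Forest | 21.34 |
| Not_So_Random | 19_RandomForest_NSR | Random Forest | 20.86 * |
| Not_So_Random | 22_NeuralNetwork_NSR | Neural Network | 65.83 |
| Golden_Features | 1_Default_LightGBM_GF | LightGBM | 19.37 * |
| Golden_Features | 11_LightGBM_NSR_GF | LightGBM | 19.59 |
| Golden_Features | 19_RandomForest_NSR_GF | Random Forest | 20.86 |
| Insert_Random_Feature | 1_Default_LightGBM_GF_RF | LightGBM | 19.27 |
| Selected_Features | 1_Default_LightGBM_GF_SF | LightGBM | 15.86 * |
| Selected_Features | 4_Default_NeuralNetwork_SF | Neural Network | 22.56 |
| Selected_Features | 6_Xgboost_NSR_SF | Xgboost | 21.04 |
| Selected_Features | 19_RandomForest_NSR_GF_SF | Random Forest | 20.09 |
| Hill_Climbing_1 | 23_LightGBM_GF_SF_HC1 | LightGBM | 15.65 * |
| Hill_Climbing_1 | 24_LightGBM_GF_HC1 | LightGBM | 19.15 |
| Hill_Climbing_1 | 25_RandomForest_GF_SF_HC1 | Random Forest | 20.09 |
| Hill_Climbing_1 | 26_RandomForest_GF_HC1 | Random Forest | 20.86 |
| Hill_Climbing_1 | 27_Xgboost_SF_HC1 | Xgboost | 21.04 |
| Hill_Climbing_1 | 28_Xgboost_SF_HC1 | Xgboost | 21.04 |
| Hill_Climbing_1 | 29_Xgboost_HC1 | Xgboost | 21.05 |
| Hill_Climbing_1 | 30_Xgboost_HC1 | Xgboost | 21.05 |
| Hill_Climbing_1 | 31_CatBoost_HC1 | CatBoost | 21.29 |
| Hill_Climbing_2 | 32_LightGBM_GF_SF_HC2 | LightGBM | 15.92 |
| Hill_Climbing_2 | 33_LightGBM_GF_SF_HC2 | LightGBM | 15.65 * |
| Hill_Climbing_2 | 34_LightGBM_GF_SF_HC2 | LightGBM | 15.76 |
| Hill_Climbing_2 | 35_LightGBM_GF_SF_HC2 | LightGBM | 15.86 |
| Hill_Climbing_2 | 36_RandomForest_GF_SF_HC2 | Random Forest | 20.09 |
| Hill_Climbing_2 | 37_RandomForest_GF_SF_HC2 | Random Forest | 20.00 |
| Hill_Climbing_2 | 38_RandomForest_GF_SF_HC2 | Random Forest | 20.09 |
| Hill_Climbing_2 | 39_RandomForest_GF_SF_HC2 | Random Forest | 20.00 |
| Hill_Climbing_2 | 40_Xgboost_SF_HC2 | Xgboost | 21.99 |
| Hill_Climbing_2 | 41_Xgboost_SF_HC2 | Xgboost | 19.62 |
| Hill_Climbing_2 | 42_Xgboost_SF_HC2 | Xgboost | 21.96 |
| Hill_Climbing_2 | 43_Xgboost_SF_HC2 | Xgboost | 19.62 |
| Hill_Climbing_2 | 48_NeuralNetwork_SF_HC2 | Neural Network | 21.58 |
| Hill_Climbing_2 | 49_NeuralNetwork_SF_HC2 | Neural Network | 21.60 |
| Hill_Climbing_2 | 50_NeuralNetwork_HC2 | Neural Network | 22.32 |
| Hill_Climbing_2 | 51_NeuralNetwork_HC2 | Neural Network | 97.35 |
| Ensemble | Ensemble | Ensemble | 15.65 |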
* The model with the lowest RMSE at each step; Default: default hyperparameters; NSR: not so random—performs random search over a pre-defined set of hyperparameters; RF: random feature—the introduction of a feature consisting of random data as part of the feature selection procedure; GF: golden features—combinations of pairs of original features using functions like addition, subtraction, multiplication, and division; SF: selected features—uses permutation-based feature selection; HC1 and HC2: hill climbing 1 and 2—a model is selected for further tuning, then only one randomly selected hyperparameter from its setting is changed. The selected hyperparameter will be changed in two directions; Ensemble: a weighted combination of high-performing models to improve the score.
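As an illustration of the golden feature (GF) and random feature (RF) ideas described in the note above, the following is a minimal sketch in plain Python. It is not the MLJAR implementation: the function names, the synthetic data, and the use of absolute Pearson correlation as a scoring rule are all assumptions made for this example. The AutoML framework scores candidate features and performs permutation-based selection with its own procedures, which may differ substantially.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def golden_feature_candidates(features):
    """Combine every pair of original features by difference and by ratio.

    `features` maps a feature name to a list of values (hypothetical layout).
    """
    candidates = {}
    for a, b in itertools.combinations(sorted(features), 2):
        xa, xb = features[a], features[b]
        candidates[f"{a}_diff_{b}"] = [p - q for p, q in zip(xa, xb)]
        # Only form the ratio when no denominator is zero.
        if all(q != 0 for q in xb):
            candidates[f"{a}_ratio_{b}"] = [p / q for p, q in zip(xa, xb)]
    return candidates


def rank_candidates(candidates, target, top_k=2):
    """Rank candidates by |correlation| with the target (a simplified stand-in score)."""
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(candidates[name], target)),
        reverse=True,
    )
    return ranked[:top_k]


def rmse(y_true, y_pred):
    """Root mean squared error, the metric reported in Table 2."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Synthetic demonstration (made-up data, not the study data):
# the target depends on the difference between f1 and f2.
rng = random.Random(0)
n = 60
features = {name: [rng.uniform(0.1, 1.0) for _ in range(n)] for name in ("f1", "f2", "f3")}
target = [
    50 + 20 * (a - b) + rng.gauss(0, 0.5)
    for a, b in zip(features["f1"], features["f2"])
]

candidates = golden_feature_candidates(features)
top = rank_candidates(candidates, target, top_k=2)

# Random-feature baseline: keep only candidates that score above pure noise.
noise = [rng.gauss(0, 1) for _ in range(n)]
threshold = abs(pearson(noise, target))
kept = [name for name in candidates if abs(pearson(candidates[name], target)) > threshold]

print("Top candidates:", top)
print("Kept after noise baseline:", kept)
```

On this synthetic data the difference between `f1` and `f2` ranks first because the target was constructed from it. In a real analysis the score would come from cross-validated model performance rather than a single correlation, and candidates would be built inside each training fold to avoid leakage.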
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Walsh, A.; Goodin, P.; Carey, L.M. Identifying Correlated Functional Brain Network Patterns Associated with Touch Discrimination in Survivors of Stroke Using Automated Machine Learning. Appl. Sci. 2024, 14, 3463. https://doi.org/10.3390/app14083463

