An Intelligent Rehabilitation Assessment Method for Small-Sample Scenarios: Machine Learning Validation Based on Rehabilitation Matching Value

Wei, Hua; Luh, Dingbang; Chen, Zihao; Yan, Haixia; Zhang, Ruizhi

doi:10.3390/electronics14081607


by Hua Wei ¹,*, Dingbang Luh ², Zihao Chen ², Haixia Yan ³ and Ruizhi Zhang ¹,*

¹ College of Textile and Clothing, Xinjiang University, Urumqi 830049, China

² School of Art and Design, Guangdong University of Technology, Guangzhou 510090, China

³ School of Arts, Shaanxi University of Technology, Hanzhong 723000, China

* Authors to whom correspondence should be addressed.

Electronics 2025, 14(8), 1607; https://doi.org/10.3390/electronics14081607

Submission received: 18 March 2025 / Revised: 6 April 2025 / Accepted: 15 April 2025 / Published: 16 April 2025

(This article belongs to the Section Computer Science & Engineering)


Abstract:

Post-stroke finger dysfunction severely impacts patients’ daily living abilities and quality of life. Traditional rehabilitation assessment methods face challenges such as high subjectivity, insufficient precision, and difficulty in capturing subtle changes. These challenges are particularly pronounced in small-sample data scenarios, where the accuracy and robustness of assessment models are limited. This study proposes an intelligent rehabilitation assessment method tailored for small-sample scenarios, combining the rehabilitation matching value (RMV) with machine learning to address the challenges of rehabilitation assessment in such contexts. A rehabilitation matching value calculation model is constructed based on existing data, and interpolation methods are employed to expand the small-sample dataset. Machine learning models are then utilized for validation. Experimental results demonstrate that the proposed method effectively captures subtle changes in finger function, significantly improving the sensitivity and accuracy of rehabilitation assessments. This provides a scientific basis for the development of personalized rehabilitation training plans. Compared to traditional methods, the proposed approach exhibits significant advantages in flexibility, practicality, and adaptability to small-sample scenarios.

Keywords:

rehabilitation matching value (RMV); intelligent rehabilitation assessment; machine learning; small-sample learning

1. Introduction

Stroke is one of the leading causes of long-term disability in adults, with finger dysfunction being particularly prominent among its sequelae. This significantly impacts patients’ daily living abilities and quality of life [1,2,3]. Finger dysfunction makes it challenging for patients to perform daily tasks such as drinking water, flipping pages, and typing, primarily manifested as insufficient finger strength, reduced flexibility, and loss of fine motor control. These impairments not only reduce patients’ independence but also increase the caregiving burden on families and society [4,5,6]. Therefore, developing scientific rehabilitation assessment methods to effectively monitor patients’ recovery progress and design personalized rehabilitation plans has become a crucial research topic in modern rehabilitation medicine.

Traditional rehabilitation assessment methods, such as the Brunnstrom Stages and the Fugl-Meyer Assessment, have long been used in clinical practice as classical tools for post-stroke rehabilitation assessment [7,8,9]. These methods provide a basic quantitative basis for tracking rehabilitation progress by grading and scoring patients’ motor functions. However, they exhibit significant limitations in practical applications. Firstly, they rely heavily on the subjective judgment of evaluators, and the results can vary with the evaluator’s experience and expertise, leading to inconsistency and a lack of reliability. Secondly, these methods often rely on coarse grading or scoring, making it difficult to capture subtle changes in patients’ recovery, particularly in fine motor control and coordination of finger functions [10]. Furthermore, they lack relevance to real-world tasks and fail to comprehensively reflect patients’ functional performance in daily living scenarios [11].

In recent years, researchers have attempted to address the limitations of traditional methods by integrating advanced technologies such as robotics, electroencephalography (EEG), and other sensor technologies into rehabilitation assessments. For instance, robot-assisted rehabilitation systems can capture patients’ motor abilities through precise mechanical and motion data [12], while EEG technology can assess patients’ motor intentions and brain function by monitoring neural activity [13,14,15]. However, the application of these technologies still faces several challenges. Firstly, they often require expensive equipment and complex operational procedures, limiting their accessibility in primary healthcare settings [16]. Secondly, patient acceptance and adherence to these technologies remain issues, especially for elderly patients or those with limited financial resources. Additionally, these technologies encounter technical challenges in data processing and analysis, such as extracting key features from multidimensional data and achieving efficient rehabilitation assessments.

Imaging technologies (e.g., MRI, CT) [17,18,19], sensor technologies (e.g., inertial measurement units, pressure sensors) [20,21,22], and video technologies [23] have been applied in rehabilitation assessments, providing more precise means of quantifying patients’ motor functions. These technologies can capture subtle changes during patients’ movements, such as muscle activity, joint angles, and motion trajectories, addressing the subjectivity and coarseness of traditional assessment methods. However, these technologies also face significant limitations in practical applications. Firstly, the high cost of equipment makes them difficult to implement in resource-limited primary healthcare settings. Secondly, they often require professional operation and data analysis, with complex workflows further restricting their use in broader clinical scenarios. Moreover, field investigations have revealed that patient adherence and acceptance may be affected by the complexity and discomfort of using such equipment, particularly for elderly patients or those with multiple comorbidities.

In recent years, with the rapid development of artificial intelligence (AI) and sensor technologies, researchers have begun exploring the use of machine learning and motion scoring functions to improve the precision and automation of rehabilitation assessments [24,25,26]. Machine learning techniques can extract key features from multidimensional data and model complex feature relationships [27,28], offering new possibilities for rehabilitation assessments. However, the widespread application of machine learning methods faces a critical challenge: model training typically requires large amounts of labeled data [29]. Given the high cost of rehabilitation data collection and the limited availability of patient data, ensuring model accuracy and robustness in small-sample scenarios has become a significant challenge. At the same time, while existing motion scoring methods can quantify patients’ motor abilities, they rely on the precision of the data and reasonable standards, making it difficult to adapt to the personalized needs of different patients. For example, scoring methods often compare results against fixed reference standards, lacking the flexibility to adjust assessments based on individual differences, which limits their applicability in designing personalized rehabilitation plans [30].

Therefore, developing an efficient, accurate, and interpretable rehabilitation assessment model for small-sample data scenarios has become a key focus and challenge in current research. To address this issue, researchers need to innovate in data expansion, model design, and feature extraction to overcome the challenge of limited rehabilitation data samples, while improving model adaptability and practicality to provide scientific support for personalized rehabilitation training.

To address these challenges, this study proposes an intelligent rehabilitation assessment method that combines the rehabilitation matching value (RMV) with machine learning. The RMV quantifies the recovery level of the affected hand by comparing the motion data and task performance of the unaffected and affected hands, effectively capturing subtle changes in finger function. To address the issue of insufficient small-sample data, this study adopts an interpolation method, identified through experimental comparisons [31,32], as a suitable approach for expanding the dataset. This method is used to generate additional potential rehabilitation state samples, thereby enhancing the generalization ability of machine learning models. By integrating machine learning models such as Random Forest (RF), Support Vector Machines (SVMs), and neural networks (NNs) [33,34], this study validates and predicts the calculation results of rehabilitation matching values, aiming to improve the sensitivity and accuracy of rehabilitation assessment.

Compared to existing research, this study offers the following innovations:

(1): Introduction of RMV: the core metric RMV is proposed to comprehensively quantify patients’ recovery levels, providing a scientific basis for designing personalized rehabilitation training plans.
(2): Data Expansion: the use of interpolation methods to expand small-sample datasets addresses the reliance of traditional machine learning methods on large-scale labeled data.
(3): Model Validation: by combining multiple machine learning models, the study validates the effectiveness of the RMV calculation model, offering an efficient and flexible solution for intelligent rehabilitation assessment.

The structure of this study is organized as follows: Firstly, the RMV calculation model is introduced, including the definitions and calculation methods for single-finger matching value, task matching value, and comprehensive RMV. Next, the data expansion methods and feature extraction process are described, with a focus on the application of interpolation methods and their impact on data distribution consistency and model performance. Then, the effectiveness of the RMV calculation model is validated using three machine learning models: Random Forest, SVM, and neural networks. Finally, the experimental results are analyzed and discussed, highlighting the main contributions of this study and suggesting directions for future optimization.

2. Method

This study proposes an intelligent rehabilitation assessment method for small-sample scenarios, which calculates rehabilitation matching values based on feature values. After data expansion and normalization, the method is combined with machine learning to select the optimal model for assessment. The detailed process framework is shown in Figure 1.

2.1. Rehabilitation Matching Value Calculation Model

The rehabilitation matching value (RMV) quantifies the recovery level of the affected hand by comparing the motion data and task performance of the unaffected and affected hands. The core of the model includes the following three indicators:

Single-finger matching value ($M$): evaluates the recovery effect of an individual finger.

Task matching value ($M_{task}$): assesses the overall performance of the hand in specific tasks.

Comprehensive rehabilitation matching value ($M_{c}$): provides an overall evaluation of the hand’s recovery level in daily life.

Building upon the team’s previous research [35], the detailed calculation formulas are as follows:

$$M = \left(1 - \sum_{j=1}^{4} \omega_{j} \frac{|V_{Hj} - V_{Dj}|}{\max(|V_{Hj}|, |V_{Dj}|)}\right) \times 100\% \qquad (1)$$

In this study, $M$ represents the single-finger matching value, where $V_{Hj}$ denotes the value of the j-th feature parameter for the unaffected hand, and $V_{Dj}$ denotes the value of the j-th feature parameter for the affected hand. $V$ represents the fingertip pressure $F$, fingertip motion angle $A$, and task completion score $T$. The weight coefficients $\omega_{j} \in \{\omega_{1}, \omega_{2}, \omega_{3}, \omega_{4}\}$ are defined such that $\omega_{1} + \omega_{2} + \omega_{3} + \omega_{4} = 1$, ensuring data normalization.

The initial weight coefficients are set equally ($\omega_{1} = \omega_{2} = \omega_{3} = \omega_{4}$), indicating that each feature has equal importance in finger motion. In practical applications, the weights are adjusted based on experimental feedback to account for different feature contributions. For example, in this study, the initial weights were each set to 0.25, but after multiple trials, they were adjusted to 0.3, 0.3, 0.2, and 0.2, respectively.

During computation, the weighted differences for each feature are calculated by determining the parameter differences between the two hands and normalizing the values. The normalized differences are then multiplied by their respective weight coefficients. The weighted differences are summed, and the percentage deviation is calculated to obtain the single-finger matching value $M$. The closer $M$ is to 100%, the better the recovery effect of the finger.

With equal initial weights, this method gives each data feature equal importance in assessing rehabilitation effects, enabling a comprehensive evaluation of the recovery of finger movements. The resulting single-finger matching value $M$ provides a quantitative measurement of the rehabilitation effect. As a similarity metric, $M$ is calculated using a distance measure: the numerator computes the absolute difference between the two hands, while the denominator normalizes it into a relative difference. By weighting motion data features and task completion scores, the single-finger matching value is derived, with $M$ ranging between 0 and 1 (that is, between 0% and 100%).
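As a minimal sketch of Equation (1), the following Python function computes the single-finger matching value from two feature vectors. The function name, the feature order (assumed here to be fingertip force, two angle components, and task score), and the handling of a feature that is zero on both hands are illustrative assumptions, not details given in the study. The default weights are the adjusted values reported above.

```python
from typing import Sequence


def single_finger_matching_value(
    healthy: Sequence[float],
    affected: Sequence[float],
    weights: Sequence[float] = (0.3, 0.3, 0.2, 0.2),
) -> float:
    """Return the single-finger matching value M in percent (Equation (1))."""
    if not (len(healthy) == len(affected) == len(weights)):
        raise ValueError("healthy, affected and weights must have equal length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    deviation = 0.0
    for w, v_h, v_d in zip(weights, healthy, affected):
        scale = max(abs(v_h), abs(v_d))
        # Assumption: a feature that is zero on both hands counts as a
        # perfect match, which also avoids division by zero.
        if scale == 0:
            continue
        deviation += w * abs(v_h - v_d) / scale
    return (1.0 - deviation) * 100.0


# Hypothetical feature values, for illustration only.
healthy_hand = (10.0, 40.0, 30.0, 5.0)
affected_hand = (8.0, 30.0, 24.0, 4.0)
print(single_finger_matching_value(healthy_hand, affected_hand))
```

With these made-up inputs the normalized differences are 0.2, 0.25, 0.2, and 0.2, so the weighted deviation is 0.215 and $M$ is about 78.5%.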

$$M_{task} = \frac{1}{n} \sum_{k=1}^{n} M_{k} \qquad (2)$$

In Equation (2), $M_{task}$ represents the task matching value, $n$ is the number of fingers, and $M_{k}$ is the single-finger matching value for the k-th finger. This formula calculates the matching value of the affected hand relative to the unaffected hand for each completed task.

$$M_{c} = \frac{\sum_{i=1}^{m} \omega_{i} M_{task,i}}{m}, \quad \text{s.t.} \ \sum_{i=1}^{m} \omega_{i} = 1 \qquad (3)$$

In Equation (3), $M_{c}$ represents the comprehensive rehabilitation matching value, $m$ is the number of tasks, $M_{task,i}$ is the task matching value of the i-th task, and $\omega_{i}$ denotes the weight of each task, which can be adjusted through experiments or determined by expert scoring. The comprehensive rehabilitation matching value reflects the overall progress of the patient’s recovery; the higher the value, the closer the functionality of the affected hand is to that of the unaffected hand.

This method ensures that each motion feature holds equal importance in the evaluation, providing a comprehensive assessment of the recovery of finger movements. It is worth noting that when only a single task is involved, $M_{task}$ serves as the task matching value, which equals $M_{c}$, the comprehensive rehabilitation matching value.
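The aggregation in Equations (2) and (3) can be sketched as follows. The function names are hypothetical, and the comprehensive value is implemented exactly as the formula is printed, including the division by the number of tasks $m$. The example values are made up for illustration.

```python
from typing import Sequence


def task_matching_value(finger_values: Sequence[float]) -> float:
    """Equation (2): arithmetic mean of the single-finger matching values."""
    if not finger_values:
        raise ValueError("at least one single-finger value is required")
    return sum(finger_values) / len(finger_values)


def comprehensive_matching_value(
    task_values: Sequence[float],
    task_weights: Sequence[float],
) -> float:
    """Equation (3) as printed: weighted sum of task values divided by m."""
    if not task_values or len(task_values) != len(task_weights):
        raise ValueError("task_values and task_weights must match and be non-empty")
    if abs(sum(task_weights) - 1.0) > 1e-9:
        raise ValueError("task weights must sum to 1")
    m = len(task_values)
    weighted_sum = sum(w * v for w, v in zip(task_weights, task_values))
    return weighted_sum / m


# Hypothetical matching values (in percent) for thumb, index and middle finger.
m_task = task_matching_value([78.5, 81.5, 80.0])
# Single-task case: the only weight is 1, so M_c equals M_task.
m_c = comprehensive_matching_value([m_task], [1.0])
print(m_task, m_c)
```

The single-task call reproduces the special case noted above, in which the comprehensive value and the task matching value coincide.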

2.2. Data Expansion and Feature Extraction

2.2.1. Data Source

The data used in this study were derived from previous rehabilitation experiments, which included motion data and task scores for the thumb, index finger, and middle finger during three daily tasks: drinking water, typing, and flipping pages. These data were collected using the following devices:

Thin-Film Pressure Sensor 402 (FSR402, Kechuang Electronics Store, Henan, China): used to measure fingertip force ($F$), with a measurement range of 0–110 N, a linear error of ±3%, and a repeatability error of ±2.5%.

Three-Axis Digital Gyroscope (WT9011DCL, Shenzhen Wit Intelligence Technology Co., Ltd., Shenzhen, China): used to measure fingertip angles ($A_{x}$ and $A_{y}$), with a measurement range of ±2000°/s and a resolution of 0.061°/s.

During data collection, the motion data for the healthy hand were measured first, followed by the data for the affected hand. The data collection process is shown in Figure 2. Each task was repeated multiple times (10 repetitions in this study), and the average value was taken to reduce random errors and single-measurement deviations. Task completion scores ($T$) were assessed using a standardized questionnaire, which recorded task completion time, accuracy, and stability.

All participants provided informed consent, and the study was approved by the Ethics Committee of Guangdong University of Technology (protocol code GDUTXS2024035; approval date: 5 March 2024). The participants were fully informed of the study details and signed consent forms. Prior to the validation experiments, the usability of the devices was verified through pilot testing. The inclusion criteria required young and middle-aged patients (18–40 years old) with a post-stroke duration of more than six months and significant unilateral hand dysfunction. Patients with a post-stroke duration of over six months were selected because their condition is stable, the acute-phase response has subsided, and the effects of rehabilitation training are easier to observe and evaluate. Additionally, at this stage, patients have often completed basic rehabilitation, but significant hand dysfunction persists, requiring precise assessment and targeted training. Patients with unilateral hand dysfunction have issues concentrated on the affected side, making it easier to quantify and evaluate finger function. Two typical young stroke patients (Brunnstrom stage IV) were recruited for validation; both demonstrated a certain level of motor ability and the capacity to complete long-term follow-up experimental tasks. During the experiment, researchers provided necessary guidance and assistance to ensure the smooth progress of the study while avoiding biases caused by functional impairments.

2.2.2. Data Expansion Method

Due to the limited sample size of rehabilitation data, this study employs an interpolation method to expand the dataset and improve the generalization ability of the model.

The linear interpolation method is designed based on the time series of rehabilitation tasks. The generated data points are calculated through the weighted linear relationship of adjacent sampling points, allowing the method to generate data points suitable for low-dimensional features (e.g., fingertip force and motion angle) while maintaining the consistency of the original data distribution. Mathematically, this ensures the reasonableness of transitions between data points. Compared to SMOTE, which may introduce randomness in the generation of high-dimensional samples, and spline interpolation, which may introduce unrealistic smoothing effects into the existing data, linear interpolation avoids the distribution shift problems that complex regression models might cause. It is more suitable for rehabilitation task data with low-dimensional features, especially in small-sample time-series data scenarios. The linear interpolation method can generate high-quality samples with low complexity, meeting the specific needs of rehabilitation assessment.

The interpolation method generates additional samples between the original data points, simulating more possible rehabilitation states and enhancing the model’s adaptability to small-sample data.

The interpolation formula for generating an interpolated sample $S_{i}$ between any two data points $A$ and $B$ is as follows:

$$S_{i} = A + \frac{i}{a + 1} (B - A), \quad i \in \{1, 2, \dots, a\} \qquad (4)$$

Here, $A$ and $B$ represent two adjacent original data points; $a$ is the number of interpolated samples; and $i$ is the index of the interpolated sample. For example, with $a = 3$ the interpolated samples lie one quarter, one half, and three quarters of the way from $A$ to $B$. Using this interpolation method, the required number of interpolated samples is generated between each pair of original data points, significantly expanding the size of the dataset.
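A minimal sketch of this linear interpolation scheme is shown below. The function names and the example feature vectors are hypothetical; the arithmetic follows Equation (4), with the index running over the number of interpolated samples inserted between each adjacent pair.

```python
from typing import List, Sequence


def interpolate_between(
    point_a: Sequence[float],
    point_b: Sequence[float],
    count: int,
) -> List[List[float]]:
    """Equation (4): return `count` evenly spaced samples strictly between A and B."""
    if len(point_a) != len(point_b):
        raise ValueError("both points must have the same number of features")
    if count < 0:
        raise ValueError("count must be non-negative")
    samples = []
    for i in range(1, count + 1):
        fraction = i / (count + 1)
        samples.append([x + fraction * (y - x) for x, y in zip(point_a, point_b)])
    return samples


def expand_series(points: Sequence[Sequence[float]], count: int) -> List[List[float]]:
    """Insert `count` interpolated samples between every adjacent pair of points."""
    if not points:
        return []
    expanded = [list(points[0])]
    for previous, following in zip(points, points[1:]):
        expanded.extend(interpolate_between(previous, following, count))
        expanded.append(list(following))
    return expanded


# Hypothetical two-feature samples (for example force and angle) in time order.
series = [[0.0, 10.0], [4.0, 2.0], [8.0, 6.0]]
print(expand_series(series, count=3))
```

A series of $k$ original points expanded this way contains $k + (k - 1)a$ samples, since the original points are kept and $a$ new samples are added in each of the $k - 1$ gaps.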

To ensure the reasonableness and validity of the interpolated data, this study explored three data augmentation methods (linear interpolation, spline interpolation, and SMOTE) to generate additional training samples. These methods were combined with three machine learning models (Random Forest, Support Vector Machine, and neural network) to evaluate their effectiveness in improving model performance. A t-test was conducted to statistically validate the model performance of the three data augmentation methods.

2.2.3. Feature Extraction

The input features for the rehabilitation assessment model include fingertip force, fingertip angle, and task completion scores. These features comprehensively reflect the motor ability and task performance of the fingers.

Fingertip Force: the mean value of the pressure measurements obtained from the thin-film pressure sensor is used.

Fingertip Angle: the mean value of angle changes recorded by the gyroscope is used as the fingertip angle.

Task Completion Score: the mean value of scores from relevant evaluation items is used.

We designed a multidimensional task scoring scale, as shown in Table A1 of Appendix A, to evaluate task performance using a standardized scoring system. The tasks are tailored based on individual training content. It is important to note that although the questionnaire includes a quantitative assessment of task completion time, task completion time should be recorded independently as a reference for tracking rehabilitation progress.

To eliminate the influence of different feature scales, all features are standardized before being input into the machine learning model:

$$X_{norm} = \frac{X - \mu}{\delta} \qquad (5)$$

where $X$ represents the original feature value, $\mu$ is the mean of the feature, and $\delta$ is the standard deviation.
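The standardization in Equation (5) can be sketched with the Python standard library as follows. The function name is hypothetical, and the use of the population standard deviation, as well as the handling of a constant feature, are assumptions, since the study does not state which variant was applied.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Equation (5): subtract the feature mean and divide by its standard deviation."""
    mu = fmean(values)
    # Assumption: population standard deviation (divides by N, not N - 1).
    delta = pstdev(values)
    if delta == 0:
        # Assumption: a constant feature carries no information, so map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / delta for x in values]


# Hypothetical fingertip force readings in newtons.
print(standardize([2.0, 4.0, 6.0]))
```

After this transformation each feature has zero mean and unit standard deviation, so features measured in different units contribute on a comparable scale.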

2.2.4. Data Feature Analysis

Statistical analysis of the expanded dataset revealed the following patterns:

Correlation Between Fingertip Force and Task Scores: samples with higher fingertip force generally have higher task scores, indicating that force is a key factor in task completion.

Task Dependency on Fingertip Angles: Different tasks have varying requirements for fingertip angles. For example, the flipping pages task shows a higher dependency on $A_{x}$, while the drinking water task is more dependent on $A_{y}$.

Differences Between Healthy and Affected Hands: the feature values of the healthy hand exhibit a more concentrated distribution, whereas the feature values of the affected hand are more dispersed, reflecting the instability of functionality during the rehabilitation process.

2.2.5. Interpretability of Data Features

To enhance the interpretability of the model, this study analyzed the importance of features. The results indicate that fingertip force ($F$) and task scores ($T$) contribute the most to the rehabilitation matching value. Fingertip angles ($A_{x}$ and $A_{y}$) have higher weights in fine motor tasks, such as typing.

Through feature extraction and analysis, this study lays a foundation for the subsequent training and validation of machine learning models, while also providing a scientific basis for rehabilitation assessment.

2.3. Machine Learning Model

2.3.1. Model Selection

In this study, the rehabilitation matching value (RMV) is used as the core metric to quantify the rehabilitation level of the affected hand relative to the healthy hand. Due to the limitations of small-sample data, the interpolation method is employed to expand the dataset, and machine learning models are used to validate and predict the RMV. The following three machine learning models were selected, each suited to different task requirements:

(1): Random Forest (RF): Random Forest is an ensemble learning method based on decision trees, capable of effectively handling small-sample data with strong resistance to overfitting. Advantages: Handles nonlinear relationships effectively. Ranks feature importance, making model results easier to interpret. Robust to missing data and noise.
(2): Support Vector Machine (SVM): SVM is a classic small-sample learning model suitable for high-dimensional data and nonlinear classification problems. Advantages: Captures complex feature relationships by mapping data to higher-dimensional spaces using kernel functions (e.g., RBF kernel). Exhibits good generalization ability for small-sample data.
(3): Neural Network (NN): Neural networks have powerful nonlinear modeling capabilities and can capture complex feature interactions. Although neural networks typically require larger datasets, they can be adapted to small-sample scenarios by adjusting the network structure (e.g., reducing the number of hidden layers and neurons). Advantages: Automatically learns complex relationships between features. Suitable for multitask learning and regression problems.

2.3.2. Integration of Interpolation and Machine Learning

Due to the small sample size of rehabilitation data, this study uses the interpolation method to expand the dataset and enhance the training effectiveness of machine learning models. The interpolation method generates additional samples between original data points, simulating more possible rehabilitation states based on Equations (4) and (5), addressing the issue of insufficient small-sample data.

The roles of the interpolation method include the following:

(1): Data Expansion: generates additional samples to expand the small-sample dataset and improve the model’s generalization ability.
(2): Data Smoothing: the interpolated samples smooth the data distribution, reducing the impact of noise on model training.
(3): Feature Enrichment: the expanded data retain the original feature distribution while increasing sample diversity.

Workflow for integrating interpolation with machine learning:

(1): Interpolation Data Generation: perform interpolation on the original data to generate additional sample points and expand the dataset.
(2): Feature Extraction: extract rehabilitation matching values ($M$, $M_{task}$, $M_{c}$) and other motion features (e.g., fingertip force, angles, task scores) from the interpolated dataset.
(3): Model Training: input the expanded dataset into machine learning models for training and optimize model parameters.
(4): Model Validation: evaluate model performance on the test set to verify the effectiveness of the interpolated data expansion.

2.3.3. Model Input and Output

To integrate the rehabilitation matching value with machine learning models, the output of the RMV calculation model is used as input features for the machine learning models, while task completion scores and the patient’s rehabilitation status are used as target variables.

Input Features:

(1): Rehabilitation Matching Value-Related Features: single-finger matching value ($M$), task matching value ($M_{task}$), rehabilitation matching value ($M_{c}$).
(2): Motion Data Features: Fingertip force ($F_{H}$, $F_{D}$). Fingertip angles ($A_{xH}$, $A_{xD}$, $A_{yH}$, $A_{yD}$). Task completion scores ($T_{H}$, $T_{D}$).
(3): Task Features: Task type (e.g., drinking water, typing, flipping pages). Task difficulty (quantified through scoring).

Output Targets:

(1): Classification Task: predict the patient’s rehabilitation status.
(2): Regression Task: predict the patient’s rehabilitation matching value ($M_{c}$).

2.3.4. Model Training and Validation

To validate the performance of the machine learning models, this study adopts a cross-validation approach and uses the following evaluation metrics:

(1): Evaluation Metrics:

Accuracy: measures the overall correctness of the model’s predictions.
Precision: measures the accuracy of the model when predicting positive classes.
Recall: measures the model’s ability to identify positive class samples.
F1 Score: the harmonic mean of precision and recall, providing a comprehensive evaluation of model performance.
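
As an illustration of how these four metrics relate to one another, the sketch below computes them from true and predicted labels. It assumes a binary labelling of rehabilitation status with a designated positive class; the study does not spell out the class definitions, so the labels and the function name here are hypothetical. For more than two classes, the per-class values would additionally need to be averaged.

```python
from typing import Dict, Hashable, Sequence


def binary_metrics(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    positive: Hashable = 1,
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(truth, predicted))
```

In this made-up example there are three true positives, three true negatives, one false positive, and one false negative, so all four metrics equal 0.75.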

(2): Training and Validation Workflow:

Data Preprocessing: Standardize input features to eliminate the influence of different feature scales. Use the interpolation method to expand the small-sample dataset and enhance the model’s generalization ability.
Model Training: Split the dataset into training and test sets. Use 5-fold cross-validation to optimize model parameters and prevent overfitting.
Model Validation: Evaluate model performance on the test set, recording accuracy, precision, recall, and F1 score. Compare the performance of different models and select the optimal one.

3. Experiments and Results

3.1. Experimental Design

Dataset Division

To validate the effectiveness of the rehabilitation matching value (RMV) calculation model and evaluate the performance of machine learning models on small-sample data, the expanded dataset was divided into training and test sets. The specific division is as follows:

(1): Data Source: The original data were collected from three tasks (drinking water, typing, flipping pages) and included motion data (fingertip force, angle) and task completion scores for both the healthy hand and the affected hand. They reflect the differences in patients’ hand function and rehabilitation progress. The original dataset consists of 90 samples (30 samples for each task). The original data are divided so that they can be used both to generate augmented samples and to evaluate the final performance.
(2): Data Expansion: To address the small size of the original dataset, this study employed three data expansion methods: (a) Linear Interpolation (the proposed method): Intermediate points are generated by weighted averaging of adjacent sample points for each task. New samples are inserted in equal proportions, ensuring that the motion features of the newly added samples remain within the original data distribution, maintaining consistency and physical plausibility. (b) Cubic Spline Interpolation: A cubic spline function is constructed between adjacent observation points, and feature points are inserted along the fitted curve to ensure a smooth transition in the expanded training data. (c) SMOTE (Synthetic Minority Oversampling Technique): Oversampled data are generated by randomly inserting points in the feature space of the training samples. After expansion, a final dataset containing 300 samples (100 samples for each task) was formed.
(3): Dataset Split Ratio: The dataset was divided into training and test sets in an 8:2 ratio. The training set contained 240 samples (80 samples for each task) and was used for building machine learning models. The test set contained 60 samples (20 samples for each task) and was used to independently evaluate the generalization ability of the models. Five-fold cross-validation was applied within the training data to optimize model parameters and prevent overfitting.
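
The split described above can be sketched with the standard library as follows. The helper names, the random shuffling, and the seed are assumptions for illustration; the study does not state how samples were assigned to each subset. The sketch reproduces only the stated proportions: an 8:2 split applied per task, followed by five folds within the training portion.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_by_task(
    task_labels: Sequence[str],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with the same test share for every task."""
    rng = random.Random(seed)
    by_task: Dict[str, List[int]] = {}
    for index, task in enumerate(task_labels):
        by_task.setdefault(task, []).append(index)

    train: List[int] = []
    test: List[int] = []
    for indices in by_task.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return sorted(train), sorted(test)


def k_fold_indices(
    indices: Sequence[int],
    k: int = 5,
    seed: int = 0,
) -> List[Tuple[List[int], List[int]]]:
    """Return k (fit_indices, validation_indices) pairs covering `indices`."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    pairs = []
    for i in range(k):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((fit, folds[i]))
    return pairs


# 300 expanded samples, 100 per task, as stated in the dataset description.
labels = ["drinking"] * 100 + ["typing"] * 100 + ["flipping"] * 100
train_idx, test_idx = split_by_task(labels)
folds = k_fold_indices(train_idx)
print(len(train_idx), len(test_idx), [len(val) for _, val in folds])
```

With 300 samples this yields 240 training and 60 test indices, and each of the five validation folds holds 48 of the training indices.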

Experimental Tasks

The experimental tasks were based on three daily activities (drinking water, typing, flipping pages) to validate the effectiveness of the RMV calculation model and evaluate the performance of machine learning models:

(1): Drinking Water Task:

Objective: evaluate the grip strength and stability of the fingers.

Data Collection: record fingertip force, fingertip angle, and task completion scores.

Task Characteristics: requires relatively low finger functionality, primarily assessing grip ability.

(2): Typing Task:

Objective: evaluate fine motor control and coordination of the fingers.

Data Collection: record fingertip force, angle changes, and task completion scores.

Task Characteristics: requires high flexibility and fine control of the fingers.

(3): Flipping Pages Task:

Objective: evaluate the flexibility and coordination of the fingers.

Data Collection: record fingertip force, angle changes, and task completion scores.

Task Characteristics: requires finger functionality that is intermediate between the drinking water and typing tasks.

Experimental Procedure

(1): Rehabilitation Matching Value Calculation: Use the RMV calculation formulas (Equations (1)–(3)) to compute the single-finger matching value, task matching value, and overall rehabilitation matching value for each task. Compare the motion data of the healthy hand and the affected hand to quantify the rehabilitation level of the affected hand.
(2): Dataset Expansion: the data were expanded separately using linear interpolation (the proposed method), cubic spline interpolation, and SMOTE (Synthetic Minority Oversampling Technique).
(3): Validation of Data Augmentation Methods Combined with Machine Learning Models: Model Input: features related to rehabilitation matching values and motion data characteristics. Model Output: predicted rehabilitation status.

Model Selection

(1): Random Forest: Parameters: number of decision trees set to 100, maximum tree depth set to 10. This model is suitable for analyzing feature importance and modeling nonlinear relationships.
(2): Support Vector Machine: Parameters: kernel function set to RBF kernel, penalty parameter C set to 1.0. This model performs well in nonlinear classification and is suitable for small-sample scenarios.
(3): Neural Network: A three-layer fully connected network with 64, 32, and 16 neurons in the hidden layers, respectively. The Adam optimizer is used with a learning rate of 0.001. The loss function is cross-entropy loss, the number of training epochs is 100, and the activation function is the Rectified Linear Unit (ReLU).
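The three configurations listed above could be expressed as follows. The source does not name the software library used, so scikit-learn is an assumption here, as are the function name `build_models` and the `seed` argument. In this sketch `max_iter=100` stands in for the 100 training epochs, and `MLPClassifier` minimizes log-loss (cross-entropy) for classification.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC


def build_models(seed: int = 0) -> dict:
    """Return the three classifiers with the hyperparameters stated in the text."""
    return {
        # 100 trees, maximum depth 10.
        "random_forest": RandomForestClassifier(
            n_estimators=100, max_depth=10, random_state=seed
        ),
        # RBF kernel, penalty parameter C = 1.0.
        "svm": SVC(kernel="rbf", C=1.0),
        # Hidden layers of 64, 32, and 16 units; ReLU; Adam with lr = 0.001.
        "neural_network": MLPClassifier(
            hidden_layer_sizes=(64, 32, 16),
            activation="relu",
            solver="adam",
            learning_rate_init=0.001,
            max_iter=100,
            random_state=seed,
        ),
    }


models = build_models()
for name, model in models.items():
    print(name, type(model).__name__)
```

Fixing `random_state` only makes the sketch repeatable; the source does not report the seeds used.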

Validation Method: Five-fold cross-validation was used to evaluate model performance, with 80 percent of the data used for training and 20 percent for testing. The following metrics were recorded: Accuracy: measures the overall correctness of the model’s predictions. Precision: measures the proportion of positive predictions that are correct. Recall: measures the model’s ability to identify positive samples. F1 Score: the harmonic mean of precision and recall, providing a balanced evaluation of the two.
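For reference, the four metrics can be computed from confusion-matrix counts as in the sketch below. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study's results.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a multi-class problem such as the three tasks here, these quantities are usually computed per class and then averaged; the source does not state which averaging scheme was used.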

3.2. Experimental Results

3.2.1. Rehabilitation Matching Value Calculation Results

Using the rehabilitation matching value (RMV) calculation formulas (Equations (1)–(3)), a quantitative analysis was conducted on the motion features and task scores of the thumb, index finger, and middle finger across the three tasks (drinking water, typing, flipping pages). The results are shown in Table 1.

Result Analysis

The overall rehabilitation matching value ( $M_{C}$ ) of the hand is 79.83%.

(1): Drinking Water Task: Overall Matching Value ( $M_{task}$ ): 82.66%. The drinking water task has relatively low requirements for finger functionality, and the performance of the affected hand is close to that of the healthy hand, resulting in a high overall matching value. The index finger has the highest matching value ( $M$ ) at 91.47%, indicating better recovery in grip strength and stability. The middle finger has the lowest matching value ( $M$ ) at 74%, possibly due to its lower involvement in the task.
(2): Typing Task: Overall Matching Value ( $M_{task}$ ): 74.12%. The typing task requires high precision and fine motor control, and the performance of the affected hand shows a significant gap compared to the healthy hand, resulting in the lowest overall matching value. The thumb has the lowest matching value ( $M$ ) at 65.28%, indicating slower recovery in fine control and coordination. The index finger has the highest matching value ( $M$ ) at 82.34%, showing better recovery in fine motor control.
(3): Flipping Pages Task: Overall Matching Value ( $M_{task}$ ): 82.70%. The flipping pages task has moderate requirements for finger functionality; its overall matching value is nearly identical to that of the drinking water task (82.66%) and well above that of the typing task. The index finger has the highest matching value ( $M$ ) at 91.55%, indicating better recovery in flexibility and coordination. The thumb has the lowest matching value ( $M$ ) at 73.72%, possibly due to insufficient flexibility during the task.
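As a quick arithmetic check, the reported overall value is reproduced by an equally weighted mean of the three task values. Equal weighting is an assumption made here for illustration; Equation (3) defines the actual aggregation and may use different weights. The function name `overall_rmv` is hypothetical.

```python
# Task-level matching values (percent) as reported in the text above.
task_rmv = {
    "drinking water": 82.66,
    "typing": 74.12,
    "flipping pages": 82.70,
}


def overall_rmv(task_values, weights=None):
    """Weighted mean of task matching values (equal weights by default)."""
    if weights is None:
        # Assumption: every task contributes equally.
        weights = {name: 1.0 for name in task_values}
    total_weight = sum(weights[name] for name in task_values)
    weighted_sum = sum(task_values[name] * weights[name] for name in task_values)
    return weighted_sum / total_weight


overall = overall_rmv(task_rmv)
print(f"{overall:.2f}%")  # 79.83%

# Tasks ordered from highest to lowest matching value.
ranking = sorted(task_rmv, key=task_rmv.get, reverse=True)
print(ranking)
```

Passing a `weights` dictionary shows how the overall value would shift if some tasks were emphasized over others.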

Comparison Between Tasks

The drinking water task (82.66%) and the flipping pages task (82.70%) have nearly identical overall matching values, both well above that of the typing task (74.12%). The drinking water task has the lowest requirements for finger functionality and is suitable as an early-stage rehabilitation training task. The typing task has the lowest overall matching value, reflecting its high demands for fine motor control, making it suitable for mid-to-late-stage rehabilitation training. The flipping pages task, whose functional requirements lie between those of the other two tasks, is suitable as a transitional task during the rehabilitation process.

3.2.2. Performance of Machine Learning Models

Table 2 reports the performance of the machine learning models on the original data, before augmentation. Table 3 reports model performance after augmentation, where the machine learning models were used to validate the calculated rehabilitation matching values.

Result Analysis

As shown in Table 2 and Table 3, this study compared the performance changes in machine learning models before and after data augmentation, as well as the performance of different data augmentation methods (SMOTE, cubic spline interpolation, linear interpolation) combined with machine learning models (Random Forest, neural network, Support Vector Machine). The following analysis focuses on the effectiveness of each data augmentation method in improving model performance and its statistical significance, and further examines the adaptability of each model based on these results.

Comparative Analysis of Data Augmentation Methods

(1): Overall Advantage of Linear Interpolation:

Linear interpolation demonstrated the best overall performance. The augmented data generated by this method better preserve the physical plausibility and distributional continuity of the original data, significantly improving the overall performance of the machine learning models, particularly the neural network. The combination of the neural network and linear interpolation achieved an accuracy of 80% and an F1 score of 77% (±0.01), with accuracy, precision, and recall all at similar levels, making it the best result among all test cases. The statistical test results showed the following:

Linear interpolation vs. SMOTE: t = 8.33, p < 0.01, highly significant difference.

Linear interpolation vs. cubic spline interpolation: t = 3.92, p < 0.05, significant difference.

Overall, linear interpolation significantly outperformed SMOTE and cubic spline interpolation in improving the F1 score and overall classification performance, validating its applicability and superior performance in augmenting small-sample datasets.

(2): Cubic Spline Interpolation:

Cubic spline interpolation generated samples with good smoothness and continuity. However, excessive smoothing may have weakened the detailed characteristics of the data, reducing the model’s ability to capture subtle features. Compared to SMOTE, cubic spline interpolation showed a more significant improvement in model performance but still slightly lagged behind linear interpolation, with its overall performance falling between the two.

(3): Limitations of SMOTE:

Although SMOTE alleviated the small-sample problem by balancing the class distribution, the generated samples may deviate from the original data distribution, impairing the model’s ability to learn features. For the Random Forest and SVM models, the performance improvement with SMOTE was limited, with the F1 scores only reaching 68–69%, significantly lower than those achieved with linear interpolation and cubic spline interpolation. Overall, SMOTE exhibited low adaptability in small-sample augmentation scenarios, with the least improvement in performance.
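The statistical comparison in item (1) above reports t values without naming the test. As a hedged illustration only, the sketch below computes a paired t statistic over per-fold F1 scores, which is one common way to compare two methods evaluated on the same folds. The helper name `paired_t_statistic` and the per-fold scores are hypothetical placeholders, not values from the study.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-fold F1 scores (NOT taken from the paper).
linear_f1 = [0.78, 0.76, 0.77, 0.78, 0.76]
smote_f1 = [0.69, 0.68, 0.70, 0.68, 0.69]

t_stat, dof = paired_t_statistic(linear_f1, smote_f1)
print(f"t = {t_stat:.2f}, df = {dof}")
```

The p-value would then be read from a t distribution with `dof` degrees of freedom; with only five folds the test has low power, so such comparisons are best treated as indicative.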

Comparative Analysis of Machine Learning Model Performance

Based on Table 3, Figure 3 (confusion matrix), and Figure 4 (ROC curve), the performance of each machine learning model is analyzed as follows:

(1): Random Forest:

Random Forest demonstrated relatively robust performance on the augmented small-sample data, achieving an accuracy of 75% and an F1 score of 72% (±0.01). It showed good adaptability to the augmented data. Its strengths lie in feature importance ranking and interpretability of classification results, maintaining stability and resistance to overfitting. However, the ROC curve of Random Forest indicates slightly lower classification ability compared to neural networks. Additionally, the confusion matrix reveals that Random Forest misclassified more samples than neural networks, suggesting that its ability to capture features in the augmented data is slightly weaker than that of neural networks and SVM.

(2): Neural Network:

The neural network combined with the linear interpolation method exhibited the best classification performance, achieving an accuracy of 80% and an F1 score of 77% (±0.01). In Figure 4, the ROC curve of the neural network lies almost entirely in the top-left corner, with a reported AUC of 1.00, indicating that the model's scores separate the positive and negative classes almost perfectly. The confusion matrix in Figure 3 shows that, despite this high AUC, the thresholded predictions still contained misclassifications: seven positive samples were predicted as negative, and seven negative samples were predicted as positive. Moreover, the neural network effectively leveraged the continuous data generated by the linear interpolation method, flexibly capturing complex feature interactions, making it the best-performing model for the current task.

(3): Support Vector Machine (SVM):

SVM performed well when combined with the linear interpolation method, achieving an accuracy of 78% and an F1 score of 74% (±0.01), second only to neural networks. Its performance, as shown in Table 3, reflects strong adaptability to the augmented data, particularly in scenarios with nonlinear feature relationships. In Figure 4, the ROC curve of SVM is nearly perfect, with an AUC of 1.00, but its classification accuracy is slightly lower than that of neural networks. From the confusion matrix in Figure 3, SVM misclassified eight positive samples as negative and eight negative samples as positive, slightly higher than the misclassification rate of neural networks.
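The ROC and confusion-matrix results above rest on two different kinds of measurement: AUC depends only on how the predicted scores rank the samples, whereas accuracy and the confusion matrix depend on a decision threshold. The toy sketch below shows how an AUC of 1.00 can coexist with misclassifications at a fixed threshold. The labels and scores are invented for illustration and are not the study's data, `roc_auc` and `accuracy_at` are hypothetical helper names, and the source does not establish whether this mechanism explains the reported figures.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count pairwise wins; ties count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy_at(labels, scores, threshold=0.5):
    """Accuracy when scores >= threshold are predicted as positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data: every positive scores higher than every negative.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.55, 0.60, 0.70, 0.80, 0.90]

auc = roc_auc(labels, scores)
acc_default = accuracy_at(labels, scores, threshold=0.5)
acc_tuned = accuracy_at(labels, scores, threshold=0.58)
print(auc, acc_default, acc_tuned)
```

Here the ranking is perfect, yet one negative sample falls above the default threshold of 0.5 and is misclassified; moving the threshold removes the error without changing the AUC.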

Analysis of Model and Data Augmentation Method Compatibility

Based on Table 3, Figure 3 (confusion matrix), and Figure 4 (ROC curve), the compatibility between models and data augmentation methods is further analyzed as follows:

(1): Neural Network + Linear Interpolation:

This combination achieved the most significant performance improvement. The physical plausibility and linear continuity of the augmented data enabled the neural network to effectively capture complex features, resulting in the best classification performance. This validates its superior adaptability in small-sample scenarios.

(2): Random Forest + Linear Interpolation:

Random Forest demonstrated stable classification performance on the augmented data, achieving an F1 score of 72%, with strong resistance to overfitting. However, its performance in more complex classification tasks was slightly lower than that of the neural network.

(3): SVM + Linear Interpolation:

SVM performed well in capturing nonlinear features and maintaining classification stability, making it a strong complementary solution for small-sample classification tasks.

Overall, this study confirmed the significant advantages of the linear interpolation method in data augmentation. The data generated by this method effectively maintained consistency with the original distribution, significantly improving the classification performance of machine learning models and outperforming both SMOTE (p < 0.01) and cubic spline interpolation (p < 0.05). The combination of linear interpolation and neural networks achieved the best performance, fully leveraging the data to capture complex feature relationships and achieving the highest accuracy and F1 scores. Meanwhile, Random Forest and SVM demonstrated stability and robustness in different scenarios.

Figure 4 (ROC curve) confirmed the superior classification ability of neural networks and SVM (AUC = 1.00), while Figure 3 (confusion matrix) revealed the specific distribution of classification errors, indicating room for improvement in the classification of boundary samples.

This study demonstrates that the combination of linear interpolation and machine learning models is an efficient and reliable solution for small-sample data scenarios, with particularly outstanding performance in neural network applications.

4. Discussion

(1): Advantages of the Method: this study proposes an intelligent rehabilitation assessment method combining the rehabilitation matching value (RMV) with machine learning, offering the following significant advantages:

Advantages in Quantitative Precision: In current rehabilitation assessments, commonly used methods such as the Euclidean distance [36,37], cosine similarity [38,39], or dynamic time warping (DTW) [40,41,42] are typically employed to measure the similarity of motion trajectories and time series between the affected and unaffected sides or standard movements. However, these metrics only capture the external degree of matching between movements, making it difficult to fully quantify subtle changes in fine motor function. The innovation of the RMV lies in its weighted integration of task completion, fingertip force, and fingertip angle data, constructing a more comprehensive, high-dimensional quantitative metric. By introducing a weighted matching algorithm, the RMV can dynamically adjust the weights of different dimensions to adapt to individual patient characteristics, avoiding the limitations of traditional methods that rely on a single perspective for evaluation.

Sensitivity and Accuracy: Traditional metrics often capture functional changes in patients through noticeable or macro-level differences, with limited sensitivity to minor functional improvements or declines. For instance, simple trajectory similarity fails to capture the details of the movement process, such as micro-force control and coordination. The rehabilitation matching value (RMV) calculation model effectively captures subtle changes in finger function by performing quantitative analysis on the motion data and task scores of the unaffected and affected hands, providing a comprehensive reflection of the patient’s rehabilitation level [43]. This offers a more detailed assessment compared to traditional upper limb ability scales for rehabilitation [44].

Adaptability to Small Samples: through the use of interpolation methods to expand small-sample datasets, the generalization ability of machine learning models has been significantly improved, addressing the reliance of traditional machine learning methods on large-scale labeled data [29].

Model Performance: Neural networks, Support Vector Machines, and Random Forest models all demonstrated excellent performance on small-sample data. Among them, the combination of linear interpolation and neural networks exhibited the best task adaptability, supporting the applicability of this model in fine-grained rehabilitation assessment tasks.
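For readers less familiar with the trajectory-similarity measures that the RMV is contrasted with above, the following sketch implements Euclidean distance, cosine similarity, and a basic dynamic time warping (DTW) distance for one-dimensional sequences. It is a minimal illustration of the baseline measures only, not of the RMV itself; the function names and the toy trajectories are assumptions introduced here.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance between two equally long sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long, non-zero sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length vector")
    return dot / norm


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy trajectories (hypothetical): the affected hand traces the same
# shape as the healthy hand, but one time step later.
healthy = [0.0, 1.0, 2.0, 3.0, 3.0]
affected = [0.0, 0.0, 1.0, 2.0, 3.0]

print(euclidean_distance(healthy, affected))
print(cosine_similarity(healthy, affected))
print(dtw_distance(healthy, affected))
```

In this toy case the Euclidean distance penalizes the delay while DTW aligns the two sequences at zero cost. All three measures look only at the shape of the trajectory, which matches the limitation noted above: none of them incorporates task completion scores or fingertip force.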

(2): Task Differences: the experimental results indicate significant differences in finger function requirements across different tasks:

Drinking Water Task: the rehabilitation matching value (RMV) was relatively high (82.66%), indicating that this task has lower finger function requirements and is suitable as a training task in the early stages of rehabilitation.

Typing Task: the RMV was the lowest (74.12%), reflecting the high demands of this task for fine motor control, making it suitable as a training task in the middle to later stages of rehabilitation.

Flipping Pages Task: the RMV was the highest (82.70%), marginally above that of the drinking water task, suggesting that it is suitable as a transitional task from the early to later stages of rehabilitation.

The overall rehabilitation status based on the three tasks was 79.83%, representing the overall functional recovery level of the hand. These results provide a scientific basis for developing personalized rehabilitation training plans, recommending the selection of appropriate training tasks based on task characteristics and the patient’s rehabilitation stage.

(3): Effectiveness of Data Expansion:

Linear Interpolation: the generated samples strictly follow the data distribution rules, ensuring both accuracy and physical consistency, which significantly improves the performance of machine learning models.

Cubic Spline Interpolation: the generated samples exhibit good continuity, but excessive smoothing leads to the loss of high-frequency features [45,46].

SMOTE Method: the generated samples exhibit a certain degree of randomness, which may deviate from the original data distribution, resulting in a decline in model performance [47,48].

The comparison of data augmentation methods validates the applicability of linear interpolation, especially in rehabilitation task data, where the generated samples effectively support the improvement in model performance.

(4): Applicability of Model Selection: the experiments show that different models exhibit distinct characteristics in rehabilitation matching tasks:

Neural Network: Best suited for regression analysis tasks, excelling in capturing complex feature relationships. It is recommended as a priority for rehabilitation status assessment studies.

Support Vector Machine (SVM): performs consistently well in classification tasks, making it a powerful tool for small-sample classification tasks.

Random Forest: although its performance is slightly inferior to the other two models, it offers strong interpretability, making it suitable for feature importance analysis and providing a reliable basis for further optimization of rehabilitation training strategies.

(5): Limitations and Future Directions: although this study preliminarily validated the effectiveness of the RMV calculation model and machine learning methods, there are still the following limitations that need improvement:

Data Scale Limitation: Despite using interpolation methods to expand the dataset, the actual sample size remains relatively small. Future studies can further validate the model’s performance by increasing the number of real samples.

Lack of Task Diversity: This study selected only three daily tasks (drinking water, typing, and flipping pages). In the future, more task types can be included to verify the model’s generalizability.

Dynamic Weight Optimization: The selection of task weights in the RMV calculation currently relies on experimental feedback. In the future, algorithmic optimization methods (e.g., Bayesian optimization) can be employed to achieve personalized weight adjustments.

5. Conclusions

This study proposed an intelligent rehabilitation assessment method based on the combination of the rehabilitation matching value (RMV) and machine learning, aiming to address the challenges of rehabilitation assessment in small-sample data scenarios. By constructing the RMV calculation model, using linear interpolation to expand the dataset, and validating with machine learning models, the study reached the following key conclusions:

(1): Effectiveness of the RMV Calculation Model: the RMV effectively quantifies the rehabilitation level of the affected hand relative to the healthy hand, capturing subtle changes in finger functionality and providing scientific guidance for developing personalized rehabilitation training plans.
(2): Advantages of the Linear Interpolation Method: Linear interpolation significantly expanded the dataset while maintaining consistency with the original data distribution, providing an efficient solution for small-sample data augmentation. The validation results showed that it achieved the most significant performance improvement for neural network models.
(3): Performance of Machine Learning Models: Neural networks performed best in regression tasks, while Support Vector Machines and Random Forest models demonstrated stable performance in classification tasks. These findings suggest that machine learning methods can effectively improve the sensitivity and accuracy of rehabilitation assessments.
(4): Guidance from Task Differences: Different tasks have varying requirements for finger functionality. Rehabilitation training should be tailored to the task characteristics and the patient’s rehabilitation stage to ensure optimal outcomes.

Compared to traditional rehabilitation assessment methods, the proposed method offers significant advantages in flexibility, practicality, and adaptability to small-sample scenarios. It provides an efficient and scientific solution for intelligent rehabilitation assessment. Future research could further optimize data collection and model design, explore a wider variety of tasks, and introduce personalized weight adjustment methods to improve the generalizability and adaptability of the models.

Author Contributions

Conceptualization, H.W. and D.L.; methodology, H.W. and Z.C.; software, H.W.; validation, H.W. and Z.C.; formal analysis, H.W. and Z.C.; investigation, H.W. and Z.C.; resources, D.L., H.Y. and R.Z.; data curation, H.W.; writing—original draft preparation, H.W.; writing—review and editing, H.W.; supervision, D.L.; project administration, D.L.; funding acquisition, D.L. and R.Z. All authors have read and agreed to the published version of the manuscript.

Funding

The authors received no specific funding for this study.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of Guangdong University of Technology (protocol code GDUTXS2024035 and date of approval 5 March 2024).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest to report regarding the present study.

Appendix A

Table A1. Task completion questionnaire and scoring.

Section	Question	Options	Scoring Criteria
Daily Activity Ability	1. Can you independently complete the following daily activities?	Drinking water □ Unable to complete □ Requires assistance to complete □ Can complete independently but unstably □ Completes with ease Flipping through a book □ Unable to complete □ Requires assistance to complete □ Can complete independently but unstably □ Completes with ease Typing □ Unable to complete □ Requires assistance to complete □ Can complete independently but unstably □ Completes with ease Personalized task (please specify) _______ □ Unable to complete □ Requires assistance to complete □ Can complete independently but unstably □ Completes with ease	Unable to complete: 0 points Partially completed: 1 point Independently completed but unstable: 2 points Completes with ease: 3 points
Daily Activity Ability	2. For the tasks above, how was the time required to complete them?	□ Time significantly exceeded normal range, unable to complete independently □ Time was longer than normal, requiring assistance □ Time was close to normal but the process was unstable □ Time was within the normal range, completed with ease	Unable to complete: 0 points Partially completed: 1 point Independently completed but unstable: 2 points Completes with ease: 3 points
Rehabilitation Training Effectiveness	3. How would you rate your overall satisfaction with the current rehabilitation training?	□ Very dissatisfied □ Dissatisfied □ Satisfied □ Very satisfied	Very dissatisfied: 0 points Dissatisfied: 1 point Satisfied: 2 points Very satisfied: 3 points
	4. To what extent do you think rehabilitation training has improved the following aspects?	Finger flexibility □ No improvement □ Slight improvement □ Significant improvement □ Remarkable improvement Finger strength □ No improvement □ Slight improvement □ Significant improvement □ Remarkable improvement Daily activity ability □ No improvement □ Slight improvement □ Significant improvement □ Remarkable improvement	No improvement: 0 points Slight improvement: 1 point Significant improvement: 2 points Remarkable improvement: 3 points
	5. Are you willing to continue with the current rehabilitation training program?	□ Yes □ No	No: 0 points Yes: 3 points
Rehabilitation Training Experience	6. How satisfied are you with the current rehabilitation training?	□ Very dissatisfied □ Dissatisfied □ Satisfied □ Very satisfied	Very dissatisfied: 0 points Dissatisfied: 1 point Satisfied: 2 points Very satisfied: 3 points
	7. How do you feel about the intensity of the rehabilitation training?	□ Too light □ Appropriate □ Too heavy	Too light or too heavy: 0 points Appropriate: 3 points
	8. During rehabilitation training, have you experienced any discomfort or pain?	□ Always □ Frequently □ Occasionally □ Never	Always feel discomfort: 0 points Frequently feel discomfort: 1 point Occasionally feel discomfort: 2 points Never feel discomfort: 3 points
Challenges and Suggestions	9. What are the main difficulties you have encountered during the rehabilitation process? (Multiple choices allowed)	□ Lack of time □ Rehabilitation training is boring □ Difficulty using equipment □ Lack of professional guidance □ Other (please specify): ________	Multiple difficulties: 0 points One difficulty: 1 point No difficulties: 3 points
	10. What factors do you think are hindering your rehabilitation progress? (Multiple choices allowed)	□ Insufficient time □ Limited resources □ Personal health condition □ Other (please specify): ________	Multiple hindrances: 0 points One hindrance: 1 point No hindrance: 3 points
	11. What are your expectations or suggestions for future rehabilitation training?	Please describe: ________ Note: This is an open-ended question and is not scored.	Not scored

References

Hunan Daily. China Stroke Prevention and Treatment Report (2023): One person Dies of Stroke Every 28 Seconds in China; Early Identification and Prevention are Crucial. Tencent News. November 2023. Available online: https://new.qq.com/rain/a/20231104A041T600 (accessed on 1 January 2024).
Wang, Y.N.; Wu, S.M.; Liu, M. Trends and characteristics of stroke in China over 15 years. West China Med. J. 2021, 36, 803–807.
Tiwari, S.; Joshi, A.; Rai, N.; Satpathy, P. Impact of stroke on quality of life of stroke survivors and their caregivers: A qualitative study from India. J. Neurosci. Rural Pract. 2021, 12, 680–688.
Gao, P. Effects of community-based rehabilitation training focusing on active movement on daily living abilities in post-stroke patients during the recovery phase. Chin. J. Rehabil. Theory Pract. 2011, 17, 289–290.
Jiang, L.H.; Deng, C.Y.; Li, Z.J.; Zhang, L.L.; Lu, Q. Analysis of self-perceived burden and its influencing factors in stroke patients. Tianjin Nurs. J. 2019, 27, 514–517.
Pang, D.; Na, L. Survey on the burden of primary caregivers of community-based stroke patients. Chin. J. Nurs. 2005, 42, 49–51.
Wang, I.; Yen, S.-C.; Rahman, M.; Li, X.; Longwell-Grice, E.; Liu, C. Hierarchical properties and functional staging of the Fugl-Meyer Assessment Lower Extremity Scale. Arch. Phys. Med. Rehabil. 2022, 103, e114.
Woodbury, M.; Grattan, E.S.; Li, C.-Y. Development of a short form assessment combining the Fugl-Meyer Assessment–Upper Extremity and the Wolf Motor Function Test for evaluating stroke recovery. Arch. Phys. Med. Rehabil. 2023, 104, 1661–1668.
Zhang, Z.; Fang, Q.; Gu, X. Fuzzy inference system-based automatic Brunnstrom stage classification for upper-extremity rehabilitation. Expert Syst. Appl. 2014, 41, 1973–1980.
Chen, K.; Huang, X.; Zhang, Y.; Ai, Q. Research on rehabilitation assessment methods based on human gait and sEMG. Cogent Eng. 2016, 3, 1220113.
Zestas, O.N.; Soumis, D.N.; Kyriakou, K.D.; Seklou, K.; Tselikas, N.D. A computer-vision-based hand rehabilitation assessment suite. AEU Int. J. Electron. Commun. 2023, 169, 154762.
Jiang, Z.; Hu, P.; Cheng, R.; Wang, H.; Zhang, Q.; Ma, S.; Tsai, T.Y. Quantitative analysis of gait dysfunction in sarcopenia patients: Based on spatiotemporal parameters and kinematic performance. Gait Posture 2024, 118, 108–114.
Simis, M.; Sato, J.R.; Santos, K.; Fregni, F.; Battistella, L.R. Using functional near-infrared spectroscopy (FNIRS) to assess the effect of transcranial direct-current stimulation (TDCS) on spinal cord injury patients during robot-assisted gait. Ann. Phys. Rehabil. Med. 2018, 61, e80–e81.
Ramos-Murguialday, A.; Curado, M.R.; Broetz, D.; Yilmaz, Ö.; Brasil, F.L.; Liberati, G.; Garcia-Cossio, E.; Cho, W.; Caria, A.; Cohen, L.G.; et al. Brain-machine interface in chronic stroke: Randomized trial long-term follow-up. Neurorehabil. Neural Repair. 2019, 33, 188–198.
Bertolucci, F.; Lamola, G.; Fanciullacci, C.; Artoni, F.; Panarese, A.; Micera, S.; Chisari, C. EEG predicts upper limb motor improvement after robotic rehabilitation in chronic stroke patients. Ann. Phys. Rehabil. Med. 2018, 61, e200–e201.
Du, Y.; Shi, Y.; Ma, H.; Li, D.; Su, T.; Meidege, O.Z.; Wang, B.; Lu, X. Application of multi-dimensional intelligent visual quantitative assessment system to evaluate hand function rehabilitation in stroke patients. Brain Sci. 2022, 12, 1698.
Jo, S.; Song, Y.; Lee, Y.; Heo, S.H.; Jang, S.J.; Kim, Y.; Shin, J.H.; Jeong, J.; Park, H.S. Functional MRI assessment of brain activity during hand rehabilitation with an MR-compatible soft glove in chronic stroke patients: A preliminary study. In Proceedings of the 2023 International Conference on Rehabilitation Robotics (ICORR), Singapore, Singapore, 24–28 September 2023; pp. 1–6.
Nunna, B.; Parihar, P.; Wanjari, M.; Shetty, N.; Bora, N. High-resolution imaging insights into shoulder joint pain: A comprehensive review of ultrasound and magnetic resonance imaging (MRI). In Cureus; Springer: Berlin/Heidelberg, Germany, 2023.
Yu, J.; Luo, L.; Zhu, W.; Li, Y.; Xie, P.; Zhang, L. A novel low-pressure robotic glove based on CT-optimized finger joint kinematic model for long-term rehabilitation of stroke patients. IEEE Trans. Neural Syst. Rehabil. Eng. 2024, 32, 53–62.
Bai, J.; Li, G.; Lu, X.; Wen, X. Automatic rehabilitation assessment method of upper limb motor function based on posture and distribution force. Front. Neurosci. 2024, 18, 1362495.
Wang, H.; Chen, P.; Li, Y.; Sun, B.; Liao, Z.; Niu, B.; Niu, J. New rehabilitation assessment method of the end-effector finger rehabilitation robot based on multi-sensor source. Healthcare 2021, 9, 1251.
Liu, C.; Lu, J.; Yang, H.; Guo, K. Current state of robotics in hand rehabilitation after stroke: A systematic review. Appl. Sci. 2022, 12, 4540.
Mundt, M.; Colyer, S.; Wade, L.; Needham, L.; Evans, M.; Millett, E.; Alderson, J. Automating video-based two-dimensional motion analysis in sport? Implications for gait event detection, pose estimation, and performance parameter analysis. Scand. J. Med. Sci. Sports 2024, 34, e14693.
Sheng, B.; Lei, X.; Cheng, J.; Xie, Q.; Tao, J.; Chen, Y. Novel digital assessment system for upper-limb movement in stroke patients using markless-sensing technology and deep learning algorithms. J. Shanghai Jiaotong Univ. Sci. 2024, 7.
Ai, Q.; Liu, Z.; Meng, W.; Liu, Q.; Xie, S.Q. Machine learning in robot-assisted upper limb rehabilitation: A focused review. IEEE Trans. Cogn. Dev. Syst. 2023, 15, 2053–2063.
Hossain, D.; Scott, S.H.; Cluff, T.; Dukelow, S.P. The use of machine learning and deep learning techniques to assess proprioceptive impairments of the upper limb after stroke. J. Neuroeng. Rehabil. 2023, 20, 15.
Gyamerah, S.; Soori, G.T.; Korda, D.R.; Tawiah, J.K.; Akolgo, E.A.; Dapaah, E.O. Comparative analysis of feature extraction of high-dimensional data reduction using machine learning techniques. Am. J. Electr. Comput. Eng. 2023, 7, 374–383.
Lian, J.; Chen, T. Research on complex data mining analysis and pattern recognition based on deep learning. J. Comput. Electron. Inf. Manag. 2024, 12, 37–41.
Cranford, S. Getting DEEP with machine learning. Matter 2023, 6, 3113–3116.
Mitikhin, V.; Solokhina, T. Personalized assessment of the effectiveness of psychosocial rehabilitation: An innovative approach based on the process of analytical hierarchy. Eur. Psychiatr. 2024, 67, S189.
Demirsoy, M.S.; Gül, A.N.A. Respiratory analysis with electrocardiogram data: Evaluation of Pan-Tompkins algorithm and cubic curve interpolation method. Black Sea J. Eng. Sci. 2024, 7, 374–383.
Cheung, W.K.; Pakzad, A.; Mogulkoc, N.; Needleman, S.H.; Rangelov, B.; Gudmundsson, E.; Zhao, A.; Abbas, M.; McLaverty, D.; Asimakopoulos, D. Interpolation-split: A data-centric deep learning approach with big interpolated data to boost airway segmentation performance. J. Big Data 2024, 11, 104.
Mazumder, P.; Baruah, S. A hybrid model for predicting classification dataset based on random forest, support vector machine and artificial neural network. Int. J. Innov. Technol. Explor. Eng. 2023, 13, 19–25.
Baxani, R.; Edinburgh, M. Heart disease prediction using machine learning algorithms logistic regression, support vector machine and random forest classification techniques. In Support Vector Machine and Random Forest Classification Techniques; SSRN: New York, NY, USA, 2022.
Hua, W. Design and implementation of a finger rehabilitation device for stroke patients. In Proceedings of the 2024 17th International Convention on Rehabilitation Engineering and Assistive Technology (i-CREATe), Shanghai, China, 23–26 August 2024. [Google Scholar]
Anwary, A.R.; Yu, H.; Vassallo, M. Gait evaluation using Procrustes and Euclidean distance matrix analysis. IEEE J. Biomed. Health Inform. 2019, 23, 2021–2029. [Google Scholar] [CrossRef] [PubMed]
Cao, Y.; Li, B.; Li, Q.; Xie, J.; Cao, B.; Yu, S. Kinect-based gait analyses of patients with Parkinson’s disease, patients with stroke with hemiplegia, and healthy adults. CNS Neurosci. Ther. 2017, 23, 447–449. [Google Scholar] [CrossRef] [PubMed]
Xiang, K.; Wang, W.; Hou, Z.G.; Zhang, C.; Wang, J.; Shi, W.; Jiao, Y.; Lin, T. Muscle synergy analysis based on NMF for lower limb motor function assessment. In Proceedings of the 2022 IEEE International Conference on Robotics and Biomimetics (ROBIO), Xishuangbanna, China, 5–9 December 2022; pp. 2116–2121. [Google Scholar] [CrossRef]
Yun, I.; Jeung, J.; Song, Y.; Chung, Y. Non-invasive quantitative muscle fatigue estimation based on correlation between sEMG signal and muscle mass. IEEE Access 2020, 8, 191751–191757. [Google Scholar] [CrossRef]
Tutor, L.J.; Cai, Y. Monitoring rehabilitation of stroke patients using automated Fugl-Meyer assessment. Presented at the 15th International Conference on Applied Human Factors and Ergonomics (AHFE), Nice, France, 22–27 July 2024. [Google Scholar] [CrossRef]
Capecci, M.; Ceravolo, M.G.; Ferracuti, F.; Iarlori, S.; Kyrki, V.; Monteriu, A.; Romeo, L.; Verdini, F. A hidden semi-Markov model-based approach for rehabilitation exercise assessment. J. Biomed. Inform. 2018, 78, 1–11. [Google Scholar] [CrossRef]
Wang, S.; Wu, X.; Lai, W.; Yao, J.; Gou, X.; Ye, H.; Yi, J.; Cao, D. Rehabilitation evaluation method and application for upper limb post-stroke based on improved DTW. Biomed. Signal Process. Control 2024, 106, 107775. [Google Scholar] [CrossRef]
Bai, J.; Song, A. Development of a novel home-based multi-scene upper limb rehabilitation training and evaluation system for post-stroke patients. IEEE Access 2019, 7, 9667–9677. [Google Scholar] [CrossRef]
Ghafari, G.; Jaywant, A.; Campo, M.; Toglia, J.; O’Dell, M. Construct validity of the Stroke Upper-Limb Capacity Scale as a measure of upper extremity capacity. Arch. Phys. Med. Rehabil. 2022, 103, e105–e106. [Google Scholar] [CrossRef]
Zhu, Y.; Tang, Y. A class of rational quartic splines and their local tensor product extensions. Comput. Aided Des. 2023, 164, 103603. [Google Scholar] [CrossRef]
Ruiz-Moreno, E.; López-Ramos, L.M.; Beferull-Lozano, B. A trainable approach to zero-delay smoothing spline interpolation. IEEE Trans. Signal Process. 2023, 71, 4317–4329. [Google Scholar] [CrossRef]
Liu, S. SMOTE-LMKNN: A synthetic minority oversampling technique based on local means-based k-nearest neighbor. Int. J. Patt. Recogn. Artif. Intell. 2022, 36, 2250019. [Google Scholar] [CrossRef]
Zhang, Y.; Deng, L.; Huang, H.; Wei, B. An improved SMOTE based on center offset factor and synthesis strategy for imbalanced data classification. J. Supercomput. 2023, 80, 22479–22519. [Google Scholar] [CrossRef]

Figure 1. Intelligent rehabilitation assessment method for small-sample scenarios.

Figure 2. Experimental tasks: (1) drinking water, (2) page turning, (3) typing. For each task, (a) shows the experimental setup and (b) shows the actual experimental scene.

Figure 3. Confusion matrix of the neural network combined with the linear interpolation method.

Figure 4. ROC curve of the neural network combined with the linear interpolation method.

Table 1. Motion features, task scores, and rehabilitation matching value calculations.

Task	Finger	$F_{H}$	$F_{D}$	$A_{xH}$	$A_{xD}$	$A_{yH}$	$A_{yD}$	$T_{H}$	$T_{D}$	$M$	$M_{task}$	$M_{c}$
Drinking Water Task	Thumb	50	45	30	25	20	18	3	2	82.50%	82.66%	79.83%
	Index Finger	48	44	28	24	18	16	3	3	91.47%
	Middle Finger	46	42	26	22	16	14	3	1	74.00%
Typing Task	Thumb	52	47	32	27	22	19	3	0	65.28%	74.12%
	Index Finger	50	46	30	26	20	17	3	2	82.34%
	Middle Finger	48	44	28	24	18	16	3	1	74.75%
Page-Turning Task	Thumb	54	49	34	29	24	21	3	1	73.72%	82.70%
	Index Finger	52	48	32	28	22	19	3	3	91.55%
	Middle Finger	50	46	30	26	20	18	3	2	82.84%
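
The aggregated columns of Table 1 can be cross-checked numerically. The sketch below assumes that each task value $M_{task}$ is the unweighted mean of the three per-finger values $M$, and that $M_{c}$ is the unweighted mean of the three task values. This assumption reproduces the tabulated figures to two decimals, but it is inferred from the numbers only; any weighting scheme should be taken from the model definition in the paper. The dictionary and function names are illustrative.

```python
from statistics import mean

# Per-finger matching values M (%) copied from Table 1,
# in the order thumb, index finger, middle finger.
finger_m = {
    "Drinking Water": [82.50, 91.47, 74.00],
    "Typing": [65.28, 82.34, 74.75],
    "Page-Turning": [73.72, 91.55, 82.84],
}


def aggregate(values_by_task):
    """Return (per-task means, overall mean), assuming equal weights."""
    task_m = {task: mean(vals) for task, vals in values_by_task.items()}
    overall = mean(task_m.values())
    return task_m, overall


task_m, m_c = aggregate(finger_m)
for task, value in task_m.items():
    print(f"{task}: M_task = {value:.2f}%")
print(f"M_c = {m_c:.2f}%")
```

Averaging with equal weights treats every finger and every task as equally important; a clinical application might weight them differently.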

Table 2. Performance metrics of machine learning models.

Machine Learning Model	Accuracy	Precision	Recall	F1 Score
Random Forest	0.30	0.21	0.30	0.25
Neural Network	0.47	0.43	0.47	0.45
SVM	0.40	0.23	0.40	0.29
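
In Table 2 the recall equals the accuracy in every row, which is what support-weighted averaging over classes produces, since weighted recall reduces algebraically to accuracy. The sketch below shows how the four metrics could be computed under that assumption. The averaging mode is not confirmed here, and the labels and the function name `weighted_metrics` are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n         # class share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Hypothetical three-level labels, for illustration only.
y_true = ["low", "low", "mid", "mid", "mid", "high", "high", "high", "high", "low"]
y_pred = ["low", "mid", "mid", "mid", "high", "high", "high", "low", "mid", "low"]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```

With macro averaging, by contrast, recall and accuracy generally differ when class sizes are unequal.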

Table 3. Performance metrics of machine learning models after data augmentation.

Data Augmentation Method	Machine Learning Model	Accuracy	Precision	Recall	F1 Score	F1 Score (95% Confidence Interval)
SMOTE	Random Forest	0.72	0.66	0.69	0.68	0.68 ± 0.02
	Neural Network	0.75	0.70	0.73	0.71	0.71 ± 0.03
	SVM	0.74	0.68	0.71	0.69	0.69 ± 0.02
Cubic Spline Interpolation	Random Forest	0.74	0.69	0.71	0.70	0.70 ± 0.02
	Neural Network	0.78	0.73	0.76	0.75	0.75 ± 0.02
	SVM	0.76	0.71	0.74	0.73	0.73 ± 0.02
Linear Interpolation	Random Forest	0.75	0.70	0.73	0.72	0.72 ± 0.01
	Neural Network	0.80	0.75	0.78	0.77	0.77 ± 0.01
	SVM	0.78	0.72	0.76	0.74	0.74 ± 0.01
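
The last column of Table 3 reports each F1 score with a 95% confidence half-width. One common way to obtain such an interval is from repeated validation scores, sketched below with a normal approximation. This is an assumption, since this section does not state how the intervals were computed; the fold scores and the function name `mean_with_ci` are hypothetical and are not the paper's data.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(scores, level=0.95):
    """Return (mean, half-width) of a normal-approximation interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * stdev(scores) / sqrt(len(scores))
    return mean(scores), half_width


# Hypothetical F1 scores from five validation folds.
fold_f1 = [0.76, 0.78, 0.77, 0.75, 0.79]
center, half = mean_with_ci(fold_f1)
print(f"F1 = {center:.2f} ± {half:.2f}")
```

With only a few folds, the normal approximation tends to understate the uncertainty; a Student t quantile or a bootstrap would give a wider interval.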

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
