1. Introduction
Advances in computerized control have further improved the function of intelligent prostheses [1,2], which assist lower limb amputees with versatile activities, such as walking and climbing ramps [3]. The correct control mode setting provides torque for adjusting the joint impedance, and it depends on recognizing the user’s movement intent as accurately as possible [4,5,6]. To allow smooth transitions between locomotion tasks, neural control of computerized powered prostheses is required [7].
Previous studies have proposed addressing this challenge through EMG signals, a major neural control source widely used in clinical muscle disease diagnosis, rehabilitation engineering, and other fields [8,9,10]. EMG-based motion recognition directly reflects human action intention by capturing the activity of the skeletal muscles that control limb movement, and it offers high reliability and accuracy of action perception [11,12]. The surface EMG signal directly reflects the intensity of the relevant muscle activity. In addition, when the human body performs a series of actions continuously, there are intervals and transitions between the actions [13], which appear as significant changes in the surface EMG signal amplitude. This information can play a key role in segmenting the actions.
The EMG signals of the lower limb can reflect the motion state of the human lower limbs. It has been demonstrated that EMG signals can be used to classify locomotion modes, including level-ground walking and ascending and descending a ramp [14]. However, an algorithm for terrain identification based on EMG signals made only one decision per stride cycle, leading to a delay of one stride cycle in real-time use, which is inadequate for the safe control required in practical applications [15,16].
In practical research, too many repetitive EMG signals lead to signal redundancy, which in turn affects subsequent data processing and application. Conversely, EMG signals collected from too few or randomly selected lower limb muscles cannot accurately characterize human lower limb movements. Finding the muscle, or combination of muscles, that best reflects joint movement among the many lower limb muscles, and filtering out the superficial muscles that contribute most to the movement, are the key difficulties in current EMG research. The lower limb contains many muscles. In some cases a joint movement is accomplished by the cooperation of multiple muscles, and in others a single muscle is involved in multiple joint movements. An example is the gastrocnemius, a bi-articular muscle that flexes the knee and plantarflexes the ankle when contracted [17]. In addition, muscles lie at different depths. For instance, while both the soleus and gastrocnemius muscles contribute to ankle plantar flexion, the soleus lies deeper, beneath the gastrocnemius [18].
Most existing studies select the test muscles roughly, based on previous experience or theoretical knowledge of anatomy, without data to support the selection. In [19], EMG signals from four lower limb muscles were used to recognize five lower limb movements with a recognition rate of about 90%, but the rationale for the selection of the muscles was not elaborated. In [20], EMG signals from eight muscles of the bilateral lower limbs of 30 healthy young people were collected to study the statistics of normal gait. However, the role and changes of thigh muscles such as the tensor fasciae latae, semimembranosus, and semitendinosus were not mentioned. In [21], the lower limb muscles were divided into anterior and posterior muscle groups to study the characteristics of EMG signals in the gait cycle. The gastrocnemius and tibialis anterior muscles of the lower leg were found to change cyclically over the gait cycle: the EMG signal of the tibialis anterior peaked when the heel struck the ground, while that of the gastrocnemius peaked during full-foot contact. That study also analyzed the significant differences between bilateral muscles during movement, but discontinuous movements such as squatting/rising and sitting were not studied in depth.
How to select the most representative muscles among the many lower limb muscles, and how to accurately select the EMG signals from the data source, are important questions for further improving the accuracy of the recognition algorithm.
Meanwhile, because surface EMG signals are strongly non-stationary and random, the information that effectively describes the type of muscle activity is often deeply hidden in the acquired signals [6,7]. If all the data of the multi-channel active segment signals are used as the classifier’s input, achieving the ideal classification effect is difficult [22]. To improve the accuracy of motion classification, features that distinguish different actions need to be extracted, which also has the potential to reduce the computational cost.
The current methods for analyzing the EMG signals of the lower limb surface include time-domain analysis, frequency-domain analysis, and time-frequency domain analysis [13,23]. Existing studies show that time-domain features reflect the differences between movements, while frequency-domain features are more correlated with the degree of muscle fatigue. Traditional methods often directly extract the time-frequency domain eigenvalues of the signal [5], which loses the phase information that is critical for motion pattern recognition.
Motivated by the need for neural control of the prosthesis, most current research has investigated how accurately a variety of mobility tasks and human movement intentions can be identified using EMG signals recorded from leg muscles [24].
The Fourier transform can only act on convergent signals, and the window function of the short-time Fourier transform (STFT) is fixed. Although the window of the wavelet transform is variable, allowing multi-resolution analysis, its basis function is difficult to select. The S-transform integrates the advantages of both methods: it overcomes the fixed time width of the STFT window and adaptively adjusts the analysis time width according to frequency, providing intuitive time-frequency characteristics. Its resolution is adaptive, its inverse transform is lossless, and there is no need to select a window function or an analysis scale. The S-transform energy concentration method with multi-channel EMG signal fusion analysis was proposed to address the loss of the EMG signals’ phase information [1,25]. The recognition accuracy of this method is verified by experimentally comparing the SVM motion recognition results obtained with simple time-frequency domain features against those obtained with the S-transform energy concentration.
The main contributions are as follows: (1) The S-transform energy concentration method is used to extract features from the active segment EMG signals intercepted by endpoint detection. (2) The segmented S-transform energy concentration method decomposes the whole time-frequency concentration and uses the window width at the discrete points to improve time-frequency aggregation, which maximizes the retention of the signal’s phase information. (3) A muscle screening method based on correlation analysis is proposed to accurately select EMG signals from the data source, and multi-channel signal feature fusion analysis is used for motion pattern classification to improve motion recognition accuracy.
The paper is organized as follows: Section 2 introduces the principle of the S-transform and the proposed surface EMG decoding method, which is compared with the traditional method. Section 3 covers the correlation of multi-channel EMG signals, based on the raw sEMG signals of the lower limb movements of 10 subjects. Section 4 evaluates the performance of the proposed method using S-transform energy concentration, compared with the recognition accuracy based on simple time-frequency domain features. Discussions and conclusions are drawn in Section 5.
2. Methodology
The phase information of the sEMG signal cannot be neglected when identifying movement patterns [5], so a method based on the segmented S-transform energy concentration is proposed. The advantages of the method are verified by comparing its experimental recognition results with those of the traditional method, which relies on simple time-frequency domain features.
2.1. The Principle of S-Transform and SVM
Appropriate feature extraction methods that enhance sEMG signal features are critical because of the signals’ strong nonlinear characteristics [7,14]. Studies have shown the feasibility of applying the S-transform to the feature analysis of EMG signals, which can improve the accuracy of motion classification to a certain extent [1].
Time-frequency analysis techniques, such as the short-time Fourier transform (STFT) and wavelet transform (WT), enable the study of how signal frequency components vary over time. The key difference lies in the window function. STFT employs a fixed-size window, limiting its adaptability to non-stationary signals like surface electromyography (EMG). Conversely, WT employs variable-sized windows, using narrower windows for high frequencies and wider windows for low frequencies. However, WT often suffers from a trade-off between temporal and frequency resolution, hindering its application in EMG analysis.
For EMG signals, simple time-frequency domain features can be useful for single-motion recognition of a limb, but they often lose phase information, which limits their applicability to more complex motion pattern recognition. The S-transform inherits and develops the wavelet and short-time Fourier transforms: it eliminates the choice of window function and remedies the defect of a fixed window width [9]. It has the characteristics of the Fourier transform and relies on the window function to provide different resolutions at different frequencies [26].
The phase spectrum of each frequency component in the time-frequency representation keeps a direct connection with the original signal, so more feature quantities can be used. Meanwhile, the feature quantities extracted by the S-transform are insensitive to noise [
25,
27].
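Let x(t) be the time series signal. Its S-transform is given by Equation (1):

$$S(\tau, f) = \int_{-\infty}^{\infty} x(t)\, W(\tau - t, f)\, e^{-i 2 \pi f t}\, dt, \qquad W(\tau - t, f) = \frac{|f|}{\sqrt{2\pi}}\, e^{-\frac{(\tau - t)^{2} f^{2}}{2}} \tag{1}$$

where W(τ − t, f) represents the Gaussian window function, τ is its center, t and f denote time and frequency, respectively, and i is the imaginary unit.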
As seen from Equation (1), the S-transform uses a Gaussian window function whose height and width change with frequency. The window width is inversely proportional to frequency, so the transform has high frequency resolution in the low-frequency part and high time resolution in the high-frequency part. The S-transform is also losslessly invertible: no information is lost during the whole transform process [25].
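As a worked illustration of this frequency dependence, assume the Gaussian window has standard deviation σ(f) = 1/|f|, the usual choice for the S-transform. The two frequencies below are illustrative values only:

```latex
\sigma(f) = \frac{1}{|f|}, \qquad
\sigma(20\ \mathrm{Hz}) = 50\ \mathrm{ms}, \qquad
\sigma(300\ \mathrm{Hz}) \approx 3.3\ \mathrm{ms}
```

Under this assumption the window is 15 times narrower at 300 Hz than at 20 Hz, which is why time localization improves as frequency rises while frequency resolution is best at low frequencies.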
The discrete form of the S-transform is as follows.
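The discrete S-transform is usually evaluated in the frequency domain. The following is a minimal NumPy sketch of that standard FFT-based formulation, not the authors’ implementation; the function name `s_transform` and its arguments are assumptions made for illustration.

```python
import numpy as np


def s_transform(x, fs):
    """Discrete S-transform of a real 1-D signal, computed via the FFT.

    Returns (S, freqs). S[k, j] is the complex coefficient at frequency
    freqs[k] and time index j; the rows cover 0 .. fs/2.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    X = np.fft.fft(x)
    # Signed frequency index of every FFT bin: 0, 1, ..., -2, -1.
    m = np.fft.fftfreq(n) * n
    n_freqs = n // 2 + 1
    S = np.zeros((n_freqs, n), dtype=complex)
    # The zero-frequency row is defined as the mean of the signal.
    S[0, :] = x.mean()
    for k in range(1, n_freqs):
        # Fourier transform of a Gaussian whose time width scales as 1/f.
        gauss = np.exp(-2.0 * np.pi ** 2 * m ** 2 / k ** 2)
        # Shift the spectrum by k bins, weight it, and transform back.
        S[k, :] = np.fft.ifft(np.roll(X, -k) * gauss)
    freqs = np.arange(n_freqs) * fs / n
    return S, freqs
```

For a segment sampled at 1200 Hz, `S, freqs = s_transform(segment, fs=1200)` gives a complex matrix, so both magnitude and phase are available. Summing each row over time recovers the Fourier spectrum of the signal, which is the property behind the lossless inverse.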
The support vector machine (SVM) is a supervised learning algorithm that is widely used in pattern recognition, often to solve data classification problems. As depicted in Figure 1, the optimization objective involves determining both the orientation and location of the optimal separating hyperplane to maximize the classification margin. This binary classifier’s performance is fundamentally governed by the selection of support vectors.
In this paper, a multi-class classifier is needed to solve the multi-classification problem. Given m classes, a binary classifier is trained for every pair of classes, so the total number of binary classifiers is m(m − 1)/2. For example, four classes require six classifiers. A sample to be classified is predicted by all the classifiers, and its final class is decided by voting.
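The pairwise scheme can be sketched as follows. This is an illustrative outline only: `binary_predict` is a hypothetical callable standing in for a trained two-class SVM, and the toy nearest-centre rule and its centre values are invented for the demonstration.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(sample, classes, binary_predict):
    """Majority vote over all pairwise classifiers.

    binary_predict(sample, a, b) must return either a or b.
    Returns the winning class and the number of classifiers consulted.
    """
    pairs = list(combinations(classes, 2))  # m(m - 1)/2 class pairs
    votes = Counter(binary_predict(sample, a, b) for a, b in pairs)
    winner, _ = votes.most_common(1)[0]
    return winner, len(pairs)


# Toy stand-in for the trained SVMs: choose the class with the nearer centre.
centres = {"level walk": 0.0, "stair ascent": 1.0,
           "stair descent": 2.0, "crossing obstacles": 3.0}


def nearest_centre(sample, a, b):
    return a if abs(sample - centres[a]) <= abs(sample - centres[b]) else b


label, n_classifiers = one_vs_one_predict(2.2, list(centres), nearest_centre)
```

With four classes the sketch consults six pairwise classifiers. Ties in the vote are broken here by the order in which classes first receive a vote, which is a simplification.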
2.2. The Architecture of the Proposed Method
In this section, the S-transform energy concentration method is used to extract the features of EMG signals; the specific process is shown in Figure 2. First, the original EMG signals are preprocessed, and endpoint detection is used to intercept the useful (active) period of the preprocessed signals. Then the segmented S-transform and energy concentration are calculated for the signals within that period, and signal features of the specified dimension are extracted using the segmentation operation. Finally, motion classification is performed by SVM, and the multi-channel signal features are fused and analyzed to explore the motion recognition effect.
2.3. Signal Pre-Processing and Active Segment Detection
Surface EMG signals are susceptible to interference from other organs, multi-channel signal crosstalk, experimental instrument noise, and environmental noise during acquisition, so pre-processing is very important [28].
The pre-processing in this section consists of two steps. First, power-line (industrial frequency) interference is removed with a 50 Hz notch filter. Next, a 30 Hz zero-phase-shift high-pass filter is applied to remove motion artifacts [1,9]. The pre-processing results for the EMG signals of the rectus femoris are shown as an example in Figure 3. The zero-phase-shift filter retains the phase information of the original data sequence during pre-processing, which avoids the phase distortion introduced by a conventional single-pass filter.
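The zero-phase idea can be illustrated with forward-backward filtering: the signal is filtered once, reversed, filtered again, and reversed back, so the phase shifts of the two passes cancel. The sketch below assumes a first-order RC high-pass as a stand-in; the filter order and design actually used are not stated, and the function names are hypothetical.

```python
import numpy as np


def highpass_first_order(x, fs, fc):
    """Causal first-order high-pass filter (discretized RC circuit)."""
    rc = 1.0 / (2.0 * np.pi * fc)
    alpha = rc / (rc + 1.0 / fs)
    y = np.zeros(len(x))
    for n in range(1, len(x)):
        y[n] = alpha * (y[n - 1] + x[n] - x[n - 1])
    return y


def zero_phase(x, fs, fc):
    """Filter forward, then backward, so the two phase shifts cancel."""
    forward = highpass_first_order(np.asarray(x, dtype=float), fs, fc)
    backward = highpass_first_order(forward[::-1], fs, fc)
    return backward[::-1]
```

Calling `zero_phase(emg, fs=1200, fc=30)` removes a constant offset entirely and leaves the response to an impulse symmetric about the impulse position, which is the time-domain signature of zero phase shift.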
Taking the characteristics of the EMG signal and the pre-processing into consideration, we adopted a double threshold detection method based on short-time energy and short-time variance sum to carry out active segment detection; the process is shown in Figure 4. First, the initial minimum threshold on the short-time energy and the low threshold on the variance sum, which characterizes fluctuations, are set; both parameters are chosen according to the actual situation. Then an appropriate window length and window shift are defined to frame the signal.
The double threshold detection method segments the EMG data based on short-time energy (energy) and short-time variance sum (sum_s). The thresholds are determined as follows: (1) Initial thresholds: the initial values (energy = 0.02 and sum_s = 0.03) are chosen based on prior experience and analysis of typical EMG signals. (2) Dynamic adjustment: the maximum values of energy (max(energy_frame)) and variance sum (max(sum_s_frame)) are used as reference points for scaling the thresholds. (3) Scaling factors: the energy threshold is scaled by a factor of 0.02, setting it to 2% of the maximum energy. The variance sum thresholds are scaled by factors of 0.01 and 0.28, setting them to 1% and 28% of the maximum variance sum for start and endpoint detection, respectively.
These scaling factors were chosen experimentally to ensure robust detection across different signal characteristics.
The short-time energy and variance of each frame are calculated, and the selection conditions of the starting and ending points of the active segment are flexibly changed by adjusting the energy threshold, or the threshold can be adjusted by the neural network adaptively. The endpoint detection is performed by comparing the short-time energy value of each frame with the set energy threshold value to find the start and endpoint positions of the EMG signal activity segment.
The nth frame of the EMG signal is represented as x_n(m), and the frame length is N. Its short-time energy E_n is calculated as shown in (3), and its sum of the variances S_n as shown in (4).
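The energy branch of the detector can be sketched as below. This is a simplified illustration, assuming the short-time energy is the sum of squared samples in each frame and using the 2% scaling of the maximum energy described above; the variance-sum criterion is omitted, and the function names and the `ratio` parameter are hypothetical.

```python
import numpy as np


def short_time_energy(x, frame_len, frame_shift):
    """Sum of squared samples in each frame."""
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - frame_len + 1, frame_shift)
    return np.array([np.sum(x[s:s + frame_len] ** 2) for s in starts])


def detect_active_segment(x, frame_len, frame_shift, ratio=0.02):
    """Return (start, end) sample indices of the active segment.

    A frame is active when its energy exceeds ratio * max(energy).
    The end index is exclusive; None is returned if no frame is active.
    """
    energy = short_time_energy(x, frame_len, frame_shift)
    active = np.flatnonzero(energy > ratio * energy.max())
    if active.size == 0:
        return None
    start = int(active[0]) * frame_shift
    end = int(active[-1]) * frame_shift + frame_len
    return start, end
```

The returned indices can be used directly to slice the pre-processed signal, for example `segment = emg[start:end]`.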
Time-frequency analysis methods reflect how the frequency of a signal changes over time. From the time-frequency plots, one can observe the main frequency range and energy intensity of the signal. Several time-frequency analysis methods were used to analyze the time-frequency characteristics of the filtered EMG signal. The filtered signal is shown in
Figure 5a.
The short-time Fourier transform (STFT) involves windowing the signal and then applying the Fourier transform. The length of the window determines the time-frequency resolution of the plot, with a fixed window width throughout the analysis. It can be observed from
Figure 5b that STFT has relatively poor time-frequency characteristics, making it difficult to discern the primary frequency concentration of the signal. Consequently, STFT is not well-suited for analyzing electromyography (EMG) signals, as it does not provide adequate time-frequency resolution.
In contrast, the wavelet transform (WT) offers better time-frequency characteristics. Its time-frequency plot is clearer compared to that of STFT. Unlike STFT, where the window function is fixed, the wavelet transform adapts the window size according to frequency changes, as shown in
Figure 5c. This adaptive resolution makes WT more suitable for analyzing EMG signals, as it accommodates variations in signal waveform dynamics.
Among the time-frequency analysis methods examined, the advantages of the S-transform over STFT and WT can be seen in Figure 5. STFT demonstrates lower time-frequency resolution. Although the wavelet transform overcomes the limitation of a fixed window width in STFT, it approximates frequency through scales rather than providing true frequency resolution. The S-transform, shown in Figure 5d, offers even clearer time-frequency plots, effectively representing the primary components of EMG signals. With higher resolution, it captures the frequency and time characteristics of EMG signals more comprehensively. The results indicate that the S-transform provides superior time-frequency characteristics compared to STFT and WT.
2.4. S-Transformation and Energy Concentration Measure
Taking the rectus femoris in the crossing obstacles task as an example, assume that the data length of one intercepted action cycle is x. A sliding window with window length x/25 and step length x/25 is set, and the S-transform is performed on each segment of the signal. With the frequency interval set to 20 Hz, a time-frequency matrix of size 16 × x/25 is obtained. The window length is chosen to balance time resolution and frequency resolution: a shorter window captures transient features of the signal better but reduces frequency resolution. The step length equals the window length, so the window moves by its full length at each step. This non-overlapping approach simplifies the analysis.
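The non-overlapping windowing described above can be sketched in a few lines. The function name is hypothetical, and dropping trailing samples that do not fill a whole window is an assumption, since the handling of a remainder is not specified.

```python
import numpy as np


def split_into_segments(x, n_segments=25):
    """Cut one action cycle into equal, non-overlapping windows.

    Window length and step are both len(x) // n_segments. Trailing
    samples that do not fill a whole window are dropped (an assumption).
    """
    x = np.asarray(x)
    seg_len = len(x) // n_segments
    return x[:seg_len * n_segments].reshape(n_segments, seg_len)
```

For a cycle of 1800 samples this yields 25 rows of 72 samples each, and each row can then be transformed separately.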
The result of the segmented S-transform matrix is shown in Figure 6, in which darker blue indicates lower time-frequency energy and a gradual change to yellow indicates higher time-frequency energy. The energy is mainly concentrated between 100 Hz and 300 Hz. The 3D map provides the frequency, phase, and energy amplitude of the surface EMG signal of the corresponding action, but it also contains redundant information that is not conducive to action recognition. This redundancy of the surface electromyography (sEMG) signal takes two main forms: low-energy regions and high-frequency noise. The dark blue regions in the 3D plot (Figure 6) indicate areas with very low energy, which typically correspond to noise or insignificant signal components that carry no useful information for action recognition. Frequencies above a certain threshold (e.g., 300 Hz in this case) may contain noise or artifacts that are irrelevant to the muscle activity being analyzed.
Stanković et al. of the University of Montenegro proposed a measure of the time-frequency distribution concentration, as shown in (5). A new scheme to address the energy concentration of the S-transform in the time-frequency domain has been developed by optimizing the width of the window function used, which can lead to higher energy concentration, as shown in (6). The energy concentration measure we used is shown in (7), which decomposes the entire time-frequency concentration and makes full use of the window width at the discrete points to improve its time-frequency aggregation [
20,
28].
Here, CM represents the signal energy concentration, t and f denote time and frequency, respectively, and Sx(t, f) is the matrix of the signal S-transform. The tuning parameter adjusts the calculation of energy concentration, controlling its sensitivity to high-energy regions of the signal.
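To give a feel for how a concentration measure condenses a time-frequency matrix into one number, the sketch below uses a simple normalized inverse-L1 measure of the kind applied to the S-transform. It is an assumed stand-in, not necessarily the form of Equation (7), and it leaves out the tuning parameter; the function names are hypothetical.

```python
import numpy as np


def concentration_measure(S):
    """Inverse L1 norm of the energy-normalized magnitude of S.

    Higher values mean the energy sits in fewer time-frequency cells.
    """
    mag = np.abs(np.asarray(S))
    energy = np.sqrt(np.sum(mag ** 2))
    if energy == 0.0:
        return 0.0
    return 1.0 / np.sum(mag / energy)


def segment_features(segment_transforms):
    """One concentration value per segment, e.g. 25 values per cycle."""
    return np.array([concentration_measure(S) for S in segment_transforms])
```

A matrix with all its energy in one cell scores 1, while a flat 4 × 4 matrix scores 0.25, so the value falls as the energy spreads out.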
Taking the rectus femoris muscle as an example, the segmented S-transform energy concentration results for the four actions (level walk, stair ascent, stair descent, and crossing obstacles) are shown in Figure 7. A feature quantity of length 25 was retained from the original EMG signal, which better preserved the phase information of the signal and made the energy amplitude differences more obvious. The concentration measure is known to reduce the data length, which helps eliminate redundant features [27,28,29].
3. Experimental Preparation and EMG Signal Analysis
The segmented S-transform energy concentration method is used to analyze the human lower limb surface EMG signals and to verify its effectiveness. An experiment of lower limb EMG acquisition and decoding is conducted. We compared the differences in motion recognition effects based on simple time-frequency domain features and S-transform energy concentration feature extraction methods.
3.1. Experimental Setting
The lower limb is a complex movement system of multiple muscle-tendon units. In the actual locomotion tasks, any motion is completed under the action of multiple muscles at the same time. We focused on analyzing the identification of different motion patterns in the lower extremities, mainly involving the knee and ankle joints. The rectus femoris (RF), medial femoris (MF), biceps femoris (BF), semitendinosus (ST), tibialis anterior (TA), and medial gastrocnemius (MG) muscles were selected for the verification [
30].
Ten adults aged 22–25 years were recruited for this experiment, with a male-to-female ratio of 1:1. The subjects were all voluntary college students, in good health, with no history of lower limb disease in the past 6 months. All subjects provided informed consent, and the experimental procedures involving human subjects described in this paper were approved by the Institutional Review Board.
As shown in Figure 8b, the muscles of the right leg were selected for acquisition, and the EMG sensors (Figure 8a) were attached at the corresponding six lower limb muscle locations. One electrode was attached on the inner side of the ankle joint as the reference electrode, and one on the outer side of the ankle joint as the ground electrode [1,30].
In the experiment, the subjects were guided by start and end prompts; the process is shown in Figure 8c. Taking the level walk task as an example, the subject initially stands still, and the experiment starts with a computer tone telling the subject to prepare. Data collection starts at the same time. After 1 s, the system prompts the subject to walk, and the subject steps forward naturally. Data acquisition is terminated 5 s after the initiation tone. The system then prompts a stop, and the subject turns around and waits for the next trial to start. Each motion was performed in 3 groups of 5 repetitions, with a 3 min rest between experimental intervals. The EMG signals of all motion tasks were sampled at 1200 Hz.
In the EMG acquisition experiment, four types of motion were measured: level walk, stair ascent, stair descent, and crossing obstacles. We invited subjects with different degrees of muscle fitness to take part and guided them to move as naturally as possible, as shown in Figure 9.
3.2. Simple Time-Frequency Domain Features
Time-domain features mainly reflect the differences between movements, while frequency-domain features correlate more with the degree of muscle fatigue. Simple time-frequency domain eigenvalues extracted from EMG signals can be applied to single-motion recognition of a limb, for example, recognition of the flexion and extension movements of the human hip, knee, and ankle joints. Lower limb movements can be identified by analyzing the time-frequency domain features of the lower limb surface EMG signals while the joints perform different motion tasks.
The time-frequency domain features used for lower limb motion recognition include Mean Absolute Value (MAV), Waveform Length (WL), Root Mean Square (RMS), Slope Sign Change (SSC), Willison Amplitude (WAMP), Unbiased Standard Deviation (USTD), Shape Factor (S), Crest Factor (C), Impulse Factor (I), Margin Factor (L), Kurtosis Factor (KU), and Power Spectral Density (PSD). The specific formulas are presented in Equation (8).
In Equation (8), x_i is the i-th sample of the signal, N is the total number of samples in the signal, Threshold is a predefined threshold used to determine significant changes, x̄ is the mean value of the signal, RMS is the root mean square of the signal, MAV is the mean absolute value of the signal, Peak is the maximum absolute value of the signal, and f is frequency. The Margin Factor additionally uses the square of the mean of the square roots of the absolute values of the signal. The gait period is taken as 1.5 s, and the window length and window step are both 1800 sample points (1.5 s at the 1200 Hz sampling rate). The maximum value of muscle activation in each trial is selected as the maximum activation.
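A minimal sketch of the time-domain features listed above, using their commonly cited definitions. These may differ in detail from Equation (8), in particular in how the threshold enters SSC and WAMP, and the function name is hypothetical. Kurtosis and PSD are left out for brevity.

```python
import numpy as np


def time_domain_features(x, threshold):
    """Common time-domain EMG features of one analysis window."""
    x = np.asarray(x, dtype=float)
    diff = np.diff(x)
    mav = np.mean(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    # A slope sign change is a local extremum whose step on at least
    # one side is no smaller than the threshold.
    turning = diff[:-1] * diff[1:] < 0
    large = (np.abs(diff[:-1]) >= threshold) | (np.abs(diff[1:]) >= threshold)
    return {
        "MAV": mav,
        "WL": np.sum(np.abs(diff)),          # waveform length
        "RMS": rms,
        "SSC": int(np.sum(turning & large)),
        "WAMP": int(np.sum(np.abs(diff) >= threshold)),
        "USTD": np.std(x, ddof=1),           # unbiased standard deviation
        "S": rms / mav,                      # shape factor
        "C": peak / rms,                     # crest factor
        "I": peak / mav,                     # impulse factor
        "L": peak / np.mean(np.sqrt(np.abs(x))) ** 2,  # margin factor
    }
```

Applied to each 1800-sample window of each channel, the function returns one feature dictionary per window. An all-zero window would divide by zero and needs separate handling.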
Time-frequency domain features are extracted from the pre-processed, normalized signal matrix. The pre-processed EMG signals were passed through a 5 Hz zero-phase-shift low-pass filter, mainly to simulate the low-pass filter characteristics of muscles. The sEMG signal recorded at maximum voluntary contraction was pre-processed and low-pass filtered in the same way, and its maximum value was regarded as the signal at 100% muscle activation.
The EMG signals were pre-processed through a multi-stage pipeline: (1) high-pass filtering (5 Hz and 30 Hz cutoffs) to remove low-frequency noise and motion artifacts; (2) full-wave rectification and low-pass filtering (0.1 Hz cutoff) to extract the signal envelope; and (3) normalization using peak gait activation values (the maximum amplitude observed during locomotion trials) to ensure consistent scaling. This comprehensive processing approach maintained physiological relevance while optimizing signal quality for subsequent analysis.
Taking MAV as an example, as shown in Figure 10a, the distribution shows that the individual muscles are highly differentiated across the different gait phase modes. There is almost no overlap, which is conducive to classification by pattern recognition. Frequency-domain features are analyzed with PSD as an example, as shown in Figure 10b. The distribution of this feature is also differentiated, but there is some overlap, and different muscles fluctuate more across motion tasks. The extraction and analysis of different time-frequency domain eigenvalues of lower limb surface EMG signals can therefore be applied to lower limb motion recognition, with different effects.
3.3. EMG Signals Correlation Analysis
Since different muscles are activated differently during the same movement, Spearman’s correlation coefficient was computed to observe the phase correlation of activation in different muscles. It has certain advantages in evaluating trends and the degree of similarity between variables. Further analysis of multi-channel signal fusion is required to improve motion recognition accuracy, and the optimal multi-channel signal can be screened out based on Spearman’s correlation coefficient. The correlation analysis examines the degree of similarity between features extracted from the EMG signals, not between the raw variables; the features used are those extracted from the S-transform of the EMG signals.
To find the best combination of lower limb EMG for improving the accuracy of human lower limb motion recognition, we analyzed the correlation of multi-channel EMG signals. Their Spearman’s correlation coefficient can be seen in
Figure 11. The results are averages of multiple motion data from one of the 10 subjects. Then, we explored the difference in recognition effect between a single-channel signal and a multi-channel fusion signal.
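Spearman’s coefficient is the Pearson correlation of the ranks of two sequences. The sketch below computes it for every pair of channels with NumPy only; the function names and the layout of the feature array (one row per muscle) are assumptions for illustration.

```python
import numpy as np


def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v))
    ranks[order] = np.arange(1, len(v) + 1)
    for value in np.unique(v):
        tied = v == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(a, b):
    """Spearman's coefficient: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])


def spearman_matrix(features):
    """features: one row of feature values per muscle channel."""
    n = len(features)
    return np.array([[spearman(features[i], features[j]) for j in range(n)]
                     for i in range(n)])
```

With six channels, `spearman_matrix` returns a symmetric 6 × 6 matrix with ones on the diagonal, which is the kind of table a correlation heat map displays.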
In the level walk task, the similarity between the rectus femoris, medial femoris, semitendinosus, and tibialis anterior was higher, while the similarity between the biceps femoris, the medial gastrocnemius, and the other four muscle channels was lower. The lowest similarity coefficient is 0.3523, which occurs between the biceps femoris and medial gastrocnemius.
During stair ascent, the correlation between the muscles is generally similar to that during the level walk, but the difference in similarity between the biceps femoris and the medial gastrocnemius is more pronounced. The lowest similarity coefficient is 0.1977, occurring between the biceps femoris and the medial gastrocnemius. During stair descent, the correlations among the signals of the first five muscle channels are all higher, and the difference between the signal of the medial gastrocnemius and the other channels is greater. The lowest similarity coefficient is −0.2154, occurring between the medial gastrocnemius and the semitendinosus. During crossing obstacles, the correlation of the biceps femoris and semitendinosus signals with the other channels is lower, with the lowest similarity coefficient being 0.3846, occurring between the biceps femoris and the medial gastrocnemius. These values are similarity coefficients between muscle pairs during lower limb movements (stair ascent, descent, and crossing obstacles), quantifying the correlation in muscle activation patterns. Higher values indicate a stronger correlation, while lower or negative values suggest a weaker or inverse correlation.
When multiple muscles collaborate to execute a particular movement, they demonstrate similar contraction forces, degrees of contraction, and neuromuscular activation patterns. Consequently, the EMG signals generated by these muscles tend to be identical or analogous, and a single representative muscle suffices when characterizing the movement in terms of EMG signals. For instance, knee extension involves the quadriceps muscles of the thigh. The correlation coefficient between the rectus femoris and medial femoris is greater than 0.94, which suggests that these two muscles play the same role in the four movements. Selecting muscles with similar functions at the same time should be avoided to minimize information redundancy, so the rectus femoris alone can be the focus of study for knee extension movements.
These coefficients reveal variations in muscle activation patterns across movements, helping identify relevant muscle combinations for optimizing EMG-based motion recognition systems.
In the investigation of EMG signals, selecting muscles with the same functional roles can lead to signal redundancy, which hampers signal processing. Conversely, relying solely on a single muscle or muscle section may overlook crucial movement information, impeding accurate movement recognition. To address this, a correlation analysis can be performed on the lower limb muscles during movement to identify the most relevant muscles for subsequent EMG signal measurement and movement recognition.
A muscle screening method based on correlation analysis is therefore proposed to select the most representative muscles from the many lower limb muscles, so that EMG signals are acquired from the most informative sources, which helps to further improve the accuracy of motion recognition.
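The screening idea can be sketched in a few lines of Python. This is a minimal illustration and not necessarily the procedure used in this study: the greedy selection rule, the function names (`pearson`, `screen_muscles`), the use of 0.94 as a redundancy threshold, and the toy signal values are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def screen_muscles(signals, threshold=0.94):
    """Greedy screening (assumed rule): keep a channel only if its
    correlation with every already-kept channel is at or below the threshold.

    signals: dict mapping muscle name -> list of samples, in priority order.
    """
    kept = []
    for name, signal in signals.items():
        if all(pearson(signal, signals[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values for illustration only (not measured data).
toy = {
    "rectus_femoris": [0.1, 0.4, 0.9, 0.5, 0.2, 0.1],
    "medial_femoris": [0.12, 0.42, 0.88, 0.52, 0.18, 0.1],
    "medial_gastrocnemius": [0.8, 0.3, 0.1, 0.2, 0.6, 0.9],
}
print(screen_muscles(toy))
```

With these toy values the second channel is dropped as redundant with the first, while the third, which follows a different activation pattern, is kept.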
4. Results
After the raw EMG data were processed, two feature matrices were obtained for the lower limb surface EMG signals: the simple time-frequency domain feature matrix and the S-transformed energy concentration feature matrix. Both types of features were classified with an SVM implemented with the algorithm toolkit of MATLAB version 2021b, and the classification accuracies of the two methods were compared experimentally.
The results of the six-channel surface EMG signal identification based on simple time-frequency domain features (Method I) and that based on S-transformed energy concentration features (Method II) are shown in
Figure 12. The figure shows the distribution of the overall average accuracy of each individual sEMG channel across the four motions for the two methods.
The analysis showed that, with simple time-frequency domain features, the semitendinosus and rectus femoris muscles gave the best motion recognition: their classification accuracy was higher and more concentrated than that of the other channels. With S-transformed energy concentration features, considering the mean, minimum, and concentration of the accuracy together, the medial gastrocnemius and rectus femoris muscles performed best. Taking the results of both methods into account, the rectus femoris signal is the most appropriate choice if a single-channel muscle signal is used in practical applications.
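A channel comparison of this kind can be sketched as follows. This is only an illustrative reading of the criteria named above: "concentration" is interpreted here as a small standard deviation, the ranking order (mean, then minimum, then spread) is an assumption, and the accuracy values and channel names are invented placeholders rather than results from this study.

```python
import statistics


def summarize_channel(accuracies):
    """Mean, minimum, and spread (population standard deviation) of a
    channel's per-subject accuracies."""
    return {
        "mean": statistics.mean(accuracies),
        "min": min(accuracies),
        "spread": statistics.pstdev(accuracies),
    }


def rank_channels(acc_by_channel):
    """Rank channels by higher mean, then higher minimum, then smaller
    spread (assumed tie-breaking order)."""
    stats = {ch: summarize_channel(acc) for ch, acc in acc_by_channel.items()}
    return sorted(
        stats,
        key=lambda ch: (-stats[ch]["mean"], -stats[ch]["min"], stats[ch]["spread"]),
    )


# Placeholder accuracies in percent, for illustration only.
toy_accuracies = {
    "ch_a": [90, 92, 94],
    "ch_b": [70, 95, 99],
    "ch_c": [60, 65, 70],
}
print(rank_channels(toy_accuracies))
```

In this toy case `ch_b` reaches the highest single accuracy, but `ch_a` ranks first because its mean and minimum are higher and its values are more tightly grouped.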
The single-channel motion recognition results of the 10 subjects under the two methods are summarized in
Figure 13, which shows the accuracy of each of the six channel signals averaged across the four motions. Taking the rectus femoris results as an example, the S-transformed energy concentration method classified better than the simple time-frequency domain features for every subject, with a mean accuracy of 93.70% versus 80.71%. Across the six-channel classification results of all 10 subjects, the S-transform energy concentration method gave the better result in 91.67% of cases. This indicates that the S-transform energy concentration method recognizes motion patterns better than simple time-frequency domain features.
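The 91.67% figure can be checked with simple arithmetic. Assuming the percentage is taken over all subject-channel pairs (10 subjects × 6 channels = 60 paired comparisons, an assumption about how the statistic was formed), it corresponds to 55 of the 60 comparisons. The helper `win_rate` below is a hypothetical name used only for this sketch.

```python
def win_rate(acc_method_1, acc_method_2):
    """Percentage of paired results in which method 2 is strictly more
    accurate than method 1."""
    pairs = list(zip(acc_method_1, acc_method_2))
    wins = sum(1 for a1, a2 in pairs if a2 > a1)
    return 100.0 * wins / len(pairs)


# 10 subjects x 6 channels = 60 paired comparisons (assumed denominator).
n_pairs = 10 * 6
print(round(100.0 * 55 / n_pairs, 2))  # 91.67
```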
The preceding analysis shows that motion recognition based on the rectus femoris gives the better classification result. To further improve the accuracy, the multi-channel signals of the rectus femoris, biceps femoris, and medial gastrocnemius were fused. As shown in
Figure 14, the multi-channel fusion motion recognition results of the 10 subjects were compared with the single-channel results, where the EMG signal channel labeled FS denotes the fusion of the rectus femoris signal with the biceps femoris and medial gastrocnemius signals. These findings indicate that the multi-channel sEMG fusion approach outperformed single-channel classification, achieving higher recognition accuracy. The fusion strategy mitigated the limitations of individual muscle channels, making it the preferred method for motion recognition applications.
Let a, b, and c denote the fusion coefficients of the three channels. For the second subject (S2), the motion recognition classification accuracy reached 100% with a = 1, b = 0.2, and c = 0.5, compared with 85.71% using the single-channel rectus femoris signal. Similarly, multi-channel fusion gave higher classification accuracy for the first subject (S1) and the fourth subject (S4). For the third subject (S3), the classification accuracy reached 100% with a = 1, b = 0.2–0.7, and c = 0.3, compared with 96.97% using the single-channel rectus femoris signal. Fusing multiple muscle signals improved classification accuracy in most subjects (improvement range: +1.77% to +12.64%), with only one case (S10) showing slight degradation (−4.19 points). Potential causes of this degradation include the subject’s unique muscle synergy patterns, sensor misplacement (e.g., medial gastrocnemius electrode shift), and distinct biomechanical characteristics during movement execution.
The fusion of multi-channel EMG signal features can thus effectively improve the accuracy of motion recognition. However, the similarity between lower limb EMG channels differs from subject to subject, so the fusion coefficients of the multi-channel signal features must be calculated for each subject and appropriate channels selected for fusion. For example, the fusion coefficients of the EMG signal channels for S2 and S3 are different and should be set according to each subject's EMG correlation coefficients. Each parameter is selected with the optimization goal of the highest motion recognition accuracy. The fusion coefficients (a, b, c) were selected systematically within the [0, 1] range using a controlled variable methodology: starting at 0.1 and increasing in steps of 0.1, the parameter space was explored until the maximal classification accuracy was reached, which served as the stopping condition for the optimization process.
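The coefficient search can be illustrated with a short Python sketch. Several simplifications are assumed here and are not taken from the study: fusion is modeled as a weighted sum of equal-length feature vectors, the search is an exhaustive sweep of the 0.1-step grid rather than the one-variable-at-a-time procedure described above, the stopping condition is taken to be 100% accuracy, and `toy_evaluate` is a hypothetical stand-in for training and scoring the classifier on the fused features.

```python
import itertools


def fuse(rf, bf, mg, a, b, c):
    """Weighted sum of three equal-length feature vectors (assumed fusion rule)."""
    return [a * x + b * y + c * z for x, y, z in zip(rf, bf, mg)]


def grid_search(evaluate, step=0.1, stop_at=1.0):
    """Sweep a, b, c over {step, 2*step, ..., 1.0} and return the best
    coefficients; stop early once the accuracy reaches stop_at."""
    n = round(1.0 / step)
    grid = [round(step * i, 10) for i in range(1, n + 1)]
    best_acc, best_coeffs = -1.0, None
    for a, b, c in itertools.product(grid, repeat=3):
        acc = evaluate(a, b, c)
        if acc > best_acc:
            best_acc, best_coeffs = acc, (a, b, c)
        if best_acc >= stop_at:
            break  # stopping condition: target accuracy reached
    return best_coeffs, best_acc


def toy_evaluate(a, b, c):
    """Hypothetical accuracy surface used only to exercise the search.
    In practice this would fuse the features, train the classifier,
    and return its accuracy as a fraction between 0 and 1."""
    return 1.0 - (abs(a - 1.0) + abs(b - 0.2) + abs(c - 0.5)) / 3.0


print(fuse([1, 2], [10, 20], [100, 200], 1, 0.2, 0.5))
print(grid_search(toy_evaluate))
```

Because the optimum differs between subjects, such a search would be run once per subject, with `evaluate` bound to that subject's data.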
5. Discussion and Conclusions
5.1. Discussion
This paper explored the recognition of human motion intention by decoding the lower limb surface EMG signals. To make the prosthesis control more flexible, it is necessary to establish the mapping relationship between the lower limb surface EMG signals and the fine movements of the lower limbs, to summarize the regularity characteristics between different muscle channels, and to build a functional network of human lower limb muscles. In this way, the accuracy of lower limb motion intention recognition can be further improved, and the motion control mode of lower limb prosthesis can be enriched.
Ai et al. [
31] achieved 95% recognition accuracy for five lower limb movements by fusing time-domain and wavelet-based sEMG features with accelerometer distance features. While demonstrating effective multi-modal fusion, this approach has limitations, including the wavelet transform’s poor high-frequency resolution and lack of shift-invariance, along with reliance on multiple feature combinations. Too et al. [
31] developed a Pbest-guided binary particle swarm optimization (PBPSO) method for sEMG feature reduction in hand motion classification. Compared with four other algorithms, PBPSO achieved 90% feature reduction, but its classification accuracy remained below 80%.
The EMG signals of the lower limb muscles can reflect the motion state of the lower limbs. Extracting features that are distinguishable between different motion tasks can reduce the computational cost of the classification model and improve the accuracy of action classification. The proposed feature extraction method using S-transform energy concentration reduced the length of the EMG signal and eliminated its redundant features, which improved the accuracy and reliability of lower limb motion classification. To verify the advantages of the method proposed in this paper, we compared it with another feature extraction method. Although the validity of the proposed method is verified to some extent, the comparison methods are limited.
In the experimental design, the signals from both legs of healthy participants were recorded simultaneously. Using motion information from one leg only gave the expected motion recognition performance, which is consistent with the situation of limb loss in lower-limb amputees [
5,
32].
While surface electromyography (sEMG) remains the predominant input modality for motion intent classification [
6], its standalone performance exhibits inherent limitations. Recent studies demonstrate that multimodal fusion with inertial measurement unit (IMU) data significantly enhances recognition accuracy through complementary kinematic information [
5].
In future work, different feature extraction methods and classification algorithms can be compared, and additional motion signal sources can be included to improve the diversity, accuracy, and practicality of lower limb motion recognition, with the aim of advancing intelligent lower limb prostheses.
5.2. Conclusions
In this paper, we analyzed the application of lower limb surface EMG signal decoding to motion recognition for lower limb prostheses and proposed a surface EMG decoding method using S-transform energy concentration. Motion recognition was evaluated experimentally using six-channel lower limb EMG signal features from 10 subjects in four motion tasks. The results showed that the proposed method effectively improves classification accuracy compared with simple time-frequency domain statistical feature analysis. Moreover, fusing the channel features of the rectus femoris, biceps femoris, and medial gastrocnemius further improves motion recognition accuracy.
For future research, some possible improvements to the proposed work should be considered. More types of lower limb motion can be included to explore the fusion coefficients of multi-channel EMG signals, which would benefit the accuracy and practicality of motion recognition and advance the application of high-performance lower-limb prostheses.