**4. Discussion**

In this paper, a novel estimation method, RFPCA, was proposed to study the relationship between sEMG and knee movement. Compared with BPPCA, RFPCA performs better in terms of both root mean square error and execution time. All of the estimation results obtained with RFPCA are also generally in line with the EV. These results may be due to the strong regression ability of RF, which generates an internal unbiased estimate of the generalization error as the forest is built. PCA also generates a better input for RF from the original data, which further improves the accuracy of the estimation.

As seen in Figure 7, as the number of input samples increases, *R* first decreases and eventually stabilizes, which means that for both methods the prediction accuracy increases at first and then does not change significantly. Walking is known to be a regular movement: the kinematic parameters of gait are cyclic, and the sEMG also appears as a periodic signal across different GCs. Thus, we believe that once the sample size reaches a certain value, the differences between samples decrease, so the prediction results show little change. However, the larger the data set, the longer the computation time. Provided the accuracy is acceptable, choosing an appropriate sample size can effectively reduce the time consumption caused by large samples, and this would contribute to the efficiency of online control using sEMG.

According to the authors of [19], the history of the input has a positive influence on motion estimation. In this work, as the input size increases, the *R* of both RFPCA and BPPCA tends to increase, as shown in Figure 8, and the previous signals seem to have little to no effect on the estimation once higher-dimensional data are involved in the calculation. In general, with high-dimensional input, sparse data samples and the difficulty of distance calculation are a common and serious obstacle for all machine learning methods, known as the "curse of dimensionality". Thus, beyond the PCA used in our work, the input dimension of sEMG needs further study. Multichannel sEMG sensing is also worthy of research.
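To make the link between signal history and input dimension concrete, the following minimal sketch builds input vectors from the current sample plus a number of previous samples. The windowing scheme, the function name `build_lagged_inputs`, the parameter `n_prev`, and the sample values are all hypothetical and chosen for illustration; the exact input construction used in this work is not reproduced here.

```python
from typing import List, Sequence


def build_lagged_inputs(signal: Sequence[float], n_prev: int) -> List[List[float]]:
    """Return one input vector per time step t >= n_prev.

    Each vector is [x[t - n_prev], ..., x[t - 1], x[t]], so its length
    (the input dimension) is n_prev + 1.
    """
    if n_prev < 0:
        raise ValueError("n_prev must be non-negative")
    return [list(signal[t - n_prev : t + 1]) for t in range(n_prev, len(signal))]


# Made-up single-channel feature values, for illustration only.
semg_feature = [0.1, 0.4, 0.3, 0.8, 0.5]

for n_prev in (0, 2):
    rows = build_lagged_inputs(semg_feature, n_prev)
    # More history gives a higher input dimension and fewer usable samples.
    print(f"n_prev={n_prev}: {len(rows)} samples, dimension {len(rows[0])}")
```

Under this assumed scheme, each additional previous sample adds one input dimension per channel while removing one usable training sample, which is one way the sparsity problem described above can arise.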

As seen in Figure 9, the prediction results vary between subjects, who differ in physical and mental condition. In addition, skin preparation varies from one subject to another, which can also contribute to detection errors in the raw sEMG. Moreover, differences in the time of the experiment and other environmental factors will also cause variation in the outcome when the raw sEMG is collected. However, both Figures 9 and 10 show that the results of BPPCA are relatively more unstable across all participants, whereas the estimations of RFPCA remain similar to the EV. That is, RFPCA has better error tolerance and is not adversely affected by variations in test subjects.

The root mean square error of RFPCA was small (*R* ≈ 5°) in the previous sEMG study when *n* = 2, as shown in Figure 10, although this result may still seem large in comparison to the motion angle of the knee. However, we believe that this is acceptable for exoskeleton control, since the control precision required of an exoskeleton joint is not as high as that of a machine tool, and the pilot of the exoskeleton is able to tolerate differences of several degrees owing to the flexibility of the human body. In addition, the estimation model using RFPCA has several hyperparameters in the structure of the forest. Therefore, a better parameter selection may lead to a more accurate RFPCA estimation model, which would also help the application of RFPCA to the estimation of joint movement in myoelectric control. Furthermore, while RFPCA performed better in our study, BPPCA may have advantages over RFPCA with a better parameter choice.
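As a rough illustration of how an error of several degrees compares with the knee motion angle, the sketch below computes the root mean square error between estimated and measured angles and expresses it as a fraction of the measured range. The helper names `rmse` and `rmse_relative_to_range` and the angle values are hypothetical; the relative measure is an illustrative addition, not a metric reported in this work.

```python
import math
from typing import Sequence


def rmse(estimated: Sequence[float], measured: Sequence[float]) -> float:
    """Root mean square error between two equally long angle series (degrees)."""
    if len(estimated) != len(measured) or len(estimated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_relative_to_range(estimated: Sequence[float], measured: Sequence[float]) -> float:
    """RMSE divided by the range (max - min) of the measured angles."""
    span = max(measured) - min(measured)
    if span == 0:
        raise ValueError("measured angles have zero range")
    return rmse(estimated, measured) / span


# Made-up knee angles in degrees over part of a gait cycle.
measured_deg = [5.0, 20.0, 60.0, 35.0, 10.0]
estimated_deg = [8.0, 16.0, 60.0, 40.0, 10.0]

print(f"RMSE: {rmse(estimated_deg, measured_deg):.2f} deg")
print(f"Relative to range: {rmse_relative_to_range(estimated_deg, measured_deg):.1%}")
```

Reporting the error alongside the range of motion makes it easier to judge whether a given *R* is tolerable for a particular joint and task.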

Since the aim of this work is to use ML to build an estimation model that can be adjusted to suit the individual subject, the training and testing data come from the same subject for each validation; a similar method can be found in [30]. The results obtained with RFPCA for the subjects in this work were acceptable. However, sEMG is unstable, as mentioned above, and the RFPCA method may not be feasible for middle-aged and elderly subjects, since the subjects in our study were all under 30 years of age. Using data from different people for training and testing may also be an issue for the RFPCA method. Thus, more people will be invited to participate in the study to further validate the proposed method.
