*2.6. Evaluation Metrics*

To evaluate how well the feature extraction methods characterize multiple classes of movement intent in the context of an EMG-PR system, four metrics were utilized, as described below.

1. The commonly applied metric known as classification error (CE), which represents the percentage of incorrectly classified samples out of all testing samples (Equation (1)), was utilized to evaluate the classification performance of the feature extraction methods:

$$\text{CE} = \frac{\text{Number of incorrectly classified samples}}{\text{Total number of testing samples}} \times 100\tag{1}$$
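
As an illustration of Equation (1), the following minimal sketch computes CE from two label sequences. The function name `classification_error` and the movement labels are hypothetical and serve only as an example; they are not taken from the study.

```python
def classification_error(y_true, y_pred):
    """Return the classification error (CE) in percent, following Equation (1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one testing sample is required")
    # Count the testing samples whose predicted class differs from the true class.
    n_wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return 100.0 * n_wrong / len(y_true)


# Hypothetical labels for five testing samples (illustrative only).
y_true = ["hand_open", "hand_close", "rest", "rest", "wrist_flex"]
y_pred = ["hand_open", "rest", "rest", "rest", "wrist_flex"]
ce = classification_error(y_true, y_pred)  # one of five samples is wrong
print(f"CE = {ce:.1f}%")
```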

2. The F1\_score was utilized to further validate the performance of the extracted feature sets. This metric was computed as the harmonic mean of precision and recall (Equations (2) and (3)) [37]. Basically, the F1\_score reveals how well the classifier classifies the data points of a particular feature set compared to the others:

$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}} \quad \text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}} \tag{2}$$

$$\text{F1}\_{\text{score}} = \frac{2 \ast \text{Recall} \ast \text{Precision}}{\text{Recall} + \text{Precision}} \tag{3}$$

where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives obtained from a confusion matrix. It is worth noting that the F1\_score reaches its best value at 1 and its worst at 0.
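
Equations (2) and (3) can be sketched for a multi-class confusion matrix as follows, treating each class in turn as the positive class. This one-vs-rest reading, the function name `per_class_f1`, and the example matrix are assumptions made for illustration; the text does not state how the per-class scores were aggregated, so no averaging is performed here.

```python
def per_class_f1(confusion):
    """Return (precision, recall, f1) for each class of a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * recall * precision / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
scores = per_class_f1(confusion)
for k, (p, r, f1) in enumerate(scores):
    print(f"class {k}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```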

3. In principle, the computation time of a feature set influences the response time of the microprocessor-based controller embedded in the prosthesis socket [15]. In this regard, the computation time of each extracted feature set presented in Table 2 was investigated by adopting the formula in Equation (4), which was proposed by Weir and Farrell [15]:

$$D = \frac{1}{2}W\_L + \frac{1}{2}W\_{inc} + P\_T \tag{4}$$

where *D* is the delay, *W<sub>L</sub>* is the window length, *W<sub>inc</sub>* is the window increment, and *P<sub>T</sub>* is the signal processing time. It should be noted that the system utilized for this study ran the 64-bit Microsoft Windows 7 Professional operating system on an Intel(R) Core(TM) i7 processor with a clock speed of 3.6 GHz and 8 GB of random access memory.
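
A minimal sketch of how Equation (4) could be evaluated is given below. The window length (150 ms), window increment (100 ms), and the mean-absolute-value feature used to time the processing step are hypothetical stand-ins chosen for illustration; they are not the settings or the features of this study, and the measured processing time depends on the machine.

```python
import time


def controller_delay(window_length_ms, window_increment_ms, processing_time_ms):
    """Delay D from Equation (4): D = W_L / 2 + W_inc / 2 + P_T (all in ms)."""
    return 0.5 * window_length_ms + 0.5 * window_increment_ms + processing_time_ms


def mean_processing_time_ms(feature_fn, window, repeats=100):
    """Average wall-clock time (ms) that feature_fn takes on one analysis window."""
    start = time.perf_counter()
    for _ in range(repeats):
        feature_fn(window)
    return 1000.0 * (time.perf_counter() - start) / repeats


def mean_absolute_value(window):
    """Stand-in feature used only to have something to time."""
    return sum(abs(x) for x in window) / len(window)


# Synthetic 150-sample window (illustrative values, not recorded EMG).
window = [0.01 * ((i % 7) - 3) for i in range(150)]
p_t = mean_processing_time_ms(mean_absolute_value, window)
delay = controller_delay(150.0, 100.0, p_t)

# Worked example with a fixed processing time of 10 ms: 75 + 50 + 10 = 135 ms.
example = controller_delay(150.0, 100.0, 10.0)
print(f"measured P_T = {p_t:.4f} ms, D = {delay:.2f} ms, worked example D = {example} ms")
```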

4. In the context of an EMG-based pattern recognition system, a feature extraction method would normally be affected by unwanted disturbances that may degrade the decoding of the user's intended limb movement. Therefore, it is important to quantify the robustness of a feature in order to guarantee that it remains stable in real-life applications. In this regard, the stability index (*S<sub>Index</sub>*) metric adopted in a previous study [38], which is defined by Equation (5), was applied to examine the robustness of the feature extraction methods in the presence of noise:

$$S\_{\text{Index}} = \frac{\frac{1}{N} \sum\_{i=1}^{N} \text{CA}\_i}{\left[ \frac{1}{N-1} \sum\_{i=1}^{N} \left( \text{CA}\_i - \frac{1}{N} \sum\_{j=1}^{N} \text{CA}\_j \right)^2 \right]^{\frac{\alpha}{2}}} \tag{5}$$

where the numerator is the average classification performance, the denominator is the scaled standard deviation (the standard deviation raised to the power α), α is the scaling value, which is set to 0.1, and N is the sample size. The value of α was chosen after many trials.
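
Equation (5) can be sketched as follows. Here CA<sub>i</sub> is read as the classification accuracy obtained in the i-th evaluation, which is an assumption, since the text does not define it explicitly. The function name `stability_index` and the accuracy values are hypothetical.

```python
import statistics


def stability_index(accuracies, alpha=0.1):
    """Stability index from Equation (5): mean(CA) / (sample variance of CA) ** (alpha / 2)."""
    if len(accuracies) < 2:
        raise ValueError("at least two accuracy values are needed for the sample variance")
    mean_ca = statistics.mean(accuracies)
    variance = statistics.variance(accuracies)  # sample variance, divides by N - 1
    if variance == 0:
        raise ValueError("the index is undefined when all accuracy values are identical")
    # Raising the variance to alpha / 2 equals raising the standard deviation to alpha.
    return mean_ca / variance ** (alpha / 2)


# Hypothetical classification accuracies (%) from three evaluations.
accuracies = [90.0, 92.0, 94.0]
s_index = stability_index(accuracies)  # mean = 92, sample variance = 4
print(f"S_Index = {s_index:.2f}")
```

With a small α such as 0.1, the denominator changes slowly with the spread of the accuracies, so the index is dominated by the mean accuracy and is only mildly reduced by variability.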
