6.1.2. Performance Metrics

Several performance metrics were selected to evaluate the robustness of the proposed method against existing work. We divide these metrics into a few groups. The first group consists of error evaluations, namely the standard root mean square error (*RMSE*) and mean absolute error (*MAE*). The formulations for both errors are as follows:

$$RMSE = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} \left( Q\_i^{test} - Q\_i^B \right)^2} \tag{16}$$

$$MAE = \frac{1}{n} \sum\_{i=1}^{n} \left| Q\_i^{test} - Q\_i^B \right| \tag{17}$$
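As an illustrative sketch, Eqs. (16) and (17) can be computed directly from the bipolar states; the variable names `q_test` and `q_b` are our own, denoting *Q<sub>i</sub><sup>test</sup>* and *Q<sub>i</sub><sup>B</sup>* respectively:

```python
import math

def rmse(q_test, q_b):
    """Eq. (16): root mean square error between test states and induced states."""
    n = len(q_test)
    return math.sqrt(sum((t - b) ** 2 for t, b in zip(q_test, q_b)) / n)

def mae(q_test, q_b):
    """Eq. (17): mean absolute error between test states and induced states."""
    n = len(q_test)
    return sum(abs(t - b) for t, b in zip(q_test, q_b)) / n

# Bipolar states in {-1, 1}: each mismatch contributes (±2)^2 = 4 to the
# squared error and |±2| = 2 to the absolute error.
q_test = [1, -1, 1, 1, -1]
q_b    = [1, -1, -1, 1, -1]  # one mismatch out of five states
print(rmse(q_test, q_b))  # sqrt(4/5) ≈ 0.8944
print(mae(q_test, q_b))   # 2/5 = 0.4
```

Because the states are bipolar, both errors reach zero exactly when the induced logic reproduces every test state.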

where *Q<sub>i</sub><sup>test</sup>* ∈ {−1, 1} is the state of the test data and *Q<sub>i</sub><sup>B</sup>* is the state induced by the logic mining model. In detail, the best logic mining model will produce the *Q<sub>i</sub><sup>B</sup>* with the lowest error evaluation. Next, standard classification metrics, namely accuracy, *F-score*, precision, and sensitivity, will be utilized in the experiment. According to [35], the sensitivity metric *Se* analyses how well the model correctly produces a positive result for an instance that actually has the condition of interest. Note that *TP* (true positive) is the number of positive instances correctly classified, *FN* (false negative) is the number of positive instances incorrectly classified as negative, *TN* (true negative) is the number of negative instances correctly classified, and *FP* (false positive) is the number of negative instances incorrectly classified as positive.

$$Se = \frac{TP}{TP + FN} \tag{18}$$

Meanwhile, precision is utilized to measure the algorithm's predictive ability: of the instances predicted positive, it measures how many are actually positive. The calculation for precision (*Pr*) is defined as follows:

$$Pr = \frac{TP}{TP + FP} \tag{19}$$

Accuracy (*Acc*) is the most common metric for determining classification performance. It measures the proportion of instances categorized correctly:

$$Acc = \frac{TP + TN}{TP + TN + FP + FN} \tag{20}$$
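Eqs. (18)–(20) follow directly from the four confusion-matrix counts; a minimal sketch with hypothetical counts chosen for illustration:

```python
def sensitivity(tp, fn):
    """Eq. (18): fraction of actual positives correctly identified."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Eq. (19): fraction of predicted positives that are actually positive."""
    return tp / (tp + fp)

def accuracy(tp, tn, fp, fn):
    """Eq. (20): fraction of all instances classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical confusion-matrix counts (not from the experiments in this work).
tp, tn, fp, fn = 40, 30, 10, 20
print(sensitivity(tp, fn))       # 40/60 ≈ 0.6667
print(precision(tp, fp))         # 40/50 = 0.8
print(accuracy(tp, tn, fp, fn))  # 70/100 = 0.7
```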

As stated by [36], the *F-score* reflects the probability of a correct result and explicitly represents the ability of the algorithm; it is defined as the harmonic mean of precision and sensitivity. Next, the Matthews correlation coefficient (*MCC*) will be used to examine the performance of the logic mining based on the eight major derived ratios obtained from the combination of all components of a confusion matrix. *MCC* is regarded as a good metric of global model quality and remains reliable even when the classes are of different sizes [37].

$$F\text{ Score} = \frac{2TP}{2TP + FP + FN} \tag{21}$$

$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{22}$$
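Eqs. (21) and (22) can be sketched in the same way; the counts are again hypothetical, and the F-score form in Eq. (21) is algebraically equal to the harmonic mean of precision and sensitivity:

```python
import math

def f_score(tp, fp, fn):
    """Eq. (21): F-score, the harmonic mean of precision and sensitivity."""
    return 2 * tp / (2 * tp + fp + fn)

def mcc(tp, tn, fp, fn):
    """Eq. (22): Matthews correlation coefficient, ranging over [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom

# Hypothetical confusion-matrix counts (illustration only).
tp, tn, fp, fn = 40, 30, 10, 20
print(f_score(tp, fp, fn))  # 80/110 ≈ 0.7273
print(mcc(tp, tn, fp, fn))  # 1000/sqrt(6,000,000) ≈ 0.4082
```

With these counts, precision is 0.8 and sensitivity is 2/3, whose harmonic mean 2(0.8)(2/3)/(0.8 + 2/3) ≈ 0.7273 matches Eq. (21), while *MCC* ≈ 0.41 indicates only moderate agreement despite an accuracy of 0.7.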

It is worth mentioning that this is our first attempt to evaluate logic mining with such a range of performance metrics; in [20,22], the only metrics used were accuracy and testing error.
