#### 3.2.3. Mutation

To avoid premature convergence to a local optimum, mutation is performed with a very small probability, which was 1% in the present work. In mutation, one gene is selected at random from the newly generated offspring and replaced with one drawn from the search space, which in the present study consisted of the 77 features. During replacement, a check ensures that the newly added gene does not duplicate a gene already present in the offspring, which avoids redundant features. An example of the mutation process is illustrated in Figure 6.

**Figure 6.** Mutation operation and its effects in generation of offspring.
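The mutation step described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes a chromosome is a list of unique feature indices in the range 0 to 76, and the names `mutate`, `N_FEATURES`, and `MUTATION_RATE` are hypothetical. Instead of drawing a gene and then checking it, the sketch draws only from features not yet in the offspring, which has the same effect as the duplicate check.

```python
import random

N_FEATURES = 77        # size of the search space, as stated in the text
MUTATION_RATE = 0.01   # 1% mutation probability, as stated in the text


def mutate(offspring, n_features=N_FEATURES, rate=MUTATION_RATE, rng=random):
    """Return a copy of `offspring` with at most one gene replaced.

    Assumption: `offspring` is a list of unique feature indices.
    """
    child = list(offspring)
    if rng.random() >= rate:
        return child  # mutation is not triggered for this offspring

    # Only features absent from the offspring are eligible,
    # so the new gene can never duplicate an existing one.
    present = set(child)
    candidates = [f for f in range(n_features) if f not in present]
    if not child or not candidates:
        return child  # nothing to replace, or no unused feature left

    position = rng.randrange(len(child))      # gene selected at random
    child[position] = rng.choice(candidates)  # replacement from the search space
    return child
```

Calling `mutate(parent, rate=1.0)` forces a mutation, which is convenient for checking that exactly one gene changes and that no feature appears twice.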

#### 3.2.4. Improved Fitness Function

The accuracy of classifying the test dataset was used to assess chromosome quality, so the fitness function was expressed in terms of accuracy. Accuracy is the sum of true positive (*TP*) and true negative (*TN*) results divided by the sum of true positive (*TP*), true negative (*TN*), false negative (*FN*), and false positive (*FP*) results, commonly reported as a percentage. It has been adopted by several researchers as a fitness function (as mentioned in the previous section) to evaluate GA performance. Accuracy was calculated per the following formula:

$$Accuracy = \frac{TP + TN}{TP + FN + FP + TN} \tag{1}$$

Accuracy cannot serve as the sole metric for evaluating a system, since it is misleading when the dataset is not balanced (i.e., the negative and positive classes contain different numbers of data instances). Hence, the fitness function proposed in this work was developed using several measures: F1-score, accuracy, and true positive rate (TPR). F1-score is a metric that considers both precision and recall and is defined as follows:

$$F1\ \text{score} = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{2}$$

where precision is the fraction of true positive results among all retrieved results (that is, all samples predicted as positive) and is calculated as follows:

$$Precision = \frac{TP}{TP + FP} \tag{3}$$

On the other hand, recall, or true positive rate (TPR), is the fraction of true positive results among all actual positive samples in the dataset and is calculated as follows:

$$Recall(true\ positive\ rate) = \frac{TP}{TP + FN} \tag{4}$$
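Equations (1) to (4) can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only: the function name `classification_metrics`, the convention of returning 0.0 when a denominator is zero, and the example counts are assumptions, not taken from the study. The example uses a hypothetical imbalanced test set to show why accuracy alone can be misleading.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall (TPR), and F1-score from counts.

    Assumption: a ratio with a zero denominator is reported as 0.0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0               # Eq. (1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0             # Eq. (3)
    recall = tp / (tp + fn) if (tp + fn) else 0.0                # Eq. (4)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0        # Eq. (2)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced test set: 990 negative and 10 positive samples.
# The classifier finds only 1 of the 10 positives, yet accuracy is 0.99,
# while recall is 0.1 and the F1-score is about 0.167.
metrics = classification_metrics(tp=1, tn=989, fp=1, fn=9)
```

In this example a classifier that misses nine of ten positive samples still scores 99% accuracy, which is the weakness the F1-score and TPR terms are meant to offset.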

The proposed fitness function combined these three metrics in the following formula with different weights for each metric:

$$fitness = \alpha \times F1\ \text{score} + \beta \times Accuracy + \gamma \times TPR \tag{5}$$

where *α*, *β*, and *γ* are weighting factors that sum to 1. Because the F1-score gives a strong indication of the quality of the results, it was assigned the highest weight, 0.6, while accuracy and TPR were treated as secondary measures and assigned 0.2 each. The optimal value of the fitness function can be obtained by examining all possible combinations of the weighting factors, which amount to 66 combinations.
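The weighted fitness of Equation (5) and the enumeration of weight combinations can be sketched as follows. The text does not state how the 66 combinations are generated; the sketch assumes weights on a grid of step 0.1 with zero allowed, which yields exactly 66 non-negative triples that sum to 1. The names `fitness` and `weight_grid` and the example metric values are hypothetical.

```python
def fitness(f1, accuracy, tpr, alpha=0.6, beta=0.2, gamma=0.2):
    """Weighted fitness per Eq. (5); defaults are the weights given in the text."""
    return alpha * f1 + beta * accuracy + gamma * tpr


def weight_grid(steps=10):
    """List all (alpha, beta, gamma) on a grid of 1/steps that sum to 1.

    Assumption: a step of 0.1 with zero weights allowed (steps=10).
    """
    grid = []
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j  # remainder, so that i + j + k == steps
            grid.append((i / steps, j / steps, k / steps))
    return grid


combinations = weight_grid()
assert len(combinations) == 66  # matches the count reported in the text

# Hypothetical metric values for one chromosome.
score = fitness(f1=0.90, accuracy=0.95, tpr=0.88)
```

Under this grid assumption, the weights used in the study, (0.6, 0.2, 0.2), are one of the 66 candidates, and each candidate can be evaluated by passing it to `fitness` as `alpha`, `beta`, and `gamma`.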
