### 5.2. Methodology

We present hereafter the methodology, which consists of four parts: (1) hyperparameter tuning; (2) feature selection; (3) impact of feature selection; and (4) relabeling.

#### 5.2.1. Hyperparameter Tuning

This section describes how we set the training hyperparameter values of BrainShield, our proposed neural-network-based malware detection model.

As initial settings, the dropout rate (0.3), the learning rate (0.002), and the activation function (ReLU) are the defaults proposed by Keras [34]. Moreover, 50 iterations and 1119 input neurons are chosen as values large enough to yield viable results while allowing hundreds of training runs to complete in a reasonable time. The final hyperparameter values are listed in Table 2.


**Table 2.** Final values of hyperparameters.
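The initial configuration described above can be sketched as a small Keras model. Note that the hidden-layer sizes (512 and 128), the Adam optimizer, and the sigmoid output are illustrative assumptions: the text specifies only the input width (1119), the dropout rate (0.3), the learning rate (0.002), and the ReLU activation.

```python
# Sketch of BrainShield's initial settings, assuming a simple feed-forward
# architecture. Hidden sizes, optimizer choice, and output layer are
# assumptions, not taken from the paper.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_features=1119, dropout_rate=0.3, learning_rate=0.002):
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),          # 1119 input neurons
        layers.Dense(512, activation="relu"),       # hidden size: assumption
        layers.Dropout(dropout_rate),               # dropout rate 0.3
        layers.Dense(128, activation="relu"),       # hidden size: assumption
        layers.Dropout(dropout_rate),
        layers.Dense(1, activation="sigmoid"),      # malware/benign: assumption
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",
        metrics=["accuracy",
                 keras.metrics.AUC(name="auc"),
                 keras.metrics.Precision(name="precision"),
                 keras.metrics.Recall(name="recall")],
    )
    return model
```

The compiled metrics mirror the evaluation metrics used later in the tuning experiments (accuracy, AUC, precision, recall); the F1 score can be derived from precision and recall after evaluation.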

To obtain the best value for the epoch number (i.e., the number of iterations), we consider 7 candidate values (50, 100, 150, 200, 250, 300, 400) and compare the training results using the evaluation metrics (accuracy, recall, precision, AUC, and F1 score). The results, averaged over 10 training sessions per candidate, are illustrated in Figure 4 (70 training sessions in total, with an overall duration of 157 min). They show that the evaluation metrics improve up to 250 iterations; beyond that, no further improvement is observed.
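The epoch sweep described above can be sketched as follows. Here `train_once` is a hypothetical callback standing in for one full training session that returns a dict of evaluation metrics; the plateau-detection rule in `pick_epoch` is one reasonable reading of "no improvement beyond 250 iterations", not the paper's stated procedure.

```python
import statistics

# Candidate epoch numbers considered in the sweep (from the text).
EPOCH_CANDIDATES = [50, 100, 150, 200, 250, 300, 400]
N_SESSIONS = 10  # metrics are statistical averages over 10 training sessions

def average_metrics(train_once, epochs, n_sessions=N_SESSIONS):
    """Run `train_once(epochs)` n_sessions times and average each metric.

    `train_once` is a hypothetical function returning a dict such as
    {"accuracy": ..., "recall": ..., "precision": ..., "auc": ..., "f1": ...}.
    """
    runs = [train_once(epochs) for _ in range(n_sessions)]
    return {name: statistics.mean(run[name] for run in runs) for name in runs[0]}

def pick_epoch(results, metric="accuracy", tolerance=1e-3):
    """Smallest candidate beyond which `metric` stops improving by > tolerance.

    `results` maps each candidate epoch number to its averaged metrics.
    """
    best = EPOCH_CANDIDATES[0]
    for prev, cur in zip(EPOCH_CANDIDATES, EPOCH_CANDIDATES[1:]):
        if results[cur][metric] - results[prev][metric] > tolerance:
            best = cur
    return best
```

With a stub whose accuracy plateaus after 250 epochs, `pick_epoch` selects 250, mirroring the behaviour the text reports for Figure 4.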

Hereafter, we justify the choice of (1) the dropout rate; (2) the learning rate; (3) the number of neurons; and (4) the activation functions.
