*2.2. Classification Algorithms*

Classification is a type of supervised learning in which a set of relevant features is associated with a set of already labelled categories (the presence of labels is what makes classification a supervised method). These labels allow the model to be trained reliably and, later, to produce accurate predictions. Different approaches can be implemented, and the specific way in which features are related to labels varies from algorithm to algorithm. Hence, evaluating the performance of such algorithms in the context of a given problem is of great importance. In this work, three classification algorithms (namely, K-nearest neighbors, multilayer perceptron, and random forest) are tested and compared. These methods were selected because each is based on a distinct underlying principle, which covers a wide range of alternatives when classifying an unknown set of features.

The K-nearest neighbors algorithm is a distance-based method: each query point is labelled according to the classes of the K training points closest to it. Note that no training stage is strictly needed, since the algorithm simply stores the training data and computes distances at prediction time.
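As an illustrative sketch (scikit-learn and synthetic data are assumptions made here, not the setup used in this work), the lazy nature of the method is visible in how fitting merely stores the training points:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic feature/label set standing in for the real data.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Fitting" only stores the training points; distances are computed at query time.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print(knn.predict(X_test[:3]))  # each test point takes the majority class of its 5 nearest neighbors
```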

The second algorithm, the multilayer perceptron, is an artificial neural network trained by minimizing a cost function, which allows for accurate predictions once the network is properly trained. The minimization can be achieved through a variety of methods, among which backpropagation combined with gradient descent is a popular and effective choice.
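A minimal sketch along the same lines, again assuming scikit-learn and synthetic data (its `MLPClassifier` minimizes the loss via backpropagation with a gradient-based solver); the layer sizes are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers (illustrative sizes); gradients of the cost function are
# obtained by backpropagation and applied by the gradient-based Adam solver.
mlp = MLPClassifier(hidden_layer_sizes=(32, 16), solver="adam",
                    max_iter=1000, random_state=0)
mlp.fit(X_train, y_train)
print(mlp.predict(X_test[:3]))
```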

Lastly, the random forest algorithm is an ensemble of decision trees. Predictions are made by training multiple such classifiers, and the final prediction is obtained via a majority vote over their individual forecasts. This ensemble strategy improves the robustness of the classification model.
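A corresponding sketch, once more assuming scikit-learn and synthetic data rather than the configuration used in this work:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of 100 decision trees; the final label aggregates the trees'
# forecasts (scikit-learn averages the trees' class probabilities, which
# acts as a soft majority vote).
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
print(rf.predict(X_test[:3]))
```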

The accuracy of each algorithm is evaluated using the accuracy score function, given in Equation (1).

$$\mathrm{score} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(\hat{y}_i = y_i), \tag{1}$$

where *n* is the number of samples, $\hat{y}_i$ is the predicted categorical value, $y_i$ is the true categorical value, and $\mathbf{1}(x)$ is the indicator function, which outputs 1 when $\hat{y}_i = y_i$ and 0 otherwise. A brief description of each algorithm considered in this work is presented above. In addition, a summary of the functionality of each algorithm is depicted in Figure 3, where the logic behind each method is shown via block diagrams. Note that these diagrams are only general: the architecture and specific parameters of each algorithm depend on its final configuration, which is fixed by the parameter studies presented later.
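For reference, Equation (1) can be evaluated directly; the short sketch below (assuming scikit-learn and NumPy, with illustrative labels) checks the hand-computed mean of the indicator against the library's accuracy score:

```python
import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0, 1, 1, 0, 1])  # illustrative true labels
y_pred = np.array([0, 1, 0, 0, 1])  # illustrative predictions (one mismatch)

# Equation (1): mean of the indicator 1(y_hat_i == y_i) over the n samples.
print(np.mean(y_pred == y_true))       # 0.8, direct evaluation of Equation (1)
print(accuracy_score(y_true, y_pred))  # 0.8, equivalent library call
```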

**Figure 3.** Schematics of selected classification algorithms. (**a**): K-nearest neighbors, (**b**): random forest, (**c**): multilayer perceptron.
