3.1.4. Random Forest

The RF is an ensemble of decision trees that can perform both regression and classification tasks [37]. It is built on the decision tree algorithm, which is capable of fitting complex datasets. The idea behind a tree is to search the training set for a variable–value pair and split on it so as to obtain the two best child subsets. When making predictions, each data point starts at the top of the tree (as shown in Figure 2) and moves down through the branches until it reaches a leaf node, where no further branching is possible.
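This root-to-leaf traversal can be sketched in a few lines of Python. The tree structure, feature indices, thresholds, and leaf values below are purely illustrative assumptions, not taken from the paper:

```python
# Minimal sketch of prediction with a single decision tree.
# Each internal node tests one variable-value pair; leaves hold predictions.

def predict(node, x):
    """Walk from the root down the branches until a leaf is reached."""
    while "value" not in node:              # internal node: apply the split test
        if x[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["value"]                    # leaf node: no further branching

# Illustrative hand-built tree: split on feature 0 at 2.5, then feature 1 at 1.0.
tree = {
    "feature": 0, "threshold": 2.5,
    "left": {
        "feature": 1, "threshold": 1.0,
        "left": {"value": 10.0},
        "right": {"value": 20.0},
    },
    "right": {"value": 30.0},
}

print(predict(tree, [1.0, 0.5]))   # follows left, then left branch -> 10.0
print(predict(tree, [3.0, 0.5]))   # follows right branch -> 30.0
```

In a fitted tree, each split is the variable–value pair that best separates the training data at that node; here the splits are fixed by hand only to show the traversal.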

**Figure 2.** A representation of the concept of the random forest algorithm.

Being an ensemble approach, the RF aggregates the outputs of multiple decision trees to obtain better results. The idea is to average the outcomes of the individual predictors, thereby reducing the variance and yielding a prediction model that is less prone to overfitting the training data [38]. The RF thus becomes a strong learner, whereas the individual decision trees are considered weak learners. The RF algorithm is trained via the bagging (bootstrap aggregating) method, which randomly samples subsets of the training data with replacement [39], fits a tree to each of these smaller datasets, and then aggregates their predictions. This approach allows many instances to be used repeatedly during the training phase. The RF can, however, be a slow algorithm, since many trees must be grown during the training stage. Further technical details on the RF algorithm can be found in [37]. The fine-tuned hyperparameters of the RF algorithm include the maximum depth and the number of iterations.
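The bagging procedure described above can be sketched with the standard library alone. As a simplifying assumption, the weak learners here are depth-1 "stump" trees on a tiny one-dimensional dataset; the dataset, function names, and parameter values are illustrative, not from the paper:

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a depth-1 tree: the single variable-value split minimising squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        err = (sum((y - mean(left)) ** 2 for y in left)
               + sum((y - mean(right)) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, t, mean(left), mean(right))
    if best is None:                      # degenerate bootstrap sample: predict the mean
        m = mean(ys)
        return lambda x: m
    _, t, lo, hi = best
    return lambda x: lo if x <= t else hi

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Bagging: fit each tree on a bootstrap sample (instances may repeat)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]   # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict(forest, x):
    """Aggregate by averaging the outcome of each predictor, reducing variance."""
    return mean(tree(x) for tree in forest)

# Illustrative step-shaped data: y jumps from about 0 to about 1 at x = 5.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0.1, 0.0, 0.2, 0.1, 0.9, 1.0, 0.8, 1.1]
forest = fit_forest(xs, ys)
print(predict(forest, 2), predict(forest, 8))
```

A full RF implementation additionally grows deeper trees and randomises the candidate features at each split; the tuned hyperparameters mentioned above (maximum depth, number of iterations) would correspond here to the stump depth and `n_trees`.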
