*3.6. Model Validation*

Cross-validation is one of the most widely used statistical techniques for assessing and validating a machine learning model's performance. The central purpose of this evaluation is to check whether the trained model generalizes to unseen data. In K-fold cross-validation, the entire data set is first split into K folds; the model is then trained on all folds but one and tested on the remaining fold. This procedure is repeated until every fold has served once as the test set, and the scores obtained on the individual folds are averaged to give the final metric. Because predictions are made on test sets that were not used to train the model, they are called 'out-of-fold predictions,' a type of 'out-of-sample' forecast. In contrast to a simple train-test split, this method guards against an overly optimistic assessment of overfitting and yields a more robust form of model evaluation.
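The K-fold procedure described above can be sketched as follows; the data set and model are illustrative placeholders, and the implementation assumes scikit-learn:

```python
# Sketch of K-fold cross-validation with out-of-fold predictions.
# The synthetic regression data and linear model are illustrative only.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

kf = KFold(n_splits=10, shuffle=True, random_state=0)
oof_pred = np.empty_like(y)  # one out-of-fold prediction per sample
fold_scores = []

for train_idx, test_idx in kf.split(X):
    model = LinearRegression()
    model.fit(X[train_idx], y[train_idx])            # train on K-1 folds
    oof_pred[test_idx] = model.predict(X[test_idx])  # predict the held-out fold
    fold_scores.append(model.score(X[test_idx], y[test_idx]))

# The final metric is the average of the per-fold scores.
mean_score = float(np.mean(fold_scores))
```

Every sample receives exactly one out-of-fold prediction, made by a model that never saw that sample during training.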

Cross-validation on a rolling basis is the method used to cross-validate time series models. According to Kuhn and Johnson [52], *k* = 10 is the typical choice. Repeated K-fold cross-validation replicates the entire procedure several times: for instance, ten-fold cross-validation repeated five times yields 50 sets of out-of-fold predictions from which the model's efficacy is estimated. Repeating ten-fold cross-validation is a prevalent approach according to Kuhn and Johnson [52]. As depicted in Figure 5, the rolling procedure starts by training on a small subset of the data, forecasting the next data point, and checking the forecast's accuracy. The forecasted data point is then included in the following training set, on the basis of which the subsequent data points are predicted.
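A minimal sketch of the repeated K-fold scheme, assuming scikit-learn's `RepeatedKFold` (the small array is a placeholder data set):

```python
# Sketch of repeated K-fold cross-validation: 10 folds repeated 5 times
# produces 50 train/test splits, i.e. 5 out-of-fold predictions per sample.
import numpy as np
from sklearn.model_selection import RepeatedKFold

X = np.arange(40).reshape(20, 2)  # illustrative data: 20 samples, 2 features
rkf = RepeatedKFold(n_splits=10, n_repeats=5, random_state=0)

splits = list(rkf.split(X))
n_splits = len(splits)  # 10 folds x 5 repeats = 50 splits
```

Averaging the metric over all 50 splits reduces the variance of the performance estimate compared with a single ten-fold run.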

**Figure 5.** Cross-validation on a rolling basis.
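The rolling scheme of Figure 5 can be sketched with scikit-learn's `TimeSeriesSplit`, which trains on an expanding window of earlier observations and tests on the points that immediately follow; the series below is illustrative:

```python
# Sketch of cross-validation on a rolling basis: each split trains on all
# observations up to a point in time and tests on the points that follow,
# so no future data ever leaks into the training window.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

series = np.arange(12).reshape(-1, 1)  # 12 chronologically ordered observations
tscv = TimeSeriesSplit(n_splits=3)

for train_idx, test_idx in tscv.split(series):
    # the training window grows at each split; test indices always come after it
    print("train:", train_idx, "-> test:", test_idx)
```

Unlike ordinary K-fold splitting, the folds here are not shuffled: temporal order is preserved so that the model is always evaluated on data that lies in its "future."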
