4.2.2. Feature-Based Comparative Analysis

This section applies the comparative analysis to datasets containing the indicators selected by different feature selection techniques, namely VIF, genetic search, evolutionary search, and best-first search. Table A5 in Appendix A.2 lists the features chosen by each method. The comparison is conducted using the *t*-test at a significance level of 0.05 in the WEKA software (version 3.9.4, developed at the University of Waikato, New Zealand). To evaluate the predictive performance of the machine learning models and obtain robust results, 10-fold cross-validation on a rolling basis is used, and each model is run ten times. Tables 5 and 6 therefore report the averages of 100 prediction trials for the two forecasting-accuracy measures, RMSE and Pearson's *r*, with standard deviations shown in parentheses.
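As an illustration of this evaluation protocol, the sketch below estimates RMSE and Pearson's *r* with a rolling-origin 10-fold split repeated over ten random seeds. It is a minimal Python analogue of the WEKA setup, not the original experiment: the synthetic data, the MLP architecture, and all hyperparameters are placeholder assumptions.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.model_selection import TimeSeriesSplit
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for one of the indicator datasets (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=500)

tscv = TimeSeriesSplit(n_splits=10)        # 10-fold split on a rolling basis
rmse_scores, r_scores = [], []

for seed in range(10):                     # ten repetitions of each experiment
    model = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=seed),
    )
    for train_idx, test_idx in tscv.split(X):
        model.fit(X[train_idx], y[train_idx])
        y_pred = model.predict(X[test_idx])
        rmse_scores.append(np.sqrt(np.mean((y[test_idx] - y_pred) ** 2)))
        r_scores.append(pearsonr(y[test_idx], y_pred)[0])

# Average over the 100 prediction trials, standard deviation in parentheses.
print(f"RMSE: {np.mean(rmse_scores):.4f} ({np.std(rmse_scores):.4f})")
print(f"Pearson's r: {np.mean(r_scores):.4f} ({np.std(r_scores):.4f})")
```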

**Table 5.** RMSE of different models on different datasets.



**Table 6.** Pearson's *r* of different models on different indicators.

According to Tables 5 and 6, SVR performs better on the attributes constructed by PCA; thus, combining PCA with SVR can boost predictive performance (a minimal sketch of such a pipeline is given after Table 7). No feature selection method can improve the models, and the VIF method is the worst among them, as it yields the poorest prediction results. The models are then compared to identify the best one for each dataset, excluding the VIF dataset because of its unpromising forecasting results. Table 7 summarizes these comparisons, showing that SVR achieves the best accuracy and MLP the worst.

**Table 7.** Ranking of the models in terms of accuracy.


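Following the observation above, the sketch below shows how PCA can be chained with SVR in a single pipeline. It only illustrates the idea; the number of retained components, the SVR hyperparameters, and the placeholder data are assumptions, and the results in Tables 5 to 7 come from the WEKA experiments, not from this code.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Placeholder data standing in for the full indicator set (assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=500)

# Standardize, project onto a few principal components, then fit SVR.
pca_svr = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),             # number of components is an assumption
    SVR(kernel="rbf", C=1.0, epsilon=0.1),
)

rmse_scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=10).split(X):
    pca_svr.fit(X[train_idx], y[train_idx])
    y_pred = pca_svr.predict(X[test_idx])
    rmse_scores.append(np.sqrt(np.mean((y[test_idx] - y_pred) ** 2)))

print(f"Rolling-basis RMSE: {np.mean(rmse_scores):.4f} ({np.std(rmse_scores):.4f})")
```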