#### *3.3. Simulation Results*

Computer-based simulation in MATLAB is used to validate the proposed model. Figure 3 shows the step-by-step workflow followed in this manuscript. First, the data are collected from the location of interest. Second, the models are chosen based on the collected data; here, ANFIS is chosen because of the nonlinear nature of the data set. Model accuracy indicators are then selected. The data are partitioned into two groups: one group (nearly two-thirds of the data) is used for training the proposed model, and the other group (the remaining one-third) is used for testing the network. The network is first trained, and thereby calibrated, on the training group; it is then tested on the remaining data.

To verify the overall performance of the models, the observed and predicted evaporation values were plotted together for both combinations (combinations 1 and 2, as described in Section 3.1). Figures 4–7 show the pattern of the observed and predicted data for all four models. Figures 4 and 5 show the training data pattern for the first and second combinations of data sets, respectively, comparing the target and output values against the sample index for the (a) ANFIS, (b) FFA, (c) GA, and (d) PSO models. Similarly, Figures 6 and 7 show the test data pattern of all models, comparing the target and obtained output values against the sample index for (a) ANFIS, (b) FFA, (c) GA, and (d) PSO, respectively.
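The partitioning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say whether the split was random or chronological, so the random shuffle, the function name `split_train_test`, the seed, and the synthetic records below are all hypothetical.

```python
import random


def split_train_test(samples, train_fraction=2 / 3, seed=0):
    """Shuffle the samples and split them into training and testing subsets.

    Assumption: a random split; the paper only states the approximate
    two-thirds / one-third proportions.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# Synthetic placeholder records (sample index, evaporation value), not real data.
records = [(day, 5.0 + 0.1 * day) for day in range(90)]
train_set, test_set = split_train_test(records)
print(len(train_set), len(test_set))
```

With 90 placeholder records, the training subset holds 60 samples and the testing subset holds the remaining 30, and every record lands in exactly one subset.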

According to the graphs, both data sets lie within ±15% of the perfect-fit line. The graphs also indicate that the models are well trained. According to the analysis, all the models are suitable for evaporation estimation. However, the patterns in Figure 6a (ANFIS, first combination) and Figure 7a (ANFIS, second combination) were the best fits, whereas those in Figure 6b (ANFIS–FFA, first combination) and Figure 7b (ANFIS–FFA, second combination) show the poorest fit among the four models. The figures for ANFIS–PSO and ANFIS–GA, for both combinations, were close to each other. In addition, several statistical indices were computed for both training and testing to give a better understanding of model accuracy; they are summarized in Tables 2–6.
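One way to quantify the ±15% statement is to count how many predictions fall within 15% of the corresponding observation. The sketch below assumes that "within ±15% of a perfect line" means a relative deviation from the 1:1 line; the function name `fraction_within_band` and the numbers are hypothetical and are not taken from the study.

```python
def fraction_within_band(observed, predicted, tolerance=0.15):
    """Return the share of predictions within +/- tolerance of the observation."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    inside = sum(
        1 for o, p in zip(observed, predicted) if abs(p - o) <= tolerance * abs(o)
    )
    return inside / len(observed)


# Illustrative values only.
observed = [100.0, 120.0, 80.0, 150.0]
predicted = [110.0, 100.0, 85.0, 149.0]
share = fraction_within_band(observed, predicted)
print(f"{share:.0%} of the points lie inside the band")
```

Here the second point deviates by 20 against an allowed 18, so three of the four points fall inside the band.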

**Figure 3.** The step-by-step structural outline of the work performed in this manuscript.

**Figure 4.** Comparison of the target (predicted) and obtained output sample index of training data set for (**a**) ANFIS, (**b**) ANFIS–FFA, (**c**) ANFIS–GA, and (**d**) ANFIS–PSO, respectively, using the first combination of the data set.

**Figure 5.** Comparison of the target (predicted) and obtained output sample index of training data set for (**a**) ANFIS, (**b**) ANFIS–FFA, (**c**) ANFIS–GA, and (**d**) ANFIS–PSO, respectively, using the second combination of the data set.

**Figure 6.** Comparison of the target and obtained output sample index of the test data for (**a**) ANFIS, (**b**) ANFIS–FFA, (**c**) ANFIS–GA, and (**d**) ANFIS–PSO, respectively (first combination of the data set).

**Figure 7.** Comparison of the target and obtained output sample index of the test data for (**a**) ANFIS, (**b**) ANFIS–FFA, (**c**) ANFIS–GA, and (**d**) ANFIS–PSO, respectively (second combination of the data set).


**Table 2.** Summary of model accuracy indicator test for the training data set (for the first combination data set), which was calculated in Excel.

**Table 3.** Summary of model accuracy indicator test for the testing data set (for the first combination data set), which was calculated in Excel.


**Table 4.** Summary of model accuracy indicator test for the training data set (for the second combination data set), which was calculated in Excel.


**Table 5.** Summary of model accuracy indicator test for the testing data set (for the second combination data set), which was calculated in Excel.


**Table 6.** Summary of model accuracy indicator test during the testing period, provided by 'MATLAB'.


The overall summary of the findings is presented in Table 6, which lists the results provided by MATLAB. The MSE values for the testing data were very high for all models (241.72 for ANFIS, 594.80 for FFA, 206.79 for GA, and 213.05 for PSO), and they were higher still for the training data. To ensure a rigorous comparison of the models, an extended analysis was performed using RMSE, *R*<sup>2</sup>, MAE, MARE, RMSRE, SI, MRE, Bias, NASH, and VAF as statistical indices for the estimated values. Tables 2–5 present the values of all statistical indices for the training and testing data sets of all models. According to all statistical indices, especially the *R*<sup>2</sup>, RMSE, VAF, and NASH values, the second combination of the data set gave better results than the first combination, whose testing results are presented in Table 3. The results of the ANFIS and ANFIS–PSO models were almost identical in both combinations. RMSE was lower for ANFIS and ANFIS–GA. ANFIS–FFA gave the worst results among all models in all cases. Bias was smallest for the ANFIS model. According to the test results in Tables 3 and 5, *R*<sup>2</sup> for ANFIS, GA, and PSO was almost identical (0.99), whereas *R*<sup>2</sup> for FFA was 0.97. This is consistent with the training results. However, a commonly used correlation measure such as *R*<sup>2</sup> is not always accurate, and can sometimes be misleading, when used to compare predicted and observed values [1]. Therefore, the two most widely used statistical indicators, root mean square error (RMSE) and bias error, were also used in this analysis. Model performance is inversely related to the RMSE value: a lower RMSE indicates higher accuracy, and vice versa. RMSE was lowest for PSO (14.59 and 14.63) and GA (14.38 and 15.07), whereas it was 15.54 for ANFIS, and FFA had the worst value, 24.38. Negative bias was observed for all models, with ANFIS and GA showing the smallest bias.
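Since RMSE is the square root of MSE, the testing MSE values quoted from Table 6 can be cross-checked against the RMSE values quoted in the text. The short check below uses only numbers that appear in this section; pairing each MSE with a particular quoted RMSE is an assumption, because Tables 2–6 are not reproduced here.

```python
import math

# Testing MSE values quoted in the text (Table 6).
mse_testing = {
    "ANFIS": 241.72,
    "ANFIS-FFA": 594.80,
    "ANFIS-GA": 206.79,
    "ANFIS-PSO": 213.05,
}

# RMSE values quoted in the text; which table each one comes from is assumed.
rmse_quoted = {
    "ANFIS": 15.54,
    "ANFIS-FFA": 24.38,
    "ANFIS-GA": 14.38,
    "ANFIS-PSO": 14.59,
}

rmse_testing = {model: math.sqrt(mse) for model, mse in mse_testing.items()}

for model, rmse in rmse_testing.items():
    gap = abs(rmse - rmse_quoted[model])
    print(f"{model}: sqrt(MSE) = {rmse:.3f}, quoted RMSE = {rmse_quoted[model]}, gap = {gap:.3f}")
```

Under this pairing, each square root agrees with the quoted RMSE to within about 0.01, which suggests the quoted figures were truncated rather than rounded.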

Because the MSE values are high, the relative statistical indices are also compared to identify the better model. The MARE and RMSRE values should be minimal for the best-fit model. ANFIS shows the minimum MARE value (0.087), and PSO gives a similar result. However, according to RMSRE, PSO performs best. For more clarity, NASH is considered as another accuracy indicator; its value should be close to 1 for the best fit. The table shows the highest NASH value (0.97) for ANFIS, GA, and PSO; FFA was also close to 1 (0.93). As a further check, VAF was calculated. Here, ANFIS, GA, and PSO showed the higher values (all three close to 97.11), whereas FFA gave 93.11.
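The close agreement between the NASH values (0.97 and 0.93) and the VAF values divided by 100 (97.11 and 93.11) is expected from the definitions of the two indices. As a brief sketch, assume the standard forms of both indices and the population variance (dividing by $n$; the source does not state which variance convention was used), and write $e_i = T_{i.Actual} - T_{i.Predicted}$ with mean $\bar{e}$ and $\overline{T}$ for the mean of the actual values (symbols introduced here only for illustration):

$$\begin{aligned}
var(e) &= \frac{1}{n}\sum_{i=1}^{n} e_i^2 - \bar{e}^{\,2}, \\
NSE &= 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n}\left(T_{i.Actual} - \overline{T}\right)^2}
     = 1 - \frac{var(e) + \bar{e}^{\,2}}{var(T_{i.Actual})}
     = \frac{VAF}{100} - \frac{\bar{e}^{\,2}}{var(T_{i.Actual})}.
\end{aligned}$$

Under these assumptions the two indices differ only by the squared mean error relative to the variance of the observations, so they coincide when the bias is zero and VAF/100 is never smaller than NSE.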

Computation time is also an important factor. The approximate run time of all four models is given in Table 7. It shows that the ANFIS model took less time than the others, and that FFA was the most complicated one. After analyzing all the results, the FFA model is considered the least acceptable of the four. ANFIS and its GA and PSO variants showed a better fit in some situations. Although GA and PSO showed similar results and took the same time to run, ANFIS can be considered more acceptable because of its simplicity.


**Table 7.** Time taken by four models (approximate).

#### *3.4. Discussion*

In this study, evaporation was estimated from six climate variables, i.e., minimum temperature, maximum temperature, average temperature, sunshine hours, wind speed, and relative humidity. Evaporation depends on the combined effect of humidity, temperature variation, sunshine, and wind [11]. Sunshine is an important factor that helps evaporate water from a water body [7]. Temperature and humidity also play an important role: evaporation generally increases as temperature rises and as humidity falls. Wind carries water vapor away into the atmosphere [7]. Therefore, all of these variables were considered, as they affect evaporation. The key model parameters were selected by trial and error, and only one set of parameters was tested.

The findings of this research show that the FFA model is the least acceptable of the four. ANFIS and the GA and PSO models showed a better fit in some situations. Although GA and PSO showed similar results on all accuracy indicators (especially maximum *R*<sup>2</sup>, minimum RMSE, smaller bias, maximum VAF, minimum RMSRE, and maximum Nash coefficient) and took the same time to run, ANFIS can be considered more acceptable because of its simplicity. This model can serve as a reference model for data sets from an arid climate, and it can help local stakeholders with hydrological resource management. The main advantage of adopting ANFIS for this location lies in the pattern of the data set: as the data sets are inherently nonlinear, the ANFIS model was able to achieve high accuracy in the prediction of evaporation. The ANFIS model, alone or combined with the optimizers (FFA, GA, and PSO), can be applied widely to arid climates with the same weather variables in any part of the world.

More investigation is needed for this location. Lack of data was a limitation of this study. More climate variables can be added to improve the accuracy of the model. Other modern machine learning techniques should be implemented in the future to make use of the available resources and enhance the water resource management system. That would also benefit the local agro-economic prospects.

### **4. Conclusions**

This study compared the adaptive neuro-fuzzy inference system (ANFIS) and its hybridization with three different algorithms (FFA, GA, and PSO) for evaporation estimation from several climate variables, namely sunshine, relative humidity, average temperature, maximum temperature, minimum temperature, and wind speed. Two combinations of data sets were trained and tested in order to verify the correlation among the different models. The study illustrated the accuracy of all four models, whose performance was evaluated with various statistical measures (RMSE, RMSRE, MBE, VAF, NASH, bias, MARE, SI, and *R*<sup>2</sup>). The results show that the second combination of the testing and training data sets gave slightly better results than the first combination. Overall, all four models are suitable for the estimation of evaporation, but ANFIS and ANFIS–PSO are superior on all accuracy indicators. Relative and absolute accuracy tests were performed to find the best model in this study. Although the results of the two models (ANFIS and ANFIS–PSO) were nearly identical, ANFIS is recommended because of its simpler formulation and easier development compared with the ANFIS–PSO model; its computational time is also lower than that of the models with optimizers. The main objective of adopting different optimizer techniques was to verify the accuracy of the predictions of the ANFIS model, and the predictions were almost identical in all cases. The major challenge of this project was the limited data. These models could be applied to other data sets, if available, to investigate the results. This analysis is limited to a particular location; in future work, other locations can be explored, and the performance can be compared with modern machine learning methods. Other optimizers, for example the ant colony optimizer (ACO), and multi-gene genetic programming (MGGP) can also be investigated. Additional climate variables, such as atmospheric pressure, can be considered as inputs. Nevertheless, the evaporation of a given location can readily be modelled from the available data using the ANFIS model. Additionally, this model can be applied as a module for calculating evaporation data in hydrological modeling studies.

**Author Contributions:** Conceptualization, M.J., H.B., and A.M.; methodology, M.J. and H.B.; software, A.M.; validation, M.J.; formal analysis, M.J. and A.M.; investigation, M.J.; resources, H.B.; data curation, M.J. and H.B.; writing—original draft preparation, M.J.; writing—review and editing, H.B. and A.M.; visualization, H.B.; supervision, A.M.; project administration, H.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** Part of the data used in this manuscript are available through the corresponding author upon reasonable request.

**Acknowledgments:** Mansura Jasmine acknowledges the support of Mehedi Hasan during the preparation of this manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

The relationships for statistical indices and error measures used in this paper are provided in the following.

*R*<sup>2</sup>: coefficient of determination, obtained here as the square of the correlation coefficient *r*, which can be expressed in the following form:

$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}\tag{A1}$$

RMSE: root mean square error, which can be formulated as follows:

$$RMSE = \left[ \frac{\sum_{i=1}^{M} \left( Y_{i(model)} - Y_{i(actual)} \right)^{2}}{M} \right]^{\frac{1}{2}} \tag{A2}$$

MARE: mean absolute relative error. The formula is given below:

$$MARE = \frac{1}{M} \sum_{i=1}^{M} \left( \frac{\left| Y_{i(model)} - Y_{i(actual)} \right|}{Y_{i(actual)}} \right) \tag{A3}$$

Bias: mean bias error, which can be computed as follows:

$$Bias = \frac{\sum_{i=1}^{M} \left( Y_{i(model)} - Y_{i(actual)} \right)}{M} \tag{A4}$$

SI: scatter index, which can be expressed as follows:

$$SI = \frac{RMSE}{\frac{1}{M} \sum_{i=1}^{M} \left( Y_{i(actual)} \right)} \tag{A5}$$

RMSRE: root mean square relative error. This error can be calculated from the following equation:

$$RMSRE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left(\frac{y_t - \hat{y}_t}{y_t}\right)^2} \tag{A6}$$

MAE: mean absolute error. This error can be calculated from the following equation:

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| T_{i.Actual} - T_{i.Predicted} \right| \tag{A7}$$

VAF: variance accounted for. This term can be presented by the following equation:

$$VAF = \left(1 - \frac{var\left(T_{i.Actual} - T_{i.Predicted}\right)}{var\left(T_{i.Actual}\right)}\right) \times 100\tag{A8}$$

NSE: Nash–Sutcliffe coefficient (reported as NASH in the text). This coefficient can be formulated as follows:

$$E_{NSC} = 1 - \left(\frac{\sum (y_t - \hat{y}_t)^2}{\sum (y_t - \overline{y})^2}\right) \tag{A9}$$

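where

*Y*<sub>*i*(*actual*)</sub>: the observed output value; *Y*<sub>*i*(*model*)</sub>: the value predicted by the models; *x* and *y*: the paired observed and predicted values in Equation (A1); *y*<sub>*t*</sub> and *ŷ*<sub>*t*</sub>: the observed and predicted values in Equations (A6) and (A9); *ȳ*: the mean of the observed values; *M*, *N*, and *n*: the number of samples; *E*<sub>*NSC*</sub>: the Nash–Sutcliffe test statistic; *T*<sub>*i*.*Actual*</sub>: the *i*th value of actual data; *T*<sub>*i*.*Predicted*</sub>: the *i*th value of predicted data.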
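For readers who wish to reproduce the indices outside Excel or MATLAB, the definitions above can be sketched in plain Python. This is a minimal illustration, not the code used in the study: the function names are hypothetical, the variance is taken as the population variance (an assumption), RMSRE is computed as the square root of the mean squared relative error, VAF uses the standard form 1 − var(error)/var(actual), and the relative indices assume strictly positive observations. The sample values are made up.

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance (divides by n); the variance convention is assumed."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def accuracy_indices(actual, predicted):
    """Compute the accuracy indices of Appendix A for paired series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]  # model minus actual
    mean_actual = mean(actual)

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Pearson correlation between actual (x) and predicted (y).
    sx, sy = sum(actual), sum(predicted)
    sxy = sum(a * p for a, p in zip(actual, predicted))
    sxx = sum(a * a for a in actual)
    syy = sum(p * p for p in predicted)
    r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))

    return {
        "R2": r**2,
        "RMSE": rmse,
        "MAE": mean([abs(e) for e in errors]),
        "Bias": mean(errors),
        # Relative indices divide by the observation, so it must be positive.
        "MARE": mean([abs(e) / a for e, a in zip(errors, actual)]),
        "RMSRE": math.sqrt(mean([(e / a) ** 2 for e, a in zip(errors, actual)])),
        "SI": rmse / mean_actual,
        "VAF": (1 - variance(errors) / variance(actual)) * 100,
        "NSE": 1 - sum(e * e for e in errors) / sum((a - mean_actual) ** 2 for a in actual),
    }


# Made-up example values, not data from the study.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
indices = accuracy_indices(actual, predicted)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

For these made-up values the errors are 2, −2, 3, and −1, so MAE is 2 and Bias is 0.5; a perfect prediction would give RMSE = 0 and NSE = *R*<sup>2</sup> = 1.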

#### **References**

