#### *2.4. Particle Swarm Optimization (PSO)*

The PSO technique was introduced by Kennedy and Eberhart [16], who modeled the collective behavior of bird flocks and fish schools moving through a multi-dimensional space, for example while searching for food or escaping from hazards [16]. Each individual in the algorithm is called a "particle", and the set of particles forms the population, which is called a "swarm". Every particle represents a candidate solution to the optimization problem. The swarm and particle of this technique play roles analogous to the population and chromosome of the genetic algorithm [12]. PSO is an iterative solution procedure that exploits the behavior of the swarm particles in a multi-dimensional search space. The computation becomes expensive for large data sets, because a significant number of models must be evaluated. Nevertheless, PSO can optimize with a high probability of success and a high convergence rate. The optimizer is governed by the following mathematical expressions [12].

$$V\_d^{Max} = \left(\mathbf{x}\_d^{Max} - \mathbf{x}\_d^{Min}\right) / 2 \tag{3}$$

$$V\_d^{Min} = -V\_d^{Max} \tag{4}$$

The values of $x\_d^{Max}$ and $x\_d^{Min}$ are selected according to the limits of the variables. The starting positions and velocities of the individuals are randomly generated, based on the following equations:

$$\mathbf{x}\_{prd}^{k} = \mathbf{x}\_{d}^{Min} + r \left(\mathbf{x}\_{d}^{Max} - \mathbf{x}\_{d}^{Min}\right) \tag{5}$$

$$v\_{prd}^{k} = V\_d^{Max}\left(2r - 1\right) \tag{6}$$

where *p*, *d*, *v*, *x*, and *r* denote the particle number, the search dimension, the particle velocity, the particle position, and a random number drawn from a uniform distribution on (0, 1), respectively. Each particle updates its own position, based on its earlier steps and on the position of the best particle in the entire swarm, until the position and velocity values meet the stopping condition.
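As an illustration only, the initialization of Eqs. (3)–(6) can be sketched in Python with NumPy; the function name and interface below are hypothetical, not taken from the paper:

```python
import numpy as np

def initialize_swarm(n_particles, x_min, x_max, seed=None):
    """Randomly initialize a swarm following Eqs. (3)-(6).

    x_min, x_max : per-dimension variable limits (arrays of equal length).
    Returns positions, velocities and the velocity limit V_max.
    """
    rng = np.random.default_rng(seed)
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    v_max = (x_max - x_min) / 2.0             # Eq. (3); V_min = -V_max by Eq. (4)
    shape = (n_particles, x_min.size)
    positions = x_min + rng.random(shape) * (x_max - x_min)    # Eq. (5)
    velocities = v_max * (2.0 * rng.random(shape) - 1.0)       # Eq. (6)
    return positions, velocities, v_max
```

Every generated position then lies inside the variable limits, and every initial velocity inside (−V_max, V_max).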

$$v\_{prd}^{k+1} = \omega v\_{prd}^k + c\_1 r\_1 \left(\mathbf{x}\_{prd}^{ind} - \mathbf{x}\_{prd}^k\right) + c\_2 r\_2 \left(\mathbf{x}\_d^{glo} - \mathbf{x}\_{prd}^k\right) \tag{7}$$

$$\mathbf{x}\_{prd}^{k+1} = \mathbf{x}\_{prd}^{k} + v\_{prd}^{k+1} \tag{8}$$

where *k* indicates the iteration number of the trial-and-error process; ω, *c*<sub>1</sub>, and *c*<sub>2</sub> are search parameters, and *r*<sub>1</sub> and *r*<sub>2</sub> are two random numbers drawn from a uniform distribution on (0, 1). $x\_{prd}^{ind}$ is the best location found by an individual particle, while $x\_d^{glo}$ is the best location found by the entire swarm. Parameters *c*<sub>1</sub> and *c*<sub>2</sub> are the cognition and social coefficients, respectively [16]. Kennedy and Eberhart introduced ω as the inertia weight, which equals 1 in the original PSO algorithm; in the present study it is decreased linearly over the iterations according to Eq. (9).
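A single update step of Eqs. (7) and (8) might be sketched as follows (a hypothetical helper assuming NumPy; clipping the new velocity to the limits of Eqs. (3)–(4) is a common PSO convention rather than something stated explicitly in the text):

```python
import numpy as np

def pso_step(x, v, x_ind, x_glo, v_max, w, c1=2.0, c2=2.0, seed=None):
    """One PSO iteration implementing Eqs. (7) and (8).

    x, v   : current positions and velocities, shape (particles, dims)
    x_ind  : each particle's own best position found so far
    x_glo  : best position found by the whole swarm
    """
    rng = np.random.default_rng(seed)
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v_new = w * v + c1 * r1 * (x_ind - x) + c2 * r2 * (x_glo - x)  # Eq. (7)
    v_new = np.clip(v_new, -v_max, v_max)  # keep velocity within Eqs. (3)-(4)
    x_new = x + v_new                      # Eq. (8)
    return x_new, v_new
```

When a particle already sits at both its individual and the global best, the attraction terms vanish and the velocity simply decays by the factor ω.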

$$
\omega = \omega\_{\text{max}} - (\omega\_{\text{max}} - \omega\_{\text{min}}) \frac{k}{k\_{\text{max}}} \tag{9}
$$

$k\_{max}$ and *k* are the maximum and current iteration numbers of the trial-and-error process, respectively. Regeneration is performed, with the use of linear fitness scaling (LFS), to increase the diversity of the iterative process whenever the swarm satisfies the following criterion:

$$f\_{best} - f\_{worst} < div \tag{10}$$

where $f\_{best}$ and $f\_{worst}$ represent the best and the worst objective-function values in the entire swarm, and *div* is the diversity threshold. The following equation presents the objective function.

$$MinF(\mathbf{x}) = \frac{1}{N} \sum\_{i=1}^{N} \left( I\_i^{\text{obs}} - I\_i^{\text{est}} \right)^2 \tag{11}$$

where $I\_i^{obs}$ is the observed and $I\_i^{est}$ the estimated evaporation intensity, and *N* is the number of observations. The optimization process with the particle swarm technique continues until the required stopping condition is reached. In this analysis, the aim of the PSO algorithm is to minimize the objective function. The computational steps of this process, using PSO, can be found in reference [12].
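Putting the pieces together, a minimal PSO minimizer of the objective in Eq. (11) could look like the following sketch (assuming NumPy; the defaults c1 = c2 = 2, ω_max = 0.9, ω_min = 0.4 are common choices, not values reported in the paper, and the LFS regeneration of Eq. (10) is omitted for brevity):

```python
import numpy as np

def mse(i_obs, i_est):
    """Objective of Eq. (11): mean squared difference between observed
    and estimated evaporation intensities."""
    i_obs = np.asarray(i_obs, dtype=float)
    i_est = np.asarray(i_est, dtype=float)
    return float(np.mean((i_obs - i_est) ** 2))

def pso_minimize(f, x_min, x_max, n_particles=30, k_max=200,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=None):
    """Minimize f over the box [x_min, x_max] with Eqs. (3)-(9)."""
    rng = np.random.default_rng(seed)
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    d = x_min.size
    v_max = (x_max - x_min) / 2.0                                # Eqs. (3)-(4)
    x = x_min + rng.random((n_particles, d)) * (x_max - x_min)   # Eq. (5)
    v = v_max * (2.0 * rng.random((n_particles, d)) - 1.0)       # Eq. (6)
    p_best = x.copy()                       # individual best positions
    p_val = np.array([f(p) for p in x])
    g_best = p_best[p_val.argmin()].copy()  # global best position
    for k in range(k_max):
        w = w_max - (w_max - w_min) * k / k_max                  # Eq. (9)
        r1 = rng.random((n_particles, d))
        r2 = rng.random((n_particles, d))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # Eq. (7)
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, x_min, x_max)    # Eq. (8), kept inside the box
        val = np.array([f(p) for p in x])
        improved = val < p_val
        p_best[improved] = x[improved]
        p_val[improved] = val[improved]
        g_best = p_best[p_val.argmin()].copy()
    return g_best, float(p_val.min())
```

For a toy target series, the swarm quickly drives the mean squared error of Eq. (11) toward zero.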

## **3. Results and Discussion**

#### *3.1. Data Description*

Arizona is the sixth largest state of the USA and borders the state of California to the west; it covers an area of about 113,000 square miles. The weather conditions in Arizona are quite harsh, with very hot summers and mild winters. Phoenix, the capital of Arizona, is located in the northeastern part of the Sonoran Desert and therefore has a hot desert climate. The city has an agricultural neighborhood close to the confluence of the Salt and Gila Rivers. The study area was chosen because of its hot climate and its proximity to this agricultural neighborhood. Figure 2 shows the study area, which lies 355.7 m above sea level, at latitude 33.4258 and longitude −111.9217.

**Figure 2.** Location of the study area under consideration in this manuscript. (**a**) zoom-out view; (**b**) zoom-in view. Source: Internet.

To assess efficiency, all models were calibrated separately, with a total of 86 data points covering the eight-year period 2010–2017 at the selected station within the United States of America, using a one-month lead time. Data were collected from the government database of the state of Arizona, US. Two combinations of data sets were studied to check whether the results follow a similar pattern. Each data set was divided into two parts: a training portion and a testing portion. About two-thirds (~59) of the data points were selected as the training data set, whereas the remaining one-third (~27) was used as the testing data set.

Table 1 summarizes the statistical indices of the test, training, and complete data sets used in this study. The table contains the skewness, kurtosis, coefficient of variation (CV), standard deviation (SD), and first (1st) and third (3rd) quartiles (Q) for all the data points (N). It also reports the minimum (Min), maximum (Max), and average (Avg) of all data points, and gives the same statistical indices for the training and testing cases as well.


**Table 1.** Statistical indices of the evaporation data set used to verify the modelling.

The standard deviation describes the spread of the data set. For example, the standard deviation of the test data set is 82.40 and its average value is 158.73, which means that most of the test data lie between 76.33 (158.73 − 82.40 = 76.33) and 241.13 (158.73 + 82.40 = 241.13). The coefficient of variation, in turn, expresses the relative dispersion of the data set; it is defined as the ratio of the standard deviation to the mean value (in percent). Two combinations of the testing and training data were chosen arbitrarily, in order to verify the robustness and repeatability of the proposed modeling techniques.
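A short helper illustrates how such dispersion figures are obtained from a raw series (a sketch; the sample standard deviation with ddof = 1 is an assumption about the paper's convention):

```python
import numpy as np

def dispersion_stats(data):
    """Mean, sample standard deviation, coefficient of variation (%),
    and the one-SD band (mean - SD, mean + SD) of a data set."""
    data = np.asarray(data, dtype=float)
    mean = data.mean()
    sd = data.std(ddof=1)          # sample standard deviation
    cv = 100.0 * sd / mean         # CV = SD / mean, in percent
    return mean, sd, cv, (mean - sd, mean + sd)
```

Applied to the test-set figures quoted above (mean 158.73, SD 82.40), the one-SD band evaluates to (76.33, 241.13).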

Combination 1: Training data (September 2010 to September 2015); testing data (October 2015 to December 2017).

Combination 2: Training data (September 2010 to December 2012 & June 2015 to December 2017); testing data (January 2013 to May 2015).
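For concreteness, the two combinations can be expressed as date masks (a sketch assuming pandas and a monthly sampling from September 2010 to December 2017; note that a full monthly index over this span has 88 entries, slightly more than the 86 points reported, so the exact sampling dates are an assumption, not given in the paper):

```python
import pandas as pd

# Hypothetical monthly index spanning the record (Sept 2010 - Dec 2017).
idx = pd.date_range("2010-09-01", "2017-12-01", freq="MS")

# Combination 1: contiguous training block followed by the testing block.
train1 = idx[idx <= "2015-09-30"]
test1 = idx[idx > "2015-09-30"]

# Combination 2: testing block taken from the middle of the record.
test2 = idx[(idx >= "2013-01-01") & (idx <= "2015-05-31")]
train2 = idx.difference(test2)
```

Both combinations keep roughly a two-thirds/one-third split while exercising different parts of the record, which is what makes the repeatability check meaningful.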

#### *3.2. Model Accuracy Indicator*

The performances of all four models were individually evaluated using statistical analysis to monitor accuracy with respect to the evaporation forecasting data. The accuracy indicators for the ANFIS, FFA, PSO, and GA models were calculated in terms of the coefficient of determination (*R*<sup>2</sup>) [17], Nash–Sutcliffe coefficient (NSE) [17], root mean square error (RMSE) [10], mean absolute error (MAE) [17], variance accounted for (VAF) [18], mean absolute relative error (MARE), scatter index (SI) [17], bias [13], and root mean square relative error (RMSRE) [17]. The RMSE is a good measure of the goodness of fit at high parameter values; its ideal value is 0. The MARE provides a more balanced picture of the goodness of fit at moderate and low values; its ideal value is also 0. The coefficient of determination *R*<sup>2</sup> should be 1 for a perfect-fit model. This coefficient measures the correlation of the predicted values with the observational data: the closer the coefficient is to one, the stronger the correlation, and its value does not depend on the data units. The SI index is the relative form of the RMSE. The performance of the model, expressed through the Nash–Sutcliffe error criterion (*E<sub>NSC</sub>*), was used to evaluate its predictive power; a value of unity for *E<sub>NSC</sub>* indicates optimum conformity between predicted and observed data. In this work, both *R*<sup>2</sup> and *E<sub>NSC</sub>* are expressed in percentages; the closer their magnitude to 100, the better the performance of the model. The ideal value for VAF is also 100. All of these indicators can be calculated from the formulations presented in the appendices (see Appendix A for details).
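For reference, these indicators can be computed from observed and estimated series as follows (standard textbook definitions sketched in Python with NumPy; the exact formulations in Appendix A of the paper may differ in detail, e.g. in percentage scaling):

```python
import numpy as np

def accuracy_indicators(obs, est):
    """Common goodness-of-fit indicators for observed vs. estimated series."""
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    err = obs - est
    rmse = np.sqrt(np.mean(err ** 2))                 # root mean square error
    mae = np.mean(np.abs(err))                        # mean absolute error
    mare = np.mean(np.abs(err / obs))                 # mean absolute relative error
    rmsre = np.sqrt(np.mean((err / obs) ** 2))        # root mean square relative error
    bias = np.mean(est - obs)
    si = rmse / np.mean(obs)                          # scatter index
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)  # Nash-Sutcliffe
    r2 = np.corrcoef(obs, est)[0, 1] ** 2             # coefficient of determination
    vaf = 100.0 * (1.0 - np.var(err) / np.var(obs))   # variance accounted for
    return {"RMSE": rmse, "MAE": mae, "MARE": mare, "RMSRE": rmsre,
            "Bias": bias, "SI": si, "NSE": nse, "R2": r2, "VAF": vaf}
```

A perfect prediction yields the ideal values discussed above: RMSE, MAE, MARE, bias, and SI of 0, NSE and *R*<sup>2</sup> of 1, and VAF of 100.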
