**Citation:** Kumar, M.; Kumari, A.; Kumar, D.; Al-Ansari, N.; Ali, R.; Kumar, R.; Kumar, A.; Elbeltagi, A.; Kuriqi, A. The Superiority of Data-Driven Techniques for Estimation of Daily Pan Evaporation. *Atmosphere* **2021**, *12*, 701. https://doi.org/10.3390/atmos12060701

Academic Editors: Anthony R. Lupo and Alexander V. Chernokulsky

Received: 13 April 2021; Accepted: 26 May 2021; Published: 30 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**1. Introduction**

Estimating pan evaporation (PE) is essential for monitoring, surveying, and managing water resources. In many arid and semi-arid regions, water resources are scarce and seriously endangered by overexploitation. Therefore, the precise estimation of evaporation is imperative for planning, managing, and scheduling irrigation practices. Evaporation occurs when there is a vapor pressure difference between two surfaces, i.e., water and air. The most general and essential meteorological parameters that influence the rate of evaporation are relative humidity, temperature, solar radiation, vapor pressure deficit, and wind speed. Thus, these parameters should be considered when estimating evaporation losses for the precise planning and management of different water supplies [1,2].

In the global hydrological cycle, evaporation is defined as the transformation of water from a liquid to a vapor state [3]. In recent decades, evaporation losses have increased significantly, especially in semi-arid and arid regions [4,5]. Many fields, such as water budgeting, irrigation water management, hydrology, agronomy, and water supply management, require reliable estimates of the evaporation rate. Water budgeting has been modeled on evaporation estimates and on the response of crop water use to varying weather conditions. Daily pan evaporation (Epan) is considered a significant parameter and is widely used as an index of lake and reservoir evaporation, evapotranspiration, and irrigation [6].

Epan is usually determined in one of two ways: (a) directly, with pan evaporimeters, or (b) indirectly, with analytical and semi-empirical models that depend on climatic variables [7,8]. However, the measurement has proved sensitive to multiple sources of error, including strong wind circulation, pan visibility, and water depth measurement in the pan. These errors arise for various reasons, such as physical activity in and around the pan, litter in the water, the pan construction material, and the pan height. Estimating monthly pan evaporation (EPm) by direct measurement can also be repetitive, costly, and time-consuming. As a result, robust and reliable intelligent models are needed in hydrology for precise estimation [9–14].

Several researchers have used meteorological variables to forecast Epan values [15–18]. Since evaporation is a non-linear, stochastic, and complex process, a reliable formula that represents all the physical processes involved is difficult to obtain [19]. In recent years, artificial intelligence techniques, such as artificial neural networks (ANNs), the adaptive neuro-fuzzy inference system (ANFIS), and genetic programming (GP), have become widely accepted for hydrological parameter estimation [15,20–22]. Sudheer et al. [23] used an ANN to estimate Epan and found that it worked better than the traditional approach. Keskin et al. [24] used a fuzzy approach to model daily pan evaporation in western Turkey. Keskin and Terzi [25] developed multi-layer perceptron (MLP) models to estimate daily Epan and found that the ANN model performed significantly better than the traditional method. Tan et al. [26] applied the ANN methodology to model hourly and daily open water evaporation rates. Kisi and Çobaner [27] used three distinct ANN methods for daily Epan modeling, namely the MLP, the radial basis neural network (RBNN), and the generalized regression neural network (GRNN), and found that the MLP and RBNN performed much better than the GRNN. Piri et al. [28] applied an ANN model to estimate daily Epan in a hot and dry climate. Moghaddamnia et al. [19] implemented evaporation estimation methods based on ANN and ANFIS, and the results of both techniques were considered superior to those of the analytical formulas. Keskin et al. [29] used fuzzy sets and ANFIS for daily modeling of Epan and found that ANFIS could be used more efficiently than fuzzy sets in modeling the evaporation process. Dogan et al. [30] used ANFIS to calculate pan evaporation from the Yuvacik Dam reservoir, Turkey. Tabari et al. [31] examined the potential of ANN and multivariate non-linear regression techniques to model daily pan evaporation and concluded that the ANN performed better than non-linear regression. Using linear genetic programming techniques, Guven and Kişi [20] modeled daily pan evaporation, together with gene-expression programming (GEP), multi-layer perceptron (MLP), radial basis neural network (RBNN), generalized regression neural network (GRNN), and Stephens–Stewart (SS) models. Two distinct evapotranspiration models were compared in [32], which found that the subtractive clustering (SC) model of ANFIS produces reasonable accuracy with less computation than the ANFIS-GP and ANN models.

The support vector machine (SVM), proposed by Vapnik (1995) [33], is a modern universal learning machine that is applied to both regression [30,34] and pattern recognition. An SVM uses a kernel mapping to project the input data into a high-dimensional feature space in which the problem is linearly separable. An SVM's decision function depends on the number of support vectors (SVs), their weights, and the kernel, which is chosen a priori [1,21]. Several kinds of kernels, such as Gaussian and polynomial kernels, may be used [10]. Moreover, artificial neural network (ANN), wavelet-based artificial neural network (WANN), and support vector machine (SVM) models were applied with different combinations of input variables by [23]. Their results showed that the ANN with three input variables, comprising air temperatures and solar radiation, performed better than the WANN and SVM, with a root mean square error (RMSE) of 0.701, a mean absolute error (MAE) of 0.525, a correlation coefficient (R) of 0.990, and a Nash–Sutcliffe efficiency (NSE) of 0.977.
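The four statistical indexes named above (RMSE, MAE, R, and NSE) can be illustrated with a minimal sketch that uses their standard textbook definitions. The function names and the sample values are hypothetical and are not taken from the cited studies.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def pearson_r(obs, pred):
    """Pearson correlation coefficient (R)."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (dev_o * dev_p)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared model
    error to the variance of the observations about their mean."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den

# Hypothetical daily Epan values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(rmse(observed, predicted), mae(observed, predicted))
print(pearson_r(observed, predicted), nse(observed, predicted))
```

Lower RMSE and MAE and values of R and NSE closer to 1 indicate a better fit; an NSE of 1 corresponds to a perfect match between predictions and observations.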

In principle, wavelet decomposition is an efficient approximation tool [18]; that is, a set of wavelet bases can approximate arbitrary functions. To estimate Epan, researchers have used ANN, WANN, radial function-based support vector machine (SVM-RF), linear function-based support vector machine (SVM-LF), and multilinear regression (MLR) models driven by climatic variables.
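As an illustration of wavelet decomposition, the sketch below performs one level of a discrete wavelet transform, splitting a series into a smooth approximation and a detail component. The text does not state which mother wavelet is used, so the Haar wavelet here is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail), each half the length of the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the original series from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

# Hypothetical series, for illustration only.
series = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_step(series)
reconstructed = haar_inverse(approx, detail)
print(approx, detail, reconstructed)
```

In a wavelet-based model such as a WANN, sub-series of this kind are typically supplied to the network in place of, or alongside, the raw inputs; the exact configuration depends on the study.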

Many studies have estimated Epan from weather variables using data-driven methods. However, estimating Epan from lag-time weather variables, which can be obtained easily, is not common. In this study, different acceptable combinations of input variables were tested, and the same inputs were then used in all of the artificial intelligence models. The main objectives of the proposed study are (1) to model Epan using ANN, WANN, SVM-RF, SVM-LF, and MLR models under different scenarios and (2) to select the best-performing model and scenario for Epan estimation based on statistical metrics. The remainder of the paper is organized as follows. Section 2 describes the study's materials and methods. Section 3 gives the statistical indexes and methodological properties. Section 4 discusses the results and the models' applicability to evaporation prediction. Section 5 presents the conclusions.
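The idea of lag-time inputs mentioned above can be sketched as follows: each Epan value is paired with the values of a weather variable from the preceding days. The function name, the choice of temperature as the predictor, the number of lags, and the sample values are all hypothetical and serve only to illustrate the construction.

```python
def make_lagged_samples(predictor, target, n_lags):
    """Pair target[t] with predictor values at t - n_lags, ..., t - 1.

    Returns a list of (inputs, output) tuples, one per usable time step.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    if len(predictor) != len(target):
        raise ValueError("predictor and target must have the same length")
    samples = []
    for t in range(n_lags, len(target)):
        inputs = predictor[t - n_lags:t]  # the n_lags days before day t
        samples.append((inputs, target[t]))
    return samples

# Hypothetical daily temperature (deg C) and Epan (mm/day) values.
temperature = [20.0, 22.0, 25.0, 24.0, 23.0]
epan = [3.0, 3.5, 4.2, 4.0, 3.8]

for inputs, output in make_lagged_samples(temperature, epan, n_lags=2):
    print(inputs, "->", output)
```

The first `n_lags` days have no complete input window and are therefore dropped, so a series of length 5 with two lags yields three samples.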
