**4. Discussion**

## *4.1. Water Saving Performance Analysis*

The soil water balance of spring wheat in different zones under different irrigation schedules is shown in Table 8. Where that the groundwater depth was less than 2.5 m, the irrigation quota, water consumption, phreatic evaporation, seepage, and soil water utilization under the current irrigation schedule were 335 mm, 440 mm, 65 mm, 127 mm, and 65 mm respectively. Under the full irrigation schedule these figures were 320 mm, 493 mm, 55 mm, 70 mm, and 85 mm respectively. Under the optimized irrigation schedule, they were 240 mm, 474 mm, 62 mm, 68 mm, and 139 mm respectively. It could be seen that under the optimized irrigation schedule, the seepage dropped by 46% and the soil water use increased by 114%. Where the groundwater depth was more than 2.5 m, the irrigation quota, water consumption, phreatic evaporation, seepage, and soil water utilization under the current irrigation schedule were 335 mm, 438 mm, 8 mm, 87 mm, and 102 mm respectively. Under the full irrigation schedule these figures were 360 mm, 492 mm, 7 mm, 84 mm, and 129 mm respectively. Under the optimized irrigation schedule, they were 300 mm, 486 mm, 7 mm, 47 mm, and 147 mm respectively. It could be seen that under the optimized irrigation schedule, the seepage dropped by 46% and the soil water use increased by 44%. The spring wheat soil water balance table shows that the optimized irrigation schedule cut the seepage loss and improved the soil water use on the current irrigation schedule. As far as the groundwater depth is concerned, a shallow depth has a more significant effect on reducing seepage and increasing soil water use. Where the groundwater depth was less than 2.5 m, the current irrigation schedule had a slightly higher irrigation quota than the full irrigation schedule; where the groundwater depth was greater than 2.5 m, the current irrigation schedule had a slightly lower irrigation quota than the full irrigation schedule. This suggests that the regional agricultural water saving potential lies mainly in the optimization of the crop irrigation schedule.


**Table 8.** Soil water balance of spring wheat in different zones under different irrigation schedules.

The water saving performance of the optimized irrigation schedule for spring wheat in each zone is shown in Figure 11. As can be seen, compared with the current irrigation schedule, when the groundwater depth was less than 2.5 m, with the optimized irrigation schedule the irrigation water consumption dropped by 95 mm, the yield increased by 377 kg/hm<sup>2</sup> , the water consumption grew by 35 mm, the transpiration was up by 48 mm, the water productivity fell by 0.04 kg/m<sup>3</sup> , the irrigation water productivity was higher by 0.31 kg/m<sup>3</sup> , while with the full irrigation schedule the irrigation water quota dropped by 15 mm, the yield increased by 747 kg/hm<sup>2</sup> , the water consumption grew by 53 mm, the transpiration was up by 67 mm, and the water productivity and the irrigation water productivity remained unchanged. When the groundwater depth was more than 2.5 m, with the optimized irrigation schedule the irrigation water quota dropped by 35 mm, the yield increased by 581 kg/hm<sup>2</sup> , the water consumption grew by 49 mm, the transpiration was up by 63 mm, the water productivity fell by 0.06 kg/m<sup>3</sup> , the irrigation water productivity grew by 0.46 kg/m<sup>3</sup> , while with the full irrigation schedule the irrigation water quota dropped by 25 mm, the yield increased by 729 kg/hm<sup>2</sup> , the water consumption grew by 54 mm, the transpiration was up by 68 mm, the water productivity fell by 0.04 kg/m<sup>3</sup> , and the irrigation water productivity grew by 0.04 kg/m<sup>3</sup> . It could be seen that where the groundwater depth was less than 2.5 m, with the optimized irrigation schedule the irrigation quota of spring wheat fell by 28%, the yield increased by 5%, the irrigation water productivity grew by 20%, the additional water consumption was all used for crop transpiration, and the water productivity reduction was less than 3%; where the groundwater depth was greater than 2.5 m, with the optimized irrigation schedule the irrigation quota was reduced by 10%, the yield increased by 8%, the irrigation water productivity grew by 20%, the additional water consumption was all used for crop transpiration, and the water productivity reduction was less than 4%. From the soil water balance data, the optimized irrigation schedule's water-saving effect is mainly seen in greater yield and higher irrigation water productivity, lower irrigation quota, less ineffective seepage and soil evaporation, and substantial increase in soil water use, while the amount of additional water consumption was all used for crop transpiration. Compared with the groundwater depth over 2.5 m, when this depth was less than 2.5 m, with the optimized irrigation schedule the irrigation quota dropped by 20% and the soil water use was significantly improved. *Water* **2020**, *12*, x FOR PEER REVIEW 17 of 21 productivity remained unchanged. When the groundwater depth was more than 2.5 m, with the optimized irrigation schedule the irrigation water quota dropped by 35 mm, the yield increased by 581 kg/hm<sup>2</sup> , the water consumption grew by 49 mm, the transpiration was up by 63 mm, the water productivity fell by 0.06 kg/m<sup>3</sup> , the irrigation water productivity grew by 0.46 kg/m<sup>3</sup> , while with the full irrigation schedule the irrigation water quota dropped by 25 mm, the yield increased by 729 kg/hm<sup>2</sup> , the water consumption grew by 54 mm, the transpiration was up by 68 mm, the water productivity fell by 0.04 kg/m<sup>3</sup> , and the irrigation water productivity grew by 0.04 kg/m<sup>3</sup> . It could be seen that where the groundwater depth was less than 2.5 m, with the optimized irrigation schedule the irrigation quota of spring wheat fell by 28%, the yield increased by 5%, the irrigation water productivity grew by 20%, the additional water consumption was all used for crop transpiration, and the water productivity reduction was less than 3%; where the groundwater depth was greater than 2.5 m, with the optimized irrigation schedule the irrigation quota was reduced by 10%, the yield increased by 8%, the irrigation water productivity grew by 20%, the additional water consumption was all used for crop transpiration, and the water productivity reduction was less than 4%. From the soil water balance data, the optimized irrigation schedule's water-saving effect is mainly seen in greater yield and higher irrigation water productivity, lower irrigation quota, less ineffective seepage and soil evaporation, and substantial increase in soil water use, while the amount of additional water consumption was all used for crop transpiration. Compared with the groundwater depth over 2.5 m, when this depth was less than 2.5 m, with the optimized irrigation schedule the irrigation quota dropped by 20% and the soil water use was significantly improved.

**Figure 11.** *Cont*.

0.10

(kg/m3)

*Water* **2020**, *12*, x FOR PEER REVIEW 18 of 21

0.8

1.0

Full irrigation

Optimized irrigation schedule

**Figure 11.** Water saving performance of optimized irrigation schedule for spring wheat in each **Figure 11.** Water saving performance of optimized irrigation schedule for spring wheat in each zone. zone.

#### zone. *4.2. Water Consumption Characteristics 4.2. Water Consumption Characteristics*

Full irrigation

**600**

*4.2. Water Consumption Characteristics* Figure 12 shows the daily water consumption and cumulative water consumption of spring wheat at each growth stage in each zone under the optimized irrigation schedule. It could be seen that the average daily water consumption increased rapidly from 1.87–1.89 mm at the sowing–tillering stage to a maximum of 5.82–5.89 mm at the tillering–heading stage, and then gradually fell to 4.95–5.16 mm at the heading–filling stage and further to 2.66–3.08 mm at the filling–ripening stage. The difference in daily water consumption at different growth stages did not vary much between the two groundwater depths. When the groundwater depth was less than 2.5 m, the cumulative water consumption at the sowing–tillering, tillering–shooting, shooting–heading, heading–filling, and filling–ripening stages was 80 mm, 220 mm, 349 mm, 424 mm, and 474 mm respectively. When the groundwater depth was greater than 2.5 m, the cumulative water consumption at these stages was 81 mm, 221 mm, 350 mm, 428 mm, and 486 mm respectively. It could be seen that, irrespective of the groundwater depths, the water consumption characteristics of the optimized irrigation schedule were basically the same—the average daily water consumption peaked at the tillering–heading stage, with a combined water consumption accounting for 55.4%–56.7% of the total water consumption during the whole growth period. Figure 12 shows the daily water consumption and cumulative water consumption of spring wheat at each growth stage in each zone under the optimized irrigation schedule. It could be seen that the average daily water consumption increased rapidly from 1.87–1.89 mm at the sowing–tillering stage to a maximum of 5.82–5.89 mm at the tillering–heading stage, and then gradually fell to 4.95–5.16 mm at the heading–filling stage and further to 2.66–3.08 mm at the filling–ripening stage. The difference in daily water consumption at different growth stages did not vary much between the two groundwater depths. When the groundwater depth was less than 2.5 m, the cumulative water consumption at the sowing–tillering, tillering–shooting, shooting–heading, heading–filling, and filling–ripening stages was 80 mm, 220 mm, 349 mm, 424 mm, and 474 mm respectively. When the groundwater depth was greater than 2.5 m, the cumulative water consumption at these stages was 81 mm, 221 mm, 350 mm, 428 mm, and 486 mm respectively. It could be seen that, irrespective of the groundwater depths, the water consumption characteristics of the optimized irrigation schedule were basically the same—the average daily water consumption peaked at the tillering–heading stage, with a combined water consumption accounting for 55.4%–56.7% of the total water consumption during the whole growth period. Figure 12 shows the daily water consumption and cumulative water consumption of spring wheat at each growth stage in each zone under the optimized irrigation schedule. It could be seen that the average daily water consumption increased rapidly from 1.87–1.89 mm at the sowing–tillering stage to a maximum of 5.82–5.89 mm at the tillering–heading stage, and then gradually fell to 4.95–5.16 mm at the heading–filling stage and further to 2.66–3.08 mm at the filling–ripening stage. The difference in daily water consumption at different growth stages did not vary much between the two groundwater depths. When the groundwater depth was less than 2.5 m, the cumulative water consumption at the sowing–tillering, tillering–shooting, shooting–heading, heading–filling, and filling–ripening stages was 80 mm, 220 mm, 349 mm, 424 mm, and 474 mm respectively. When the groundwater depth was greater than 2.5 m, the cumulative water consumption at these stages was 81 mm, 221 mm, 350 mm, 428 mm, and 486 mm respectively. It could be seen that, irrespective of the groundwater depths, the water consumption characteristics of the optimized irrigation schedule were basically the same—the average daily water consumption peaked at the tillering–heading stage, with a combined water consumption accounting for 55.4%–56.7% of the total water consumption during the whole growth period.

**7.5**

**Comulative water consumption in less than 2.5m**

**Figure 12.** Spring wheat water consumption characteristics under optimized irrigation schedule.

#### **5. Conclusions**

(1) The Aquacrop model after verification can satisfactorily simulate the dynamic process of the soil moisture content during the growth period of spring wheat and its yield under different irrigation schedules respectively, which can be used to investigate the water-yield response mechanism of spring wheat.

(2) The average groundwater table during the spring wheat growing season makes a critical precondition for the simulation of the zoned irrigation schedule, which gives the regional representativeness and feasibility for the optimized irrigation schedules, and importantly provides a solution to the disruption between the spatial variability and the optimization of the irrigation schedule in shallow groundwater areas.

(3) For an irrigation quota within 360 mm, as the irrigation quota increases, the water consumption, seepage, and yield all increase, while the groundwater utilization presents a decreasing trend. In order to get the greater yields, the choice of irrigation schedule is especially important.

(4) Where the groundwater depth is greater than 2.5 m, two rounds of irrigation are made at both the tillering–shooting stage and the shooting–heading stage. Where the groundwater depth is less than 2.5 m, two rounds of irrigation are made at the tillering–shooting stage and one round of irrigation is made at the shooting–heading stage.

(5) The main water-saving effect of the optimized irrigation schedule is that the spring wheat yield, the soil moisture availability, and the irrigation water productivity increase while, the irrigation amount and the ineffective seepage decrease, from which the additional water consumption can be fully used for crop transpiration, being a kind of effective water consumption.

**Author Contributions:** All authors read and approved the manuscript. Z.P. put up with the main idea of this research. conceptualization, Z.P.; methodology, Z.P., B.Z., Y.L. and Z.W.; software, Z.P.; validation, Z.P., B.Z., and J.C.; formal analysis, Z.P. and H.C.; writing—original draft preparation, Z.P.; writing—review and editing, Z.P., B.Z., Z.W., and H.C.; project administration, Z.P. and Y.L.; funding acquisition, B.Z.

**Funding:** This study was supported by the National Key R&D Program of China (2018YFC0407703), the IWHR Research & Development Support Program (ID0145B082017, ID0145B742017, ID0145B102019 and ID01881910), The Key R&D projects of Ningxia Hui Autonomous Region (2018BBF02022), the Chinese National Natural Science Fund (51822907 and 51979287), the Special Fund of State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, and China Institute of Water Resources and Hydropower Research (SKL2018CG03).

**Acknowledgments:** The authors greatly appreciate the anonymous reviewers and the academic editor for their careful comments and valuable suggestions to improve the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

**Josué Trejo-Alonso <sup>1</sup> , Carlos Fuentes 2,\* , Carlos Chávez 1,\* , Antonio Quevedo <sup>2</sup> , Alfonso Gutierrez-Lopez <sup>3</sup> and Brandon González-Correa <sup>4</sup>**


**Abstract:** In the present work, we construct several artificial neural networks (varying the input data) to calculate the saturated hydraulic conductivity (K<sup>S</sup> ) using a database with 900 measured samples obtained from the Irrigation District 023, in San Juan del Rio, Queretaro, Mexico. All of them were constructed using two hidden layers, a back-propagation algorithm for the learning process, and a logistic function as a nonlinear transfer function. In order to explore different arrays for neurons into hidden layers, we performed the bootstrap technique for each neural network and selected the one with the least Root Mean Square Error (RMSE) value. We also compared these results with pedotransfer functions and another neural networks from the literature. The results show that our artificial neural networks obtained from 0.0459 to 0.0413 in the RMSE measurement, and 0.9725 to 0.9780 for R<sup>2</sup> , which are in good agreement with other works. We also found that reducing the amount of the input data offered us better results.

**Keywords:** modeling water flow; gravity irrigation; infiltration process; artificial intelligence

#### **1. Introduction**

Soil water movement is important in several fields, like irrigation, drainage, hydrology, and agriculture [1]. Among all the measurable quantities in soils, one of the most important is the saturated hydraulic conductivity (KS), defined as the ability to transmit water throughout the saturated zone [2,3], which is highly correlated with the optimization of the flow rate applied to the border or furrow in the gravity irrigation [2–6]. Although this property is measured easily in a laboratory, or in the field, it needs to be applied at a small scale, and most of the time, it is required to be used on a large scale [5,6]. This is inconvenient due to the fact that all these tests and measurements are time-consuming, impractical, and not cost-effective [7,8].

In order to solve the inconveniences mentioned above, a great number of studies about pedotransfer functions (PTFs) were published [7–10]. These mathematical models allow us to estimate the K<sup>S</sup> from some soil characteristics, such as texture, field capacity, the permanent wilting point, bulk density, porosity, and organic matter, among others [7–10]. The robustness of the model is linked to the number of physical parameters used to calculate the saturated hydraulic conductivity; the more parameters, the more accurate the prediction. However, as it was mentioned before, depending on the measurements, the PTFs are difficult to get, due to economic resources and the time it takes to measure all

**Citation:** Trejo-Alonso, J.; Fuentes, C.; Chávez, C.; Quevedo, A.; Gutierrez-Lopez, A.; González-Correa, B. Saturated Hydraulic Conductivity Estimation Using Artificial Neural Networks. *Water* **2021**, *13*, 705. https://doi.org/ 10.3390/w13050705

Academic Editors: George Kargas, Petros Kerkides and Paraskevi Londra

Received: 10 February 2021 Accepted: 27 February 2021 Published: 5 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

the variables, which presents as a limitation for this kind of function and the predictive capacity. Besides, some works have been questioned because the soil in which they want to apply are different from the soil used for their development, such as [11].

In recent years, another alternative has been explored—Artificial Neural Networks (ANNs), which have become a common tool used as a special class of PTFs, for example, [12,13]. ANNs are an artificial intelligence that simulate the behavior of the human brain, and their structures consist of a number of interconnected elements called neurons which are logically arranged in layers, known as input, output, and hidden (see, e.g., [12] and references therein). Each neuron connects to all the neurons in the next layer via weighted connections. In Figure 1, we show a schematic structure for an ANN.

**Figure 1.** Schematic representation for an ANN structure. Each circle represents a neuron, where the *I<sup>j</sup>* means the neurons in the input layer, and the *O<sup>j</sup>* are the neurons in the output layer. All *H<sup>j</sup>* circles are the neurons in the hidden layers. The arrows represent the weighted connections.

Until now, there has been no analytical way to obtain the ideal network structure (number of hidden layers and neurons inside them) as a function of the complexity of the problem. The structure must be selected by performing a trial-and-error process. ANNs with one or two hidden layers and an adequate number of hidden neurons are found to be sufficient for most problems (e.g., [12,14]). There are also several works studying the ideal number of neurons in the hidden layers [15,16], but these methods present general guidelines for the selection in the number of neurons only.

The ANNs' name comes not only from of their structure, but because they "learn". The most common algorithm used in ANNs for the learning process is back-propagation (e.g., [17,18]). Each neuron belonging to a layer receives weighted inputs from the neurons in the previous layer and processes them to transmit its output to the neurons in the next layer, and this is done through links. Each link is assigned a weight that is no more than a numerical estimate of the strength of the connection. This weighted sum of inputs in a neuron are converted into the numerical estimate that we see, according to the nonlinear transfer function (the most commonly used is the sigmoid function). The ANNs then modify the weights of neurons in response to errors between the actual output values and the target output values, using what is known as gradient descent [19,20]. This is then applied on the sum of the squares of the errors for all the training patterns, until the mean error of the sum squared of all the training patterns is minimal or within the tolerance specified for the problem.

In this work, we use ANNs to obtain the K<sup>S</sup> in Irrigation District 023 placed in Queretaro Mexico using a sample of 900 plots. The sample, ANN configurations, and validation tests are described in Section 2. Finally, in Section 3, we show the results, and a comparison between the several configurations obtained in this work and comparisons of our results with PTFs and other ANNs in the literature.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The Irrigation District 023 is located in Queretaro, Mexico, at 20◦180 to 20◦340 N, 99◦560 to 100◦120 W with an altitude of 1892 M.A.S.L, and it has an area of 11 048 ha. It includes the municipalities of San Juan del Rio and Pedro Escobedo (Figure 2). Its predominant climate is semiarid with summer rains, with an annual precipitation avarage of 599 mm and annual average temperature of 20 ◦C [21]. The water is conducted through open channels. The main channels are lined with concrete, but all the lateral channels that carry water to the plots are unlined. The separation of the plots in some cases are by trees, unlined channels, drains, or roads [22].

**Figure 2.** Location map of the sampling points in Irrigation District 023 San Juan del Río, Querétaro.

#### *2.2. Soil Database*

An extensive and detailed database description can be found at [23]. In summary, the database used in this study was developed from samplings in 900 plots at Irrigation District 023, San Juan del Rio Queretaro. These samples were sent to the laboratory to obtain the following parameters: soil texture by the Bouyucos hydrometer, bulk density (*ρa*) by the cylinder method of known volume, moisture content at saturation (*θS*), field capacity (FC) and permanent wilting point (PWP) by the method of the pressure membrane pot, and the saturated hydraulic conductivity by the variable head permeameter method. From these measurements, we took seven and considered them through the paper as the input data: percentage of clay, sand and silt, bulk density, permanent wilting point, moisture content at saturation, and field capacity (Table 1).

#### *2.3. The ANNs' Setup*

As mentioned in Section 1, there is no analytical way to define the ANN structure, but based on similar several works (e.g., [12,14,16]) we noticed that the ANNs have one or two hidden layers only. With this in mind, we tested several structures for our ANNs with two hidden layers and an additional one with three hidden layers as a test. Details about the ANNs' settings are described below.


**Table 1.** Statistical properties of data measured in the laboratory.

We began with all the input data (seven) and tested several configurations as follows: we start with two neurons in the first layer and two neurons in the second. Then we continue to vary the number of neurons in the first layer from two to ten, and leave the second layer constant. Next, the number of neurons in the second layer is increased to three and we vary the number of neurons in the first, again, from two to ten, and so on until the number of the second layer varies from two to ten, just like the first one. This input layer has the seven input data mentioned before. Finally, the output layer contains the K<sup>S</sup> predicted value.

Then, we changed the number of data in the input layer (decreasing it by one) and repeated the process as explained before. The choice of which parameter has to be removed is based on the importance plot (details in Section 2.4), except for the last configuration, where we remove *θ<sup>S</sup>* instead of the percentage of silt, because *θ<sup>S</sup>* is closely related with the clay percentage. We kept removing input data until we had three measurements only. For each configuration, we varied the number of neurons in the hidden layers from two to ten.

The ANN was programmed using the neuralnet package [24] and the caret package [25], both of them provided by the R software [26].

The neuralnet package trains neural networks using backpropagation, resilient backpropagation (RPROP) with [27] or without weight backtracking [28], or the modified globally convergent version (GRPROP) by [29]. The function allows flexible settings through custom-choice of error and activation function. The caret package (short for Classification And REgression Training) is a set of functions that attempts to streamline the process for creating predictive models. There are many different modeling functions in R. Some have different syntax for model training and/or prediction. The package started off as a way to provide a uniform interface for the functions themselves, as well as a way to standardize common tasks (such as parameter tuning and variable importance). Specifically, we used the train function of this package to perform the cross-validation process.

Furthermore, the calculation of generalized weights [30] is implemented. In this work, we use RPROP as an algorithm type to calculate the neural network, the sum of squared errors as a differentiable function for the calculation of the error and a logistic differentiable function for smoothing the result of the cross-product of the covariate or neurons and the weights.

Using the RMSE measurement, we selected the optimal ANN structure in each case, and finally, we kept this last ANN structure as ideal. The Mean Absolute Error (MAE) measurement was calculated as:

$$\text{MAE} = \frac{1}{n} \sum\_{i=1}^{n} |y\_i - x\_i| \tag{1}$$

where *y<sup>i</sup>* are the measurement values, *x<sup>i</sup>* are the predicted values, and *n* is the total measurement. We also calculate the RMSE as:

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} (y\_i - \mathbf{x}\_i)^2}. \tag{2}$$

#### *2.4. Cross-Validation*

In order to validate our ANN results, we made a cross-validation analysis using the train function from the caret package. This function sets up a grid of tuning parameters for a number of classification and regression routines, fits each model, and calculates a resampling-based performance measure. In this case, we use the parameters optimized to neuralnet, and we generated 25 bootstrap replications for each ANN configuration. Finally, for the prediction of new samples, we used the predict function.

The train function returned several results: the ideal ANN configuration, the best RMSE, MAE and *R* <sup>2</sup> values for each tested configuration, a RMSE matrix, and an importance plot. This last plot indicates which parameter contributed the most to the K<sup>S</sup> final approximation. Based on this last plot, we decided which input data would be removed on the next run, with the exception already mentioned above.

#### **3. Results and Discussion**

As mentioned in Section 2, we generated several ANNs which contained all the combinations from two to ten neurons in each hidden layer. In Figure 3 we show the results for these tests. In the *x* axis we have the number of neurons in the first hidden layer, the *y* axis represents the RMSE value obtained from the 25 bootstrap replications, and each color represents the number of neurons in the second hidden layer. The election of the best configuration is based on the smallest RMSE value in these plots. Additionally, from the top to the bottom and from the left to the right, we show the variation in the input data. Another result is shown in Figure 4. This plot is a 10 × 10 matrix where the color represents the RMSE value for each configuration, the rows are the number of neurons in the second hidden layer, and the columns are the number of neurons in first hidden layer. Remember that each plot differs from the other in the number of input data in the same way that Figure 3. In general, we can see that a small number of neurons in the first layer presents higher values of RMSE. Another important result is presented in the Figure 5 where we show a density plot representation (varying the input data from top to bottom) for RMSE, MAE, and R<sup>2</sup> measurements derived from the bootstrap analysis. From these plots we have small variations for each measurement which indicate that the results are not highly dependent on the ANN configurations. Finally, in Figure 6, we show the importance plots, which are described in the cross-validation section presented previously. These plots help us to decide which parameter we must keep or eliminate in each run when we have different numbers of input data.

In order to explore another possibility and to get more confident results, we made a test increasing the number of hidden layers to three. We applied the same process explained previously for this new configuration, and we got an improvement of ∼9% in the RMSE values, but with 15 times more computation time. This was the reason for dismissal of these configurations.

In Table 2 we present the three best configurations of each run, where we have the number of input data, the ANN structure, and the RMSE, MAE, and R<sup>2</sup> measurements.

Following the results, we present Figure 7, where we compare the K<sup>S</sup> measurements with the K<sup>S</sup> predicted by the ANNs for all tested configurations. The dotted line is a 1:1 relation and the R<sup>2</sup> is obtained from the train function. The plotted values were obtained applying the best ANN configuration for each run (Figure 7). The histograms in the topleft corner show the residuals (∆KS), which is the difference between the predicted and measured values.

Recall that to obtain the ideal neuron configuration, we apply the train package, which allows us to obtain the RMSE, MAE, and R<sup>2</sup> data for each arrangement of neurons in the hidden layers. The only variation was the input data (ranging from 7 to 3). Therefore, it was possible to obtain five different boxplots for the RMSE, whose only difference would be the aforementioned input data. This is shown in Figure 8, where a trend is observed for the RMSE to fall, while the number of input data decreases.

**Figure 3.** ANN test for choosing the ideal number of hidden neurons varying the input data. From top to bottom and left to right the number of input data goes from 7 to 3. The *x* axis represents the number of neurons in the first hidden layer and the *y* axis is the RMSE value. Each color is a different number of neurons in the second hidden layer.

**Figure 4.** RMSE matrix for different number of neurons in the hidden layers. From top to bottom and left to right, the number of input data goes from 7 to 3. The *x* axis represents the number of neurons in the first hidden layer and the *y* axis is the number of neurons in the second hidden layer.

**Figure 5.** Density plots for (from left to right) RMSE, R<sup>2</sup> , and MAE resulting from the 25 bootstrap replications. From top to bottom, the number of input data goes from 7 to 3.

**Figure 6.** Importance plots referring to the weight of each variable in the calculations. From top to bottom and from right to left, the number of input data goes from 7 to 3.


**Table 2.** The top three configurations for ANN structure and their statistical measurements. (1) # Input data (2) contains the ANN neurons' structure (Input-Hidden1-Hidden2-Ouput), where each number represents the quantity of neurons used in each layer, (3) the RMSE measurements, (4) the MAE measurements, and (5) the R<sup>2</sup> measurements.

In Table 3 we show the results obtained from the literature with PTFs or ANNs and compare them with this work.


**Table 3.** Comparison between several works for obtaining K<sup>S</sup> .

The results for RMSE obtained in this work are better in, at least, 35% compared with the ones presented by [31], and we reported the fifth-best value for R<sup>2</sup> .

**Figure 7.** Plots for the comparison between K<sup>S</sup> measurement in the field with the K<sup>S</sup> value obtained applying the different ANN configurations. The dotted line is a 1:1 desirable relation. The histograms represent the residual distribution (∆K<sup>S</sup> ).

**Figure 8.** Boxplot for each run of the train package. The names means two hidden layers and the number of input data used in each run.

Another advantage of ANN is that the initial shape for the function to get the relation between the variables or a principal component analysis is not needed. Additionally, as we can see in Figure 3 and Table 2, the results are independent of the ANN structure. This is supported by the fact that the RMSE values for different configurations are very similar (the largest difference is <sup>∼</sup>10%), and for R<sup>2</sup> values the difference is <sup>∼</sup>5%, and we got ∼11% for MAE. Besides, the Figure 5 presents an almost gaussian distribution for these three statistical measurements, which is in agreement with a non-biased result. In this Figure, we also see that the density distributions are narrower, while the number of input data is smaller. This tells us that in contrast with the PTF models, we found a tendency which indicates that the less input data we have, the more accurate our prediction of K<sup>S</sup> is, as well as the Figure 8 shown this tendency too. This result can be explained by the following reasons. Based on the Principal Component Analysis of [23], we noted that the principal variables contributing the most to the sample were KS, the percentage of clay, *θS*, PWP, and FC, which is supported by the importance plots (Figure 6). Additionally, for the 900 analyzed samples contained in 10 of the 12 existing types of soil (according to the USDA Textural Soil Classification), they showed that the infiltration rate depended directly on the percentage of clay and the *ρa*.

#### **4. Conclusions**

In this work, we developed five Artificial Neural Networks in order to calculate the saturated hydraulic conductivity based on the sample used by [23]. All networks consist of one input layer, two hidden layers, and one output layer. We tested a network with three hidden layers, but with little better results. We took 75% of the sample for training and 25% for validation. We also tested all the possible combinations for the number of neurons in each hidden layer, taking into account that the number of neurons for each hidden layer will vary from 2 to 10 neurons. Finally, we selected the best number of neurons in each layer based on RMSE measurements obtained from a cross-validation analysis.

The results show that, compared with other works, we get better or similar results for RMSE and R<sup>2</sup> measurements and similar configurations for our ANN. Finally, we can say that if the necessary resources are available to obtain a large number of data in the field, it is necessary to develop a study of PTF as well as ANN to compare the results of each process and be able to choose the best option between both of them. The latter will not only be based on the RMSE or R<sup>2</sup> measurements, but also on the desired application (a statistical ground property study or prediction for irrigation proposes). A more detailed

study to define an exact range of the amount of data needed from a reliable artificial neural network study should be carried out, but the latter is beyond the objective of this work. Besides, we have to be more careful in the characteristics of the sample where the models come out. In our case, we analyzed 10 of the 12 types of soils where the bulk density and the percentage of clay became more important parameters compared to others. This made our models more reliable for almost any type of soil.

The coupling of the Saint Venant and Richards equations is the complete mathematical model for modeling gravity irrigation [41]. However, its use requires detailed information on the physical properties of the soil, as well as a series of field and laboratory experiments that can be expensive [42,43]. In this way, the results used in this article, combined with some rapid field and laboratory tests, can be an excellent alternative to reduce costs and the time used to obtain that information.

Finally, the application of artificial neural networks have been demonstrated to successfully solve classification and prediction problems, and this is probably for the nonlinear relation between the variables. The calculation of the saturated hydraulic conductivity in this work proves that we need only three variables to predict new values, but the soil properties are crucial for the correct application of these models in contrast with the ANN configuration, which has been proved to play a minor role in the final results.

**Author Contributions:** Conceptualization, C.C. and C.F.; methodology, C.C., J.T.-A. and A.Q.; software, J.T.-A. and B.G.-C.; validation, B.G.-C., J.T.-A. and A.G.-L.; formal analysis, C.C. and J.T.-A.; investigation, A.G.-L. and J.T.-A.; resources, C.C., C.F. and A.Q.; data curation, C.C. and A.Q.; writing—original draft preparation, J.T.-A.; writing—review and editing, C.C., C.F., A.Q. and A.G.-L.; visualization, J.T.-A. and B.G.-C.; supervision, C.C., C.F. and A.Q.; project administration, C.C.; funding acquisition, C.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported as part of a collaboration between the National Water Commission (CONAGUA, according to its Spanish acronym); The Water Users Association of Second Unit of Module Two, Irrigation District No. 023; the Irrigation District 023, San Juan del Río, Querétaro; and the Autonomous University of Querétaro, under the program RIGRAT 2015–2019.

**Conflicts of Interest:** The authors declare no conflict of interest regarding the publication of this paper.

#### **References**


*Software Society, iEMSs 2004*; International Environmental Modelling and Software Society: Manno, Switzerland, 2004; Volume 2, pp. 990–995.


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Water* Editorial Office E-mail: water@mdpi.com www.mdpi.com/journal/water

MDPI St. Alban-Anlage 66 4052 Basel Switzerland

Tel: +41 61 683 77 34

www.mdpi.com

ISBN 978-3-0365-6195-0