**1. Introduction**

Water and food scarcities are becoming serious issues worldwide due to population growth and climate change. According to United Nations statistics [1], approximately 70% of all available fresh water is consumed for agricultural activities, with the main consumer being irrigation. Currently, the average water-use efficiency in irrigation worldwide is about 50% as reported in Fischer et al. [2]. Therefore, it is of vital importance to improve irrigation water-use efficiency, in order to address the water crisis. Currently, it is still a common practice to use open-loop irrigation, which leads to excessive consumption of water resources. Closed-loop irrigation is a promising alternative to reduce water consumption and to better maintain the health of crops [3]. In the development of such a closed-loop irrigation system, it is important to have the soil moisture information of the entire field, which is in general very difficult to obtain. One way to overcome this challenge is to estimate the field's soil moisture based on limited sensor measurements. However, this depends on the accuracy of the agro-hydrological model. In this work, we aim to develop a systematic parameter and state estimation scheme that can provide accurate estimates of soil moisture.

Specifically, in this work, we consider simultaneous state and parameter estimation based on agro-hydrological systems modeled using the Richards equation, which describes soil water dynamics. Richards equation is a partial differential equation (PDE) which falls in the family of porous medium equation (PME). The estimation and control problems of this kind of equation were widely studied in chemical engineering [4–7] and meteorology [8,9]. The Richards equation is essentially composed of the continuity equation and Darcy's law, which is incorporated with two algebraic equations of hydraulic conductivity and capillary capacity (derivative of soil-water retention curve) [10]. The parameters of Richards equation are related to soil properties. Different approaches have been developed to

estimate soil properties. Soil properties may be estimated in a soil lab by directly fitting the soil-water retention curve and hydraulic conductivity curve using collected field data of soil moisture, hydraulic conductivity and corresponding capillary pressure head [10]. However, soil properties may change over time and it would be expensive to take frequent soil samples for lab analysis especially when a big field is considered. Moreover, the hydraulic conductivity is difficult to measure. As an alternative to direct lab analysis soil parameters can be estimated indirectly based on the Richards equation and some easily-accessible field measurements such as soil moisture or capillary pressure head by minimizing the difference between measured values and model predicted values. This type of indirect approaches are referred to as inverse estimation [11]. Inverse estimation has been widely applied and its applications can be generally classified into two groups: methods based on measurements observed from one-step or multi-step outflow experiments [12–15] or methods based on time-series in-situ measurements [16–20]. These inverse estimation methods can only estimate soil parameters but not soil moisture or capillary pressure head. Moreover, they are mainly applicable to pre-collected datasets and cannot be used for online parameter estimation.

Sequential data assimilation is another widely used approach in estimating soil parameters online, which only requires the current measurement and prior knowledge of the system. In general, it consists of two steps, which are prediction and update steps. In the first step, a dynamical system model is initialized to describe a real process. Due to the limited knowledge about the process, the model may not predict the process accurately. Then, in the update step, an algorithm is designed to determine how to correct the prediction, based on field measurements and the dynamical model. Moreover, sequential data assimilation has the ability to deal with uncertainties in the measurements and the model. Particle filters (PF) [21], extended Kalman filters (EKF) [22] and ensemble Kalman filters (EnKF) [23–27] are common and widely used algorithms in sequential data assimilation for soil parameter estimation. Li and Ren [23] studied parameter estimation by augmenting parameters as states and used EnKF as the estimation algorithm. They also studied the possible factors that affect the performance of EnKF. In Reference [24], dual ensemble Kalman filter (DEnKF) was used to first estimate the states using a standard KF and then to estimate the parameters using an unscented Kalman filter. In Reference [25], two EnKFs were used to estimate the states and parameters, separately, which neglected the complex nonlinear interaction between states and parameters. In Reference [26], the authors compared three ensemble-based simultaneous state and parameter estimation methods, augmented ensemble Kalman filter, DEnKF and simultaneous optimization and data assimilation (SODA) to improve the soil moisture estimation accuracy. It concluded that the augmented EnKF was the most robust method for general conditions and SODA was better at handling complex conditions. However, it was pointed out that SODA required the highest computational resources.

However, one limitation of the above discussed methods is that they cannot handle constraints on the states or parameters and the estimation performance deteriorates when the noise is not Gaussian or the initial guess is not good. Constraints on the states and parameters are important information and may be used to significantly improve estimation performance as will be demonstrated in the simulations of this work. To address the above discussed problem, we can consider the optimization based moving horizon estimation (MHE) method, which is widely used in state estimation of nonlinear systems with explicit constraints taken into account [28–30].

In this work, we first introduce the investigated system and the formulation of the mathematical model in Section 2. The formulations of the estimation methods, MHE, EKF and EnKF for the augmented system are introduced in Section 3. Section 4 includes the methods of identifiability and sensitivity, used to study the significance of parameters. Section 5 shows the synthetic experimental setup, determination of significant parameters and MHE estimation results compared with EKF and EnKF, followed by concluding remarks in Section 6. Some preliminary results of this work were reported in Reference [31]. Compared with Reference [31], this paper provides significantly expanded explanations and significantly extended simulation results.

#### **2. System Description and Problem Formulation**

An agro-hydrological system describes the water movements between soil, crop and atmosphere. Figure 1 shows a schematic of an agro-hydrological system, which is a modified version from Reference [32]. The water movements usually involve water transportation within the soil, root water extraction, transpiration and evaporation from the soil and leaves to the air and precipitation including rain and irrigation.

**Figure 1.** A schematic diagram of an agro-hydrological system.

In this work, we focus on soil that is above the water table (i.e., soil in the vadose zone). Within the vadose zone, the water movement is mainly driven by capillary and gravitational forces and the water dynamics can be modeled using Richards equation under the assumptions: (1) soil properties are spatially homogeneous within the system; (2) irrigation is uniformly applied on the surface of the system; and (3) the horizontal water dynamics are much smaller than the vertical dynamics due to the gravity force and the horizontal water dynamics can be neglected. Then, the 1D Richards equation modeling the vertical water dynamics is shown below [33]:

$$
\mathcal{L}\left(h\right)\frac{\partial h}{\partial t} = \frac{\partial}{\partial z}\left[K\left(h\right)\left(\frac{\partial h}{\partial z} + 1\right)\right].\tag{1}
$$

In Equation (1), *h* (m) is the capillary potential in the unsaturated soil, *K*(*h*) (m/s) and *c*(*h*) (1/m) denote hydraulic conductivity and capillary capacity of the soil, respectively. Note that in Richards equation, the value 1 on the right-hand-side denotes the impact of gravitational force on water in the vertical (*z*) direction. The upward *z*-direction is defined as the positive direction. The van Genuchten-Mualem soil hydraulic model *K*(*h*) and *<sup>c</sup>*(*h*), as functions of the capillary potential *h*, are shown as follows [10]:

$$K(h) = K\_{\theta} \left[ \left( 1 + \left( -ah \right)^{n} \right)^{-\left(1 - \frac{1}{n} \right)} \right]^{\frac{1}{2}} \left[ 1 - \left[ 1 - \left[ \left( 1 + \left( -ah \right)^{n} \right)^{-\left(1 - \frac{1}{n} \right)} \right]^{\frac{n}{n-1}} \right]^{1 - \frac{1}{n}} \right]^{2} \tag{2}$$

$$
\mathcal{L}\left(h\right) = \left(\theta\_s - \theta\_r\right) \operatorname{an}\left(1 - \frac{1}{n}\right) \left(-ah\right)^{n-1} \left[1 + \left(-ah\right)^n\right]^{-\left(2 - \frac{1}{n}\right)},\tag{3}
$$

where *Ks* (m/s), *θs* (*m*3/*m*<sup>3</sup>) and *θr* (*m*3/*m*<sup>3</sup>) are saturated hydraulic conductivity, saturated soil moisture and residual soil moisture, respectively. The van Genuchten-Mualem parameters *α* (1/m) and *n* characterize the properties of the soil, which are proportional to the inverse of the soil air entry pressure and of soil porosity, respectively. These two closed-form expressions are derived by van Genuchten based on his expression of soil-water retention curve and Mualem's open-form expression of hydraulic conductivity. Since Mualem's expression is not studied further, in this work, only van Genuchten's soil-water retention equation is shown below [10]:

$$\theta\left(h\right) = \left(\theta\_s - \theta\_r\right) \left[\frac{1}{1 + \left(-ah\right)^n}\right]^{1 - \frac{1}{n}} + \theta\_{r\prime} \tag{4}$$

where *θ* (m3/m3) denotes volumetric water content in soil.

The five parameters *θ<sup>s</sup>*, *θ<sup>r</sup>*, *α*, *n* and *Ks* determine the properties of a type of soil. With sufficient soil samples, *θ<sup>s</sup>*, *θ<sup>r</sup>*, *α* and *n* can be estimated by fitting the soil-water retention curve Equation (4) utilizing soil moisture and capillary potential data sets. Then *Ks* can be estimated by fitting hydraulic conductivity and capillary potential data sets into Equation (2). By using this approach, we can only ge<sup>t</sup> a snapshot of the soil properties at one time instant, however, soil properties do slowly change over time due to agricultural activities [34]. While the experiments can be repeated to ge<sup>t</sup> parameter estimates at different times, it is very time consuming and expensive, especially when the investigated field is large and has various soil types over the field. Therefore, online state and parameter estimation based on ease-to-access field measurements provides a favorable approach to estimate soil properties.

In this work, we study the estimation of soil properties based on real-time field measurements: capillary potential *h*.

#### *Finite Difference Model Development*

Richards equation is a nonlinear partial differential equation (PDE) with respect to both the temporal and spatial variables. Because of its complex structure, it is difficult to have a closed-form solution. Therefore a finite difference method is implemented to find a numerical approximation of its solution. Two-point forward difference scheme and two-point central difference scheme are used to approximate the derivatives with respect to the temporal and spatial variables, respectively:

$$\frac{\partial h\_i\left(k\right)}{\partial t} = \frac{h\_i\left(k+1\right) - h\_i\left(k\right)}{\Delta t} \tag{5}$$

$$\frac{\partial}{\partial z}\left[K\_{l}\left(h\left(k\right)\right)\left(\frac{\partial h\_{l}\left(k\right)}{\partial z}+1\right)\right] = \frac{K\_{l-\frac{1}{2}}\left(h\left(k\right)\right)\left(\frac{h\_{l-1}\left(k\right)-h\_{l}\left(k\right)}{\frac{1}{2}\left(\Delta z\_{l-1}+\Delta z\_{l}\right)}+1\right) - K\_{l+\frac{1}{2}}\left(h\left(k\right)\right)\left(\frac{h\left(k\right)-h\_{l+1}\left(k\right)}{\frac{1}{2}\left(\Delta z\_{l}+\Delta z\_{l+1}\right)}+1\right)}{\Delta z\_{l}}\right),\tag{6}$$

where *k* ∈ [0, *Nt*] ⊂ Z and *i* ∈ [1, *Nx*] ⊂ Z, representing time and position indices, respectively. *Nt* and *Nx* are the total number of time instants and states investigated. Δ*t* = *t*(*k* + 1) − *t*(*k*) and Δ*zi* represents compartment thickness of compartment *i*. The state *i* is at the center of the compartment *i*. The hydraulic conductivity, for example, *Ki*− 1 2, is linearized explicitly as *Ki*− 1 2(*h*) = *K*( *hi*−1+*hi* 2 ).

The discrete-time finite difference model at node *i* and time instant *k* + 1 can be obtained by substituting Equations (5) and (6) into Equation (1) as follows:

$$h\_i(k+1) = h\_i(k) + \Delta t \frac{K\_{i-\frac{1}{2}}(h(k))\left(\frac{h\_{i-1}(k) - h\_i(k)}{\frac{1}{2}(\Delta z\_{i-1} + \Delta z\_i)} + 1\right) - K\_{i+\frac{1}{2}}(h(k))\left(\frac{h\_i(k) - h\_{i+1}(k)}{\frac{1}{2}(\Delta z\_i + \Delta z\_{i+1})} + 1\right)}{\Delta z\_i c\_i(h(k))},\tag{7}$$

where *ci*(*h*(*k*)) is defined as *<sup>c</sup>*(*hi*(*k*)).

The Neumann boundary condition is utilized to characterize the top and bottom boundaries of the system and are shown below, respectively:

$$\left. \frac{\partial h\left(k\right)}{\partial z} \right|\_{T} = -1 - \frac{q\_T(k)}{K\left(h\left(k\right)\right)}\tag{8}$$

$$\left. \frac{\partial \left( h\left(k\right) + z\right)}{\partial z} \right|\_{B} = 1,\tag{9}$$

where the subscripts *T* and *B* represent the top and bottom boundary conditions, respectively. The *qT* (m/s) is the irrigation rate which is considered as the input of the system and free drainage boundary condition is applied at the bottom.

Before introducing estimation methods, for the sake of simplicity, we obtain the compact form of the model by combining *Nx* Equation (7) for all spatial nodes and the boundary conditions, Equations (8) and (9). It is shown below:

$$\mathbf{x}\left(k+1\right) = F\left(\mathbf{x}\left(k\right), \boldsymbol{\mu}\left(k\right), p\left(k\right)\right) + \omega\_{\mathbf{x}}\left(k\right) \tag{10}$$

where *x*(*k*) ∈ X ⊂ R*Nx* represents the state vector containing *Nx* capillary pressure values for corresponding spatial nodes, at the defined time instant *k*. *p*(*k*) ∈ P ⊂ R*Np* , represents the parameter vector containing the parameters to be estimated. *u*(*k*) ∈ U ⊂ R*Nu* , *<sup>ω</sup>x*(*k*) ∈ W*x* ⊂ R*N<sup>ω</sup>x* denote the input and the model disturbances, respectively.

The general output function, with the measurement noise taken into account, is shown below:

$$y\left(k\right) = G\left(\mathbf{x}\left(k\right), \boldsymbol{\nu}\left(k\right)\right) + \boldsymbol{\nu}\left(k\right),\tag{11}$$

where *y*(*k*) ∈ Y ⊂ R*Ny* and *ν*(*k*) ∈ V ⊂ R*Nν* denote the measurement vector and measurement noise. If the volumetric soil moisture *θ* is measured by the soil moisture sensor, Equation (11) is the general form of Equation (4). On the other hand, if tensiometers are used to measure the water potential *h* in the soil, Equation (11) simply represents a matrix indicating which states are measured by the tensiometers.

Furthermore, in order to estimate the states and parameters simultaneously, the parameter vector is augmented at the end of the state vector and treated as a part of the augmented state vector, *X* = [*x*, *<sup>p</sup>*]*<sup>T</sup>*. An estimation of the augmented state vector *X* brings the benefit to estimate the states and parameters at the same time. The augmented model can be constructed by augmenting Equation (10) with the following equation:

$$p\left(k+1\right) = p\left(k\right) + \omega\_p\left(k\right),\tag{12}$$

where *<sup>ω</sup>p*(*k*) ∈ <sup>W</sup>*p* ⊂ R*N<sup>ω</sup>p* . When the parameter vector *p* is assumed to be constant during the study, *<sup>ω</sup>p* is equal to 0.

At last, the augmented model and output function used for simultaneous parameter and state estimation are shown below:

$$\begin{aligned} X\left(k+1\right) &= F\_{\mathfrak{a}}\left(X\left(k\right), \mathfrak{u}\left(k\right)\right) + \omega\_{\mathfrak{a}}\left(k\right) \\ \mathfrak{y}\left(k\right) &= G\_{\mathfrak{a}}\left(X\left(k\right)\right) + \nu\left(k\right) \end{aligned} \tag{13}$$

where *X*(*k*) ∈ X*a* ⊂ R*Nx*+*Np* , *<sup>ω</sup>a*(*k*) ∈ W*a* ⊂ R*Nw*+*Np* and the subscript *a* of *<sup>F</sup>*(·) and *<sup>G</sup>*(·) denotes the augmentation.

#### **3. Estimation Methods**

In this work, three common estimation schemes, MHE, EKF and EnKF are applied to the augmented model to estimate the states and parameters. The design of these methods are detailed next.

#### *3.1. Moving Horizon Estimation*

MHE is an online optimization based estimation method [28]. The MHE optimization problem used in this work is formulated as follows:

$$\min\_{\mathcal{R}(k-N),\cdots,\mathcal{X}(k),\mathcal{Q}\_{k}(k-N),\cdots,\mathcal{Q}\_{k}(k-1)} \left\| \hat{\mathcal{X}}(k-N) - \mathcal{X}(k-N) \right\|\_{P^{-1}}^2 + \sum\_{j=k-N}^{k-1} \left\| \hat{\omega}\_{k}(j) \right\|\_{Q^{-1}}^2 + \sum\_{j=k-N}^{k} \left\| \mathbb{V}(j) \right\|\_{R^{-1}}^2 \tag{14}$$

$$\text{s.t.} \,\hat{X}(j+1) = F\_{\mathfrak{a}}(\hat{X}(j), \mathfrak{u}(j)) + \hat{\mathfrak{a}}\_{\mathfrak{a}}(j), \; j \in [k - N, k - 1] \subset \mathbb{Z} \tag{15}$$

$$\mathcal{V}(j) = y(j) - G\_{\mathfrak{a}}(\hat{\mathcal{X}}(j)), \quad j \in [k - N, k] \subset \mathbb{Z} \tag{16}$$

$$
\bar{X}(k-N) = \hat{X}(k-N|k-N) \tag{17}
$$

$$\mathcal{X}(j) \in \mathbb{X}\_a, \forall (j) \in \mathbb{V}, \ j \in [k - N, k] \subset \mathbb{Z} \tag{18}$$

$$
\omega\_a(j) \in \mathbb{W}\_{a\prime} \quad j \in [k-N, k-1] \subset \mathbb{Z} \tag{19}
$$

In the MHE optimization, the objective is to minimize the distance between the predicted and observed measurements which is measured by the term *ν*<sup>ˆ</sup><sup>2</sup>*R*−<sup>1</sup> as shown in Equation (14), where the term *ν*ˆ is defined in Equation (16). The caret signˆindicates that the variable is estimated. The model uncertainty or the process disturbance is taken into account and represented by *ω*<sup>ˆ</sup> *<sup>a</sup>*<sup>2</sup>*Q*−<sup>1</sup> , where the term *ω*<sup>ˆ</sup> *a* is defined in Equation (15). The arrival cost, *X*ˆ − *X*¯ <sup>2</sup>*P*−<sup>1</sup> summarizes the information from the initial state of the model up to the beginning of the estimation window of the MHE. *N* denotes the length of the estimation window. After each optimization, only the last estimated state within the estimation window is used. *X* ˆ and *ω*<sup>ˆ</sup> *a* within the moving window are the decision variables of the optimization problem. The term *X* ¯ follows the definition of Equation (17). *X* <sup>ˆ</sup>(*k* − *N*|*k* − *N*) represents the estimated state *X* ˆ at time instant *k* − *N*, which is estimated at time instant *k* − *N*. Matrices *P*, *Q*, *R* are positive definite matrices and they are the covariance matrices of state uncertainty, process noise *ωa* and measurement noise *ν*, respectively. In addition, MHE takes into account constraints on the states, parameters and model uncertainties as expressed in Equations (18) and (19).

#### *3.2. Extended Kalman Filter*

EKF is a common method used for state estimation of nonlinear systems based on successively linearizing the nonlinear system. It can be divided into two steps, which are prediction and update steps. The prediction step predicts the state *X* and the state covariance matrix *P*. When a new measurement is available, the Kalman gain *K* is calculated first and then *X* and *P* are updated. The detailed steps are shown below:

#### 1. Prediction step

(a) State prediction:

$$
\hat{X}(k|k-1) = F\_{\mathfrak{a}}(\hat{X}(k-1|k-1), \mathfrak{u}(k-1))
$$

The model disturbance are not propagated as the states and parameters. Instead, it is explicitly included in the state covariance prediction.

(b) State covariance prediction:

$$P(k|k-1) = A\_a(k)P(k-1|k-1)A\_a(k)^T + Q$$

where *Aa*(*k*) = *∂Fa ∂X* ---*<sup>X</sup>*<sup>ˆ</sup>(*k*−<sup>1</sup>|*k*−<sup>1</sup>) and *Q* is the covariance matrix of the model disturbance *ωa*.

#### 2. Update step

(a) Kalman gain calculation:

$$\mathcal{K}(k) = P(k|k-1)\mathbb{C}\_{a}(k)^{T}[\mathbb{C}\_{a}(k)P(k|k-1)\mathbb{C}\_{a}(k)^{T}+R]^{-1}$$

where *Ca*(*k*) = *∂Ga ∂X* ---*<sup>X</sup>*<sup>ˆ</sup>(*k*|*k*−<sup>1</sup>) and *R* is the covariance matrix of the measurement noise *ν*.

(b) State update:

$$\hat{X}(k|k) = \hat{X}(k|k-1) + K(k) \left( y(k) - G\_a(\hat{X}(k|k-1)) \right)$$

The augmented state and parameter vector *X* is updated when a new measurements *y*(*k*) is available.

(c) State covariance update:

$$P(k|k) = \left(I - \mathcal{K}(k)\mathcal{C}\_a(k)\right)P(k|k-1)$$

State covariance matrix *P* is updated. *I* is the identity matrix with dimension *Nx* + *Np*.

#### *3.3. Ensemble Kalman filter*

The EnKF is a method developed by Evensen [35] based on Monte Carlo method. An ensemble of trajectories of the system is generated based on the priori probability distribution of the case. A practical implementation scheme which estimated the probability distribution based on the information embedded within ensembles, instead of propagation of the state covariance matrix *P*, is discussed in Reference [36]. Unlike EKF, it directly utilizes the nonlinear model Equation (13), which does not require frequent model linearization. In addition, the model disturbance and measurement noise are taken into account at the same time as the states and parameters propagate. It starts with generating the ensembles, then follows with the two steps as the same as in EKF.

	- (a) Generating ensembles:

$$\mathcal{X}^{m}(0|0) \sim \mathcal{N}(X(0), P(0)), \ m \in [1, M] \subset \mathbb{Z}$$

where an ensemble containing *M* initial states *X* ˆ *<sup>m</sup>*(0|0), *m* = 1, ... , *M*, is generated and *m* is the index of the ensemble. The ensemble follows the multivariate normal distribution with mean, *X*(0) and covariance matrix of the initial state, *<sup>P</sup>*(0).

2. Prediction step (a) State prediction:

$$\hat{X}^{\mathfrak{M}}(k|k-1) = F\_{\mathfrak{a}}(\hat{X}^{\mathfrak{m}}(k-1|k-1), \mathfrak{u}(k-1)) + \omega\_{\mathfrak{a}}^{\mathfrak{m}}(k-1), \ m \in [1, M] \subset \mathbb{Z}\_{>0}$$

where *ωma* (*k* − 1) ∼ N (0, *Q*). Just like generating the ensemble of *X*ˆ *m*, a normally distributed set of *ωma* are generated with the mean 0 and the covariance matrix *Q*. Overall *M* trajectories propagate, with model disturbance explicitly considered.

	- (a) Kalman gain calculation:

$$K(k) = P\_{xy}(k|k-1)P\_{yy}(k|k-1)^{-1}$$

where *Pxy*(*k*|*k* − 1) = 1 *M*−1 ∑*Mm*=<sup>1</sup>[(*X*<sup>ˆ</sup> *m*(*k*|*k* − 1) − *X*¯(*k*|*k* − <sup>1</sup>))(*y*<sup>ˆ</sup>*<sup>m</sup>*(*k*|*k* − 1) − *y*¯(*k*|*k* − 1))] *Pyy*(*k*|*k* − 1) = 1 *M*−1 ∑*Mm*=<sup>1</sup>[*y*<sup>ˆ</sup>*<sup>m</sup>*(*k*|*<sup>k</sup>* − 1) − *y*¯(*k*|*k* − 1)]2, *X*¯(*k*|*k* − 1) = 1*M* ∑*Mm*=<sup>1</sup> *X*ˆ *m*(*k*|*k* − 1) and *y*¯(*k*|*k* − 1) = 1*M* ∑*Mm*=<sup>1</sup> *y*<sup>ˆ</sup>*<sup>m</sup>*(*k*|*k* − <sup>1</sup>). *Pxy* is the cross-covariance matrix of the state and measurement vectors and *Pyy* is the auto-covariance matrix of the measurement vector. The mean of the state or measurement vector is calculated based on the corresponding ensembles.

(b) State update:

$$\hat{X}^{\mathfrak{m}}(k|k) = \hat{X}^{\mathfrak{m}}(k|k-1) + K(k) \left[ y(k) + \nu^{\mathfrak{m}}(k) - G\_{\mathfrak{a}}(\hat{X}^{\mathfrak{m}}(k|k-1)) \right], \ m \in [1, M] \subset \mathbb{Z}^+$$

where *ν<sup>m</sup>*(*k*) ∼ N (0, *<sup>R</sup>*). All *M* state vectors are updated, when the new measurement *y*(*k*) is available. The measurement uncertainty is taken into account by generating a normally distributed ensemble of measurement noises *<sup>ν</sup><sup>m</sup>*(*k*), which has mean 0 and covariance matrix *R*. At last, the estimated state *X* <sup>ˆ</sup>(*k*|*k*) is obtained as the mean of the corresponding ensembles *X* ˆ *<sup>m</sup>*(*k*|*k*), *m* = 1, . . . , *m*.

#### **4. Proposed Procedure to Determine Significant Parameters and Number of Sensors**

In reality, it is nearly impossible to measure all states and the parameters can not be determined easily. First, according to Reference [37], it states that the original system of Equation (10) is observable using limited number of measurements. That means the states can be recovered. However, for this work the augmented system of Equation (13) is studied. For this case, it is necessary to ensure that the parameters are also identifiable since they are estimated with the states simultaneously. The proposed procedure to check the identifiability of the parameters, to select appropriate parameters for estimation and to determine the minimum number of sensors is shown in Figure 2. The key steps are explained below.

**Figure 2.** A flowchart of the procedure to determine the significant parameters and number of sensors.

#### *4.1. Determine Candidate Parameter Sets for Estimation*

After augmenting the original nonlinear system with the parameters, the entire system may not be observable. In order to determine which parameters can and should be estimated online, we resort to observability analysis [38]. In this step, we assume that all the soil moisture states are measured; that is, *y* = *x*. This ensures that the observability analysis results depend only on the parameters. If the augmented system is not observable, then the unobservability is caused by the augmentation of the parameters in the state vector.

When checking the observability of the augmented system, we start with the system with all the parameters augmented. If the augmented system is not observable, then one of the parameters is removed from the augmented system. If there are *Np* parameters, then there are *Np* different ways to remove the one parameter. All these *Np* cases are considered. If after removing one parameter and upon finding that the new augmented system is observable, we continue to the next step to determine which parameter set to estimate (described in the next subsection). If we can still not find an observable augmented system after removing one parameter, we continue to remove two parameters from the original augmented system. Again, all the possible cases should be considered. If we can still not find an observable system, we continue to remove three parameters from the original augmented system. This continues until we find at least a system that is observable.

When checking the observability, we propose to use the Popov-Belevitch-Hautus (PBH) observability theory. Other observability tests may also be used. Since the augmented system is a nonlinear system, it should be linearized before PBH can be applied. It is recommended that instead of linearizing the system at one point, it should be linearized at different point along typical operating trajectories as used in Reference [39].

Note that the observability analysis described in this step may generate more than one candidate parameter sets that can be estimated through augmentation of the original agro-hydrological system.

#### *4.2. Sensitivity Analysis*

If there is only one candidate parameter set from the previous step, we can continue with the candidate and move to the next subsection to find the minimum number of sensors. However, if there are more than one candidates, we need to determine which parameter set to choose. We propose to use sensitivity analysis to determine the importance of these parameters and pick the set containing the most important parameters for further analysis.

The sensitivity analysis measures how the outputs respond when there is a change in one parameter. The sensitivity matrix *Sy*(*k*) shown below contains the information about, at time instant *k*, how each output is affected by *X*(0) which is constituted of the initial state *x*(0) and the parameters *p*.

$$S\_{\mathcal{Y}}(k) = \begin{bmatrix} \frac{\partial y\_1}{\partial X\_1(0)} & \frac{\partial y\_1}{\partial X\_2(0)} & \cdots & \frac{\partial y\_1}{\partial X\_{N\_x}(0)} \\ \frac{\partial y\_2}{\partial X\_1(0)} & \frac{\partial y\_2}{\partial X\_2(0)} & \cdots & \frac{\partial y\_2}{\partial X\_{N\_x}(0)} \\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial y\_{N\_y}}{\partial X\_1(0)} & \frac{\partial y\_{N\_y}}{\partial X\_2(0)} & \cdots & \frac{\partial y\_{N\_y}}{\partial X\_{N\_x}(0)} \\ \hline \end{bmatrix} \begin{bmatrix} \frac{\partial y\_1}{\partial X\_1} & \cdots & \frac{\partial y\_1}{\partial X\_1(0)} & \cdots & \frac{\partial y\_2}{\partial X\_{N\_x}(0)} \\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial y\_{N\_y}}{\partial X\_{N\_x+1}(0)} & \cdots & \frac{\partial y\_{N\_y}}{\partial X\_{N\_x+1}(0)} & \cdots & \frac{\partial y\_{N\_y}}{\partial X\_{N\_x+N\_y}(0)} \\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial y\_{N\_y}}{\partial X\_{N\_x+N\_y}(0)} & \cdots & \frac{\partial y\_{N\_y}}{\partial X\_{N\_x+N\_y}(0)} \end{bmatrix} \begin{bmatrix} \ddots \\ \ddots \end{bmatrix}\_{k\_y}$$

The detailed steps to derive the sensitivity matrix is explained below and is inspired by Reference [40]. When performing this sensitivity analysis, we consider the augmented system of Equation (13) without considering the disturbance *ωa* and *ν* but with *X*(0) explicitly expressed as shown below:

$$\begin{aligned} X(k+1) &= F\_{\mathfrak{a}}(X(k), u(k), X(0)) \\ y(k) &= G\_{\mathfrak{a}}(X(k), X(0)), \end{aligned} \tag{20}$$

where *X*(0) is considered as an independent variable.

The objective is to check how a change in the initial state *x*0 and the parameters *p* affects the prediction error *e*, which comes from the difference between the predicted *y* and the observed measurements *yM*. We can represent this as:

$$\frac{\partial \varepsilon}{\partial X(0)} = \frac{\partial \left(y - y\_M\right)}{\partial X(0)} = \frac{\partial y}{\partial X(0)} - \frac{\partial y\_M}{\partial X(0)}.\tag{21}$$

Because the observed measurement *yM* is not affected by the initial state and parameters, the above expression is simplified as below:

$$\frac{\partial \varepsilon}{\partial X(0)} = \frac{\partial y}{\partial X(0)}.\tag{22}$$

Equation (22) can be derived by taking the partial derivative of Equation (20) with respect to the augmented state vector *<sup>X</sup>*(0). And the sensitivity equations with respect to *X*(0) are shown below:

$$\begin{split} \frac{\partial X(k+1)}{\partial X(0)} &= \frac{\partial}{\partial X(0)} F\_d(X(k), u(k), X(0)) \\ \frac{\partial y(k)}{\partial X(0)} &= \frac{\partial}{\partial X(0)} G\_d(X(k), X(0)). \end{split} \tag{23}$$

Because the intermediate variable *X*(*k*) depends on the independent variable *X*(0) as well, the chain rule is applied on the right hand sides of Equation (23) and we can further ge<sup>t</sup> that

$$\begin{split} \frac{\partial X(k+1)}{\partial X(0)} &= \frac{\partial F\_d}{\partial X(k)} \cdot \frac{\partial X(k)}{\partial X(0)} + \frac{\partial F\_d}{\partial X(0)}\\ \frac{\partial y(k)}{\partial X(0)} &= \frac{\partial G\_d}{\partial X(k)} \cdot \frac{\partial X(k)}{\partial X(0)} + \frac{\partial G\_d}{\partial X(0)}. \end{split} \tag{24}$$

By defining *SX*(*k*) = *∂X*(*k*) *∂X*(0) and *Sy*(*k*) = *∂y*(*k*) *∂X*(0), the above equations can be converted to ordinary differential equations, which are shown below:

$$\begin{split} S\_{\mathfrak{X}}(k+1) &= \frac{\partial F\_{\mathfrak{d}}}{\partial X(k)} \cdot S\_{\mathfrak{X}}(k) + \frac{\partial F\_{\mathfrak{d}}}{\partial X(0)} \\ S\_{\mathfrak{Y}}(k) &= \frac{\partial G\_{\mathfrak{d}}}{\partial X(k)} \cdot S\_{\mathfrak{X}}(k) + \frac{\partial G\_{\mathfrak{d}}}{\partial X(0)}. \end{split} \tag{25}$$

Therefore, by giving the initial states of Equations (20) and (25) and solving them at the same time, the sensitivity matrix *Sy*(*k*) can be obtained. *Sy*(*k*) may be normalized to obtain the normalized sensitivity matrix *SN*:

$$S\_N(k) = \frac{\partial y(k)}{\partial X(0)} \cdot \frac{X(0)}{y(k)}.\tag{26}$$

Once the sensitivity matrix is obtained, we can use it to determine the relative importance of different parameters. Specifically, we can exam the magnitudes of the elements in the sensitivity matrix. Each parameter corresponds to one column in the sensitivity matrix. We can use, for example, the summation of the absolute values of the elements of each column to compare the relative importance of parameters. A bigger value implies a more important parameter in terms of its impact on the outputs. Among all the candidate parameter sets, we keep the parameter set with the highest sensitivity values.

#### *4.3. Minimum Number of Sensors*

After the parameter set to be estimated is determined, the original system is augmented with the parameters, as illustrated in Reference [37], we can use the maximum multiplicity theory [41] to determine the minimum number of sensors required to ensure the observability of the entire system. Then, state estimation techniques can be used to estimate the states and parameters simultaneously.

#### **5. Simulation Results and Discussion**

#### *5.1. System Description*

In this work, a total length (*L*) of 67 cm loam soil column is investigated, which is shown in Figure 3. The soil column is equally partitioned into 32 compartments. Correspondingly, Richards equation is spatially discretized into 32 states (*Nx*) in the z-direction, with each state centered at the corresponding compartment. At the surface of the soil, the irrigation, *qT*, is performed at the rate of 2.50 cm/day, from 12:00 PM to 4:00 PM daily. At the bottom, the free drainage boundary condition is used, which means the gradient between the last state and the state at the bottom boundary is 0. The soil column has the homogeneous initial condition (*x*(0)) of −0.514 m capillary pressure head and the parameters of the soil are shown in Table 1 [42].

**Table 1.** The initial condition and parameters of the investigated loam soil column.


**Figure 3.** A schematic diagram of the investigated loam soil column.

#### *5.2. Determination of Significant Parameters and Number of Sensors*

The augmented system (Equation (13)) is utilized to achieve simultaneous parameter and state estimation. First without knowing the observability of the augmented system, all 5 parameters ( *Ks*, *θ<sup>s</sup>*, *θ<sup>r</sup>*, *α* and *n*) are augmented; that is, *Np* = 5. In addition, all 32 states are assumed to be measured. A 10-day state trajectory, without considering the process and measurement noise, is used in the rest of the subsection for selecting appropriate parameters for estimation and determining the minimum number of sensors. It is assumed that the measurements are available every 1 h.

Following the procedure as discussed in Section 4.1, we apply the PBH observability test on the augmented system to check the identifiability of the parameters. The test is conducted every sampling time, which requires the system to be linearized accordingly. According to the results, the augmented system is not observable. This implies that it is impossible to identify the 5 parameters simultaneously. In order to look for an observable system, parameters are removed from the augmented system. We start with removing only 1 of the parameters and this results in 5 different augmented systems with each one augmented with 4 parameters. Then, the observability of the 5 augmented systems is checked. It was found that 2 of the 5 systems are observable. In these two systems, either *θs* or *θr* is removed. Since observable systems are found, we proceed to the next step to determine the final parameter set.

To determine which parameter set to use, the significance of *θs* and *θr* is compared based on the sensitivity analysis described in Section 4.2. Sensitivity analysis is conducted based on the original augmented system with all the parameters. The initial state of Equation (25) is an identity matrix of size *Nx* + *Np*. By comparing the summation of the absolute values of the elements of each column of the normalized sensitivity matrices *SN*, it can be found that the summation corresponding to the column *∂yi ∂θs* (82,674) is much bigger than the one for *∂yi ∂θr* (14,997). Based on this, *θs* is considered as a more important parameter because it has more impact on the output than *θ<sup>r</sup>*. Therefore, the parameter set containing *θr* is removed and the final parameter set will be used in the remaining analysis is {*Ks*, *θ<sup>s</sup>*, *α*, *<sup>n</sup>*}.

In the above analysis, it was assumed that all the states (soil moisture) are measured for the purpose of determining the parameters for estimation. When the set of parameters is determined, we remove this assumption to determine the minimum number of sensors (measurements) needed to ensure the observability of the augmented system with 4 parameters. Following the method described in Section 4.3, the maximum multiplicity method is conducted, and it can be found that the minimum number of sensors is 4.

#### *5.3. Simultaneous Parameter and State Estimation*

According to the minimum number of sensors found above, it is assumed that 4 tensiometers (*Ny*) are installed. Specifically, we assume that these sensors are installed at 7.30 cm, 24.1 cm, 40.8 cm and 57.6 cm below the surface, which measure the 4th, 12th, 20th and 28th states, respectively. In the simulations, the actual parameter values used are shown in Table 1 and they are assumed to be constant within the investigated temporal domain. Process noise and measurement noise (*<sup>ω</sup>x* and *ν*) are considered in the simulations and they have zero mean and standard deviations 3 × 10−<sup>6</sup> m and 8 × 10−<sup>3</sup> m, respectively.

In the design of the state and parameter filters (EKF, EnKF) and estimator (MHE), the model augmented with 4 parameters ( *Ks*, *θ<sup>s</sup>*, *α* and *n*) is used. The initial guesses of the initial states and parameters in the filters and estimator are listed in Table 2 and compared with those used in the actual system.

For the EKF and EnKF, the weighting matrices *Q* and *R* are designed as the auto-covariance matrices of *ωx* and *ν* with the standard deviations mentioned before. However, the diagonal elements of *Q* corresponding to augmented parameters are 0, because the parameters are assumed to be constant. In simulations, 10−<sup>20</sup> is used to approximate the value 0 and to ensure the positive definiteness of the matrix. The diagonal elements of *P* corresponding to the states are configured as the square of 3 × 10−<sup>3</sup> and those of parameters are configured as the square of 3 × 10−2. For the designed EnKF, 100 ensembles are used.

**Table 2.** True values of initial states and parameters of the process and the initial guesses used in filters and estimator.


For the design of MHE, the estimation window size is selected to be 8 h. The weighting matrices *P*, *Q*, and *R* retain the same ratio with respect to those used in EKF and EnKF but with a much bigger magnitude to ensure the numerical stability of the associated optimization problem. In addition, the *P* matrix is constant for all the optimizations. The constraints of the states, parameters and the model uncertainty are listed in Table 3. The upper and lower bounds of the term *ω*<sup>ˆ</sup> *p* are 0 because the parameters are constant.

**Table 3.** Lower and upper bounds used in moving horizon estimation (MHE).


In the following simulations, the root mean square error (RMSE) will be used to evaluate the performance of the MHE, EKF and EnKF. The estimation performance in terms of the states and parameters are evaluated separately. Their equations are shown below:

$$RMSE\_x(k) = \sqrt{\frac{\sum\_{i=1}^{N\_x} (\pounds\_i(k) - \pounds\_i(k))^2}{N\_x}} \tag{27}$$

$$RMSE\_p(k) = \sqrt{\frac{\sum\_{i=1}^{N\_p} (\not p\_i(k) - p\_i(k))^2}{N\_p}}.\tag{28}$$

First, we performed simulations assuming that the parameter *θr* (which is not estimated) is known and is the same as the value used in the actual system. Figures 4 and 5 show some representative estimated states and all the parameters using MHE, EKF and EnKF, which are also compared with their true values. Figure 4 shows the state trajectories of the top node and a few middle nodes and one bottom node. From the figure, it can be seen that the top node has more dynamics because it takes time for irrigated water to pass from the upper part and to the lower part. In terms of state estimation performance, from Figure 4, it can be seen that MHE and EnKF give very much more accurate state estimates than the EKF. Note that from Figure 4, it can also be seen that the estimates of the 12th state (*h*12) converge faster than the other estimates. This is because it is a sensor node.

In terms of parameter estimation, Figure 5 shows the results. From the figure, it can be seen that only MHE is capable of estimating the parameters, whereas those estimated by EKF and EnKF diverge from their true values. This may be because of the constraints used in MHE. These constraints provide more useful information to MHE in addition to the measurements.

The trajectories of the performance indices *RMSEx* and *RMSEp* associated with the MHE, EnKF and EKF are shown in Figure 6. These trajectories further confirm that the MHE and EnKF have better performance than EKF in estimation of the states and the MHE outperforms both EnKF and EKF in parameter estimation.

**Figure 4.** Selected trajectories of the process state and estimated states using MHE, extended Kalman filter (EKF) and ensemble Kalman filter (EnKF).

In the previous set of simulations, the parameter *θr* is assumed to be accurately known and is used in the MHE, EnKF and EKF. However, this assumption may not hold in practice. In this set of simulations, we study how an inaccurate *θr* may affect the state and parameter estimation performance. In this set of simulations, the value of *θr* used in the MHE, EnKF and EKF is assumed to be 10% off from the actual value. The tuning parameters used in the filters and estimator are the same as the ones used in the previous simulations. In this case, the EnKF and EKF cannot give accurate parameter estimates as in the previous case, either. The MHE is still the only estimation method that can give good parameter estimates. Table 4 summarizes the estimated parameters using the MHE in the two sets of simulations. The reported estimated values are the mean estimated values after the estimates have converged. According to the results, a 10% difference of *θr* does not affect the estimation results of other parameters when MHE is used. This verifies that the removal of *θr* has a minor impact on the overall state and parameter estimation performance. This further implies that the proposed method in parameter selection is applicable.

In this work, the spatial heterogeneity in soil properties is not considered. When parameter heterogeneity presents, a 3D Richards equation is needed to describe the water dynamics. The studied MHE algorithm can be extended to handle heterogeneous parameters in a straightforward manner. It is expected that the weighting matrices should be tuned taking into account the spatial heterogeneity. Also, a system with different soil types may be decomposed into a few subsystems with each subsystem having the same type of soil and distributed or decentralized estimation may be used accordingly. MHE may still be used in the design of the subsystem estimators.

**Figure 5.** Trajectories of estimated parameters using MHE, EKF and EnKF, compared with their actual values.

**Table 4.** Comparison of estimated parameters using MHE with their true values, when *θr* is assumed to be accurate and 10% off.


**Figure 6.** Trajectories of RMSE measuring the estimation performance of MHE, EKF, and EnKF.

#### *5.4. Effects of the Simulation Parameters*

In this subsection, we further study the performance of MHE in terms of number of measurements and size of estimation window of MHE.

#### 5.4.1. Effects of Number of Measurements

First, we study the effects of number of measurements on the estimation performance of MHE. In addition to the case with 4 measurements, we also consider cases with 8 and 12 measurements. Figure 7 shows how the two performance indices *RMSEx* and *RMSEp* evolve over time. From the top plot, it can be seen that the more sensors are used, the faster state estimates converge. This is because the sensors are directly measuring the states. When there are more sensors, it implies that we have more information of the states. For the parameter, there is no obvious difference between the convergence speeds with different number of measurements. Comparing the convergence speed between the state estimates and parameter estimates, the state estimates converge much faster within one day while the parameter estimates take longer time to converge (about 2 days). Overall, from this set of simulations, it can be concluded that 4 sensors are sufficient to estimate all states and parameters accurately.

**Figure 7.** Trajectories of RMSE measuring the error between actual model and estimated states and parameters of MHE using 4, 8 and 12 measurements.

5.4.2. Effects of MHE Estimation Window Size

The effects of the size of the estimation window of MHE on estimation performance are also studied assuming that there are 4 measurements. Figure 8 shows how the two performance indices *RMSEx* and *RMSEp* evolve over time with different estimation window sizes. From the figure, it can be seen that from both plots that a window size of 8 is sufficient and further increase of the estimation window size does not bring significant performance improvement.

**Figure 8.** Trajectories of RMSE measuring actual model and estimated states and parameters of MHE with window sizes of 8, 12, 16 and 20.

## **6. Conclusions**

In this work, we have investigated simultaneous state and parameter estimation using moving horizon estimation (MHE), extended Kalman filter (EKF) and ensemble Kalman filter (EnKF) applied to an infiltration process in an agro-hydrological system. First, a procedure was proposed to find the appropriate parameter set for estimation based on the observability of the augmented system and the sensitivity of the outputs to the parameters. It was found that only 4 out of 5 parameters (hydraulic conductivity, saturated soil moisture and van Genuchten-Mualem parameters) can be considered in simultaneous state and parameter estimation. The less important parameter (residual soil moistures) was not considered in parameter estimation. After determining the parameter set for estimation, the minimum number of sensors was also found based on the maximum multiplicity theory. Simulation results showed that the MHE has an overall the best state and parameter estimation performance due to the inclusion of state and parameter constraints in the estimation. It was also found that the uncertainty in the residual soil moisture (which was not estimated) does not affect the overall estimation performance too much. The effects of number of measurements and estimation window size of the MHE were also studied through simulations. It was found that 4 measurements and a window size of 8 for MHE are sufficient to provide accurate state and parameter estimates.

**Author Contributions:** Conceptualization, methodology, simulation design, S.B., S.R.S. and J.L.; validation and formal analysis, S.B., S.R.S., X.Y. and J.L.; original draft preparation, S.B.; review and editing, S.B., S.R.S., X.Y., J.L. and S.L.S.; supervision, J.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Natural Sciences and Engineering Research Council of Canada (NSERC). **Conflicts of Interest:** The authors declare no conflict of interest.
