```
import numpy as np
import matplotlib.pyplot as plt

# The model and assimilation routines used below (RK4, Lorenz63, JLorenz63,
# loss, Adj4dvar, GoldenAlpha) are defined in the accompanying listings.

# parameters
sigma = 10.0
beta = 8.0/3.0
rho = 28.0
dt = 0.01
tm = 10
nt = int(tm/dt)
t = np.linspace(0,tm,nt+1)

############################ Twin experiment ##################################
def h(u): # observation operator
    w = u
    return w

def Dh(u): # Jacobian of observation operator
    n = len(u)
    D = np.eye(n)
    return D

u0True = np.array([1,1,1]) # true initial conditions
np.random.seed(seed=1)
sig_m = 0.15 # standard deviation for measurement noise
R = sig_m**2*np.eye(3) # covariance matrix for measurement noise

dt_m = 0.2 # time period between observations
tm_m = 2 # maximum time for observations
nt_m = int(tm_m/dt_m) # number of observation instants
ind_m = (np.linspace(int(dt_m/dt),int(tm_m/dt),nt_m)).astype(int)
t_m = t[ind_m]

# time integration
uTrue = np.zeros([3,nt+1])
uTrue[:,0] = u0True
km = 0
w = np.zeros([3,nt_m])
for k in range(nt):
    uTrue[:,k+1] = RK4(Lorenz63,uTrue[:,k],dt,sigma,beta,rho)
    if (km < nt_m) and (k+1 == ind_m[km]):
        w[:,km] = h(uTrue[:,k+1]) + np.random.normal(0,sig_m,[3,])
        km = km+1

########################### Data Assimilation #################################
u0b = np.array([2.0,3.0,4.0]) # erroneous background initial condition
u0a = np.copy(u0b)
J0 = loss(Lorenz63,h,t,ind_m,u0a,w,R,1,sigma,beta,rho)
for iter in range(1000):
    # compute the gradient of the cost functional along the base trajectory
    dJ = Adj4dvar(Lorenz63,JLorenz63,h,Dh,t,ind_m,u0a,w,R,1,sigma,beta,rho)
    # minimization direction
    p = -dJ/np.linalg.norm(dJ)
    # golden-section line search for the step length
    alpha = GoldenAlpha(p,Lorenz63,h,t,ind_m,u0a,w,R,1,sigma,beta,rho)
    # update initial condition with gradient descent
    u0a = u0a + alpha*p
    J = loss(Lorenz63,h,t,ind_m,u0a,w,R,1,sigma,beta,rho)
    if np.abs(J0-J) < 1e-2:
        print('Convergence: loss function')
        break
    else:
        J0 = J
    if np.linalg.norm(dJ) < 1e-4:
        print('Convergence: gradient of loss function')
        break

##################### Time Integration [Comparison] ###########################
ub = np.zeros([3,nt+1])
ub[:,0] = u0b
ua = np.zeros([3,nt+1])
ua[:,0] = u0a
for k in range(nt):
    ub[:,k+1] = RK4(Lorenz63,ub[:,k],dt,sigma,beta,rho)
    ua[:,k+1] = RK4(Lorenz63,ua[:,k],dt,sigma,beta,rho)

#%%
############################### Plotting ######################################
import matplotlib as mpl
mpl.rc('text', usetex=True)
# set the full LaTeX preamble in a single assignment (a second assignment
# would overwrite the first)
mpl.rcParams['text.latex.preamble'] = r'\usepackage{amsmath} \boldmath'
font = {'family' : 'normal',
        'weight' : 'bold',
        'size'   : 20}
mpl.rc('font', **font)

fig, ax = plt.subplots(nrows=3,ncols=1, figsize=(10,8))
ax = ax.flat
for k in range(3):
    ax[k].plot(t,uTrue[k,:], label=r'\bf{True}', linewidth = 3)
    ax[k].plot(t,ub[k,:], ':', label=r'\bf{Background}', linewidth = 3)
    ax[k].plot(t[ind_m],w[k,:], 'o', fillstyle='none', \
               label=r'\bf{Observation}', markersize = 8, markeredgewidth = 2)
    ax[k].plot(t,ua[k,:], '--', label=r'\bf{Analysis}', linewidth = 3)
    ax[k].set_xlabel(r'$t$',fontsize=22)
    ax[k].axvspan(0, tm_m, color='y', alpha=0.4, lw=0)
ax[0].legend(loc="center", bbox_to_anchor=(0.5,1.25),ncol=4,fontsize=15)
ax[0].set_ylabel(r'$x(t)$')
ax[1].set_ylabel(r'$y(t)$')
ax[2].set_ylabel(r'$z(t)$')
fig.subplots_adjust(hspace=0.5)
```

✝ ✆

**Figure 2.** Results of 4DVAR implementation for the Lorenz 63 system.

Before we move to other data assimilation techniques, we highlight a few remarks regarding our presentation of the 4DVAR algorithm. First, the algorithm requires the Jacobian **D***M*(**u**) of the discrete-time model map. For a low-dimensional system such as Lorenz 63 with a simple time-stepping scheme, deriving this Jacobian analytically can be an easy task. However, for a higher dimensional system and more accurate time integrators, this would be cumbersome. Instead, the chain rule can be utilized to compute **D***M*(**u**) as presented in Listing 10, which takes as input the right-hand side of the continuous-time dynamics *f*(·; ·) (described in Listing 3 for the Lorenz 63 system) as well as its Jacobian (given in Listing 11 for Lorenz 63).

**Listing 10.** Python functions for computing the Jacobian **D***M*(**u**) of the discrete-time model map *M*(**u**; *θ*) using the first-order Euler and the fourth-order Runge–Kutta schemes with the chain rule. ✞ ☎

```
import numpy as np

def Jeuler(rhs,Jrhs,state,dt,*args):
    # Jacobian of the first-order Euler map: DM = I + dt*J(u)
    n = len(state)
    dk1 = Jrhs(state,*args)
    DM = np.eye(n) + dt*dk1
    return DM

def JRK4(rhs,Jrhs,state,dt,*args):
    # Jacobian of the fourth-order Runge-Kutta map, obtained by
    # differentiating each stage with the chain rule
    n = len(state)
    k1 = rhs(state,*args)
    k2 = rhs(state+k1*dt/2,*args)
    k3 = rhs(state+k2*dt/2,*args)
    dk1 = Jrhs(state,*args)
    dk2 = Jrhs(state+k1*dt/2,*args) @ (np.eye(n)+dk1*dt/2)
    dk3 = Jrhs(state+k2*dt/2,*args) @ (np.eye(n)+dk2*dt/2)
    dk4 = Jrhs(state+k3*dt,*args) @ (np.eye(n)+dk3*dt)
    DM = np.eye(n) + (dt/6) * (dk1+2*dk2+2*dk3+dk4)
    return DM
```
✝ ✆

**Listing 11.** A Python function for the Jacobian of the continuous-time Lorenz 63 dynamics. ✞ ☎

```
import numpy as np

def JLorenz63(state,*args): # Jacobian of the Lorenz 63 model
    sigma = args[0]
    beta = args[1]
    rho = args[2]
    x, y, z = state # unpack the state vector
    df = np.zeros([3,3]) # matrix of partial derivatives
    df[0,0] = -sigma
    df[0,1] = sigma
    df[0,2] = 0.0
    df[1,0] = rho - z
    df[1,1] = -1.0
    df[1,2] = -x
    df[2,0] = y
    df[2,1] = x
    df[2,2] = -beta
    return df
```

✝ ✆
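
As a quick sanity check, the chain-rule Jacobians above can be compared against a finite-difference approximation of the discrete map. The following sketch (not one of the numbered listings; it assumes the `RK4` integrator from the earlier listing together with `JRK4`, `JLorenz63`, and `Lorenz63`) perturbs each state component in turn:

```
import numpy as np

# Finite-difference check of the chain-rule Jacobian (illustrative sketch;
# RK4, JRK4, Lorenz63, and JLorenz63 are defined in the listings above).
def fd_jacobian(rhs, state, dt, eps, *args):
    # approximate DM(u) column by column as [M(u + eps*e_j) - M(u)] / eps
    n = len(state)
    DM = np.zeros([n, n])
    base = RK4(rhs, state, dt, *args)
    for j in range(n):
        pert = np.copy(state)
        pert[j] += eps
        DM[:, j] = (RK4(rhs, pert, dt, *args) - base) / eps
    return DM

u = np.array([1.0, 2.0, 3.0])
DM_chain = JRK4(Lorenz63, JLorenz63, u, 0.01, 10.0, 8.0/3.0, 28.0)
DM_fd = fd_jacobian(Lorenz63, u, 0.01, 1e-6, 10.0, 8.0/3.0, 28.0)
print(np.max(np.abs(DM_chain - DM_fd)))  # should be small, O(eps)
```

Forward differences limit the achievable agreement to roughly the square root of machine precision, which is nevertheless sufficient to catch implementation errors in the analytical Jacobians.
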
#### **5. Forward Sensitivity Method**

We have seen in Section 4 that the minimization of the cost functional via the 4DVAR algorithm requires the solution of the adjoint problem at each iteration, which incurs a significant computational burden for high dimensional systems. Alternatively, Lakshmivarahan and Lewis [13] proposed the forward sensitivity method (FSM) to derive an expression for the correction vector in terms of the forward sensitivity matrices [14]. In their development, a simultaneous correction to the initial condition **u**(*t*<sub>0</sub>) and the model parameters *θ* is treated. For conciseness and consistency with the methods introduced here, we only consider erroneous initial conditions and assume the model parameters are perfectly known. Given the discrete-time model map *M*(**u**; *θ*) in Equation (2), the forecast sensitivity at time *t*<sub>*k*+1</sub> to the initial conditions **u**(*t*<sub>0</sub>) can be defined as follows,

$$\frac{\partial u\_i(t\_{k+1})}{\partial u\_j(t\_0)} = \sum\_{q=1}^n \left( \frac{\partial M\_i(\mathbf{u}(t\_k); \theta)}{\partial u\_q(t\_k)} \right) \left( \frac{\partial u\_q(t\_k)}{\partial u\_j(t\_0)} \right), \qquad 1 \leqslant i, j \leqslant n,\tag{26}$$

where *M*(**u**(*t<sub>k</sub>*); *θ*) = [*M*<sub>1</sub>(**u**(*t<sub>k</sub>*); *θ*), *M*<sub>2</sub>(**u**(*t<sub>k</sub>*); *θ*), ..., *M<sub>n</sub>*(**u**(*t<sub>k</sub>*); *θ*)]<sup>*T*</sup>. Recall that the Jacobian of the model *M*(**u**(*t<sub>k</sub>*); *θ*) is defined by the matrix **D**<sub>*M*</sub>(**u**(*t<sub>k</sub>*)) ∈ ℝ<sup>*n*×*n*</sup> whose (*i*, *j*)-th entry is *∂M<sub>i</sub>*(**u**(*t<sub>k</sub>*); *θ*)/*∂u<sub>j</sub>*(*t<sub>k</sub>*). We also define the forward sensitivity matrix **U**(*t<sub>k</sub>*) ∈ ℝ<sup>*n*×*n*</sup> of **u**(*t<sub>k</sub>*) with respect to the initial state **u**(*t*<sub>0</sub>), where [**U**(*t<sub>k</sub>*)]<sub>*i*,*j*</sub> = *∂u<sub>i</sub>*(*t<sub>k</sub>*)/*∂u<sub>j</sub>*(*t*<sub>0</sub>) for 1 ≤ *i*, *j* ≤ *n*. Thus, Equation (26) can be rewritten in matrix form as,

$$\mathbf{U}(t\_{k+1}) = \mathbf{D}\_M(\mathbf{u}(t\_k))\mathbf{U}(t\_k). \tag{27}$$

Equation (27) provides the dynamic evolution of the forward sensitivity matrix in a recursive manner, initialized by **U**(*t*<sub>0</sub>) = **I**<sub>*n*</sub>, which can be used to relate the prediction error at any time step to the initial condition.
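
As a minimal illustration of this recursion (a sketch reusing the twin-experiment variables from Section 4 and the `JRK4` routine of Listing 10, with the true trajectory `uTrue` taken as the base trajectory), the sensitivity matrices can be propagated as follows:

```
import numpy as np

# Propagate the forward sensitivity matrix via Equation (27):
# U(t_{k+1}) = DM(u(t_k)) U(t_k), initialized with U(t_0) = I_n.
n = 3
U = np.zeros([n, n, nt+1])
U[:, :, 0] = np.eye(n)
for k in range(nt):
    DM = JRK4(Lorenz63, JLorenz63, uTrue[:, k], dt, sigma, beta, rho)
    U[:, :, k+1] = DM @ U[:, :, k]
```
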

Given the measurement **w**(*tk*) at time *tk* ∈ T , the forecast error **e**(*tk*) is defined as the difference between the model forecast and measurements as

$$\mathbf{e}(t\_k) = \mathbf{w}(t\_k) - \mathbf{h}(\mathbf{u}(t\_k)).\tag{28}$$

This is commonly called the innovation in DA terminology. The cost functional in Equation (13) can be rewritten as

$$J(\mathbf{u}(t\_0)) = \sum\_{t\_k \in \mathcal{T}} \frac{1}{2} \|\mathbf{e}(t\_k)\|\_{\mathbf{R}^{-1}(t\_k)}^2 = \sum\_{t\_k \in \mathcal{T}} \frac{1}{2} \mathbf{e}(t\_k)^T \mathbf{R}^{-1}(t\_k) \mathbf{e}(t\_k). \tag{29}$$
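
For instance, with the twin-experiment arrays defined earlier (a sketch using the background trajectory `ub` as the forecast; `w`, `R`, `h`, `ind_m`, and `nt_m` are as in Section 4), this cost functional can be evaluated directly from the innovations:

```
import numpy as np

# Evaluate Equation (29) from the innovations at the observation instants.
Ri = np.linalg.inv(R)   # R is constant in time in our setup
J = 0.0
for km in range(nt_m):
    e = w[:, km] - h(ub[:, ind_m[km]])  # innovation e(t_k), Equation (28)
    J += 0.5 * e @ Ri @ e
print(J)
```
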

With the assumption that the dynamical model is perfect (i.e., it correctly encapsulates all the relevant processes) and the model parameters are known, the deterministic part of the forecast error can be attributed to the inaccuracy in the initial condition **u**(*t*<sub>0</sub>), defined as Δ**u**<sub>0</sub> = **u**<sub>*t*</sub>(*t*<sub>0</sub>) − **u**<sub>*b*</sub>(*t*<sub>0</sub>), where **u**<sub>*t*</sub>(*t*<sub>0</sub>) denotes the true initial condition and **u**<sub>*b*</sub>(*t*<sub>0</sub>) the background one.

Consider a base trajectory **ū**(*t<sub>k</sub>*) for *k* = 1, 2, ... generated from the initial condition **ū**(*t*<sub>0</sub>), and a corrected trajectory **u**(*t<sub>k</sub>*) for *k* = 1, 2, ... obtained by correcting the initial condition as **u**(*t*<sub>0</sub>) = **ū**(*t*<sub>0</sub>) + Δ**u**<sub>0</sub>. We define the difference between the two trajectories at any time *t<sub>k</sub>* as Δ**u**<sub>*k*</sub> = **u**(*t<sub>k</sub>*) − **ū**(*t<sub>k</sub>*). Since **u**(*t<sub>k</sub>*) is a function of the initial condition (with the model parameters being known), the first-order Taylor expansion of **u**(*t<sub>k</sub>*) around the base trajectory can be written as **u**(*t<sub>k</sub>*) ≈ **ū**(*t<sub>k</sub>*) + **U**(*t<sub>k</sub>*)Δ**u**<sub>0</sub>, leading to the following relation

$$
\Delta \mathbf{u}\_k \approx \mathbf{U}(t\_k) \Delta \mathbf{u}\_0. \tag{30}
$$
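
This tangent-linear relation is straightforward to verify numerically: we integrate the model from a slightly perturbed initial condition and compare the resulting trajectory difference against the sensitivity-propagated perturbation. The sketch below builds on the twin-experiment variables and the array `U` computed in the previous sketch:

```
import numpy as np

# Numerical check of Equation (30): compare the nonlinear trajectory
# difference with the sensitivity-propagated perturbation U(t_k) du0.
du0 = 1e-4 * np.array([1.0, -1.0, 0.5])  # small initial perturbation
up = np.zeros([3, nt+1])
up[:, 0] = uTrue[:, 0] + du0
for k in range(nt):
    up[:, k+1] = RK4(Lorenz63, up[:, k], dt, sigma, beta, rho)
kc = int(1.0/dt)                          # compare at t = 1
print(up[:, kc] - uTrue[:, kc])           # nonlinear difference du_k
print(U[:, :, kc] @ du0)                  # first-order approximation
```

For the chaotic Lorenz 63 system, the agreement degrades as the comparison time grows, which is consistent with the first-order nature of Equation (30).
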

If we let the perturbed (corrected) trajectory be the sought true trajectory, Equation (3) can be rewritten as

$$\mathbf{w}(t\_k) = h(\overline{\mathbf{u}}(t\_k) + \Delta \mathbf{u}\_k) + \mathbf{f}\_m, \tag{31}$$

and a first-order expansion of **w**(*t<sub>k</sub>*) (neglecting the measurement noise) reads as follows,

$$\mathbf{w}(t\_k) \approx h(\overline{\mathbf{u}}(t\_k)) + \mathbf{D}\_h(\overline{\mathbf{u}}(t\_k)) \Delta \mathbf{u}\_k, \tag{32}$$

and the forecast error at the base trajectory can be approximated as

$$\mathbf{e}(t\_k) = \mathbf{D}\_h(\overline{\mathbf{u}}(t\_k)) \Delta \mathbf{u}\_k. \tag{33}$$

Equations (30) and (33) can be combined to yield the following,

$$\mathbf{e}(t\_k) = \mathbf{D}\_h(\overline{\mathbf{u}}(t\_k))\mathbf{U}(t\_k)\Delta\mathbf{u}\_0, \tag{34}$$

which relates the forecast error at any time *t<sub>k</sub>* to the discrepancy between the true and erroneous initial conditions through a linear relationship. In order to account for all the time instants at which observations are available (𝒯 = {*t*<sub>*O*1</sub>, *t*<sub>*O*2</sub>, ..., *t*<sub>*ON*</sub>}), Equation (34) can be concatenated at different times and written as a linear system of equations as follows,

$$\mathbf{Q}\Delta\mathbf{u}\_0 = \mathbf{e}\_F, \tag{35}$$

where the matrix **Q** ∈ ℝ<sup>*Nm*×*n*</sup> and the vector **e**<sub>*F*</sub> ∈ ℝ<sup>*Nm*</sup> are computed as,

$$\mathbf{Q} = \begin{bmatrix} \mathbf{D}\_{h}(\overline{\mathbf{u}}(t\_{O1}))\mathbf{U}(t\_{O1}) \\\\ \mathbf{D}\_{h}(\overline{\mathbf{u}}(t\_{O2}))\mathbf{U}(t\_{O2}) \\\\ \vdots \\\\ \mathbf{D}\_{h}(\overline{\mathbf{u}}(t\_{ON}))\mathbf{U}(t\_{ON}) \end{bmatrix}, \qquad \mathbf{e}\_{F} = \begin{bmatrix} \mathbf{e}(t\_{O1}) \\\\ \mathbf{e}(t\_{O2}) \\\\ \vdots \\\\ \mathbf{e}(t\_{ON}) \end{bmatrix}. \tag{36}$$

Depending on the value of *Nm* relative to *n*, Equation (35) can give rise to either an over-determined or an under-determined linear inverse problem. In either case, the inverse problem can be solved in a weighted least-squares sense to find a first-order estimate of the optimal correction (or perturbation) to the initial condition Δ**u**<sub>0</sub>, with **R**<sup>−1</sup> being the weighting matrix, where **R** is a block-diagonal matrix constructed as follows,

$$\mathbf{R} = \begin{bmatrix} \mathbf{R}(t\_{O1}) \\\\ & \mathbf{R}(t\_{O2}) \\\\ & & \ddots \\\\ & & & \mathbf{R}(t\_{ON}) \end{bmatrix}, \tag{37}$$

and, for the over-determined case, the solution of Equation (35) can be written as Δ**u**<sub>0</sub> = (**Q**<sup>*T*</sup>**R**<sup>−1</sup>**Q**)<sup>−1</sup>**Q**<sup>*T*</sup>**R**<sup>−1</sup>**e**<sub>*F*</sub>. This first-order approximation progressively yields better results by repeating the entire process for multiple iterations until convergence within a certain tolerance [13]. In essence, the FSM is an alternative to the 4DVAR algorithm, replacing the solution of the adjoint problem backward in time (i.e., Equation (25)) by the successive matrix evaluations in Equation (27). However, we highlight that in the 4DVAR approach the actual forecast error is computed as **e**<sub>*k*</sub> = **w**(*t<sub>k</sub>*) − *h*(**u**(*t<sub>k</sub>*)), while its first-order approximation is utilized in the FSM development. The duality between the two approaches is further discussed in [13].
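
For concreteness, the assembly of Equations (35)–(37) and the weighted least-squares solve for a single FSM iteration can be sketched as follows (an illustrative sketch only; the names `Qrows`, `erows`, and `Ri` are local to this sketch, the background trajectory `ub` from the twin experiment serves as the base trajectory, and in practice the whole procedure is repeated with the corrected initial condition; Listing 12 below gives the full implementation):

```
import numpy as np
from scipy.linalg import block_diag

# Assemble Q and e_F by stacking the rows of Equation (36) over the
# observation instants, then solve Equation (35) in a weighted
# least-squares sense via the normal equations.
Qrows, erows, Rblocks = [], [], []
U = np.eye(3)                              # U(t_0) = I_n
km = 0
for k in range(nt):
    DM = JRK4(Lorenz63, JLorenz63, ub[:, k], dt, sigma, beta, rho)
    U = DM @ U                             # Equation (27)
    if (km < nt_m) and (k+1 == ind_m[km]):
        Qrows.append(Dh(ub[:, k+1]) @ U)   # D_h(u(t_k)) U(t_k)
        erows.append(w[:, km] - h(ub[:, k+1]))  # innovation, Equation (28)
        Rblocks.append(R)
        km += 1
Q = np.vstack(Qrows)
eF = np.hstack(erows)
Ri = np.linalg.inv(block_diag(*Rblocks))
# over-determined case: du0 = (Q^T R^{-1} Q)^{-1} Q^T R^{-1} e_F
du0 = np.linalg.solve(Q.T @ Ri @ Q, Q.T @ Ri @ eF)
```
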

A Python implementation of the FSM approach is presented in Listing 12. We note that we solve Equation (35) using the built-in NumPy least-squares function; more efficient iterative schemes can be adopted in practice.

**Listing 12.** Python function for computing the correction vector Δ**u**<sub>0</sub> using the forward sensitivity method with first-order approximation. ✞ ☎

```
import numpy as np
from scipy.linalg import block_diag
from scipy.linalg import sqrtm
def fsm1st(rhs,Jrhs,ObsOp,JObsOp,t,ind_m,u0b,w,R,opt,*args):