*Proceeding Paper* **Regularization of the Gravity Field Inversion Process with High-Dimensional Vector Autoregressive Models †**

**Andreas Kvas \* and Torsten Mayer-Gürr**


**Abstract:** Earth's gravitational field provides invaluable insights into the changing nature of our planet. It reflects mass change caused by geophysical processes like continental hydrology, changes in the cryosphere or mass flux in the ocean. Satellite missions such as the NASA/DLR operated Gravity Recovery and Climate Experiment (GRACE), and its successor GRACE Follow-On (GRACE-FO) continuously monitor these temporal variations of the gravitational attraction. In contrast to other satellite remote sensing datasets, gravity field recovery is based on geophysical inversion which requires a global, homogeneous data coverage. GRACE and GRACE-FO typically reach this global coverage after about 30 days, so short-lived events such as floods, which occur on time frames from hours to weeks, require additional information to be properly resolved. In this contribution we treat Earth's gravitational field as a stationary random process and model its spatio-temporal correlations in the form of a vector autoregressive (VAR) model. The satellite measurements are combined with this prior information in a Kalman smoother framework to regularize the inversion process, which allows us to estimate daily, global gravity field snapshots. To derive the prior, we analyze geophysical model output which reflects the expected signal content and temporal evolution of the estimated gravity field solutions. The main challenges here are the high dimensionality of the process, with a state vector size in the order of 10<sup>3</sup> to 104, and the limited amount of model output from which to estimate such a high-dimensional VAR model. We introduce geophysically motivated constraints in the VAR model estimation process to ensure a positive-definite covariance function.

**Keywords:** GRACE/GRACE-FO; gravity field recovery; vector autoregressive models

#### **1. Introduction**

Earth's gravitational field is key quantity for observing the state and change of our planet. Its variations in time can be related to surface mass changes [1] and thus it provides insight into geophysical and climate relevant processes, for example, sea level rise [2], ice mass loss [3], or the terrestrial water cycle [4]. Since 2002, the dedicated satellite mission Gravity Recovery And Climate Experiment (GRACE) [5,6] and its successor GRACE Follow-On (GRACE-FO) [7] have monitored temporal variations of Earth's gravitational field and have provided an invaluable data record for climate and Earth system sciences. The standard data products of both missions are unconstrained and constrained monthly snapshots of potential or surface mass changes. The time period of one month is not chosen arbitrarily but is a consequence of the orbit geometry of the satellites. As the satellites do not directly observe the changes in gravitational attraction but rather provide very precise measurements of their absolute and relative motion, the underlying potential field must be determined by an inversion process. This inversion process requires a global, homogeneous data coverage which for GRACE and GRACE-FO is reached after about 30 days.

However, in recent years different approaches to derive sub-monthly gravity field variations which combine satellite measurements with prior information have been developed.

**Citation:** Kvas, A.; Mayer-Gürr, T. Regularization of the Gravity Field Inversion Process with High-Dimensional Vector Autoregressive Models. *Phys. Sci. Forum* **2021**, *3*, 7. https://doi.org/10.3390/ psf2021003007

Academic Editors: Wolfgang von der Linden and Sascha Ranftl

Published: 7 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Notable examples are the Kalman smoother approach by [8], a moving window approach by [9], constrained surface mass estimates from the Center of Space Research (CSR) [10] and a Kalman filter approach for regional applications [11]. Data sets dervied with these approaches have seen various applications in Earth system sciences, for example, analysis of large scale flood events [12], evaluation of geophysical models [13,14] or near real-time flood monitoring [15].

In this contribution we show how maximum entropy spectral estimation can be used to derive prior information for regularizing the inversion process for daily gravity field variations. To that end, we estimate a vector autoregressive (VAR) model from a time series of geophysical model output which approximates the expected signal. Following [15], we show how this spatio-temporal covariance information can be introduced into the inversion process and present a time series of daily gravity field variations derived from approximately thee years of GRACE-FO data.

#### **2. Materials and Methods**

A widely used method to recover Earth's gravitational field from a set of measurements is to solve an overdetermined system of equations in a least-squares adjustment [16]. Here, the measurements or observations **l** are related to unknown gravity field parameters through a (possibly non-linear) functional model **f** with:

$$\mathbf{l} = \mathbf{f}(\mathbf{x}) + \mathbf{e}.\tag{1}$$

We added the residual vector **e** because (1) equation is generally not consistent due to the stochastic nature of **l**. To solve for **x**, we expand **f** into a Taylor series around **x**<sup>0</sup> and truncate it after the linear term. This yields the reduced observations equations:

$$
\Delta \mathbf{l} = \mathbf{l} - \mathbf{f}(\mathbf{x}\_0) = \mathbf{A}(\mathbf{x} - \mathbf{x}\_0) + \mathbf{e}, \qquad \mathbf{e} \sim \mathcal{N}(\mathbf{0}, \Sigma\_{\Lambda \mathbf{I}}), \tag{2}
$$

where the matrix **A** consists of the partial derivatives of **f** with respect to the parameter vector **x**. In (2), the residual vector **e** is assumed to be centered and normally distributed with the covariance matrix **Σ**Δ**l**. This system of linear equations can be solved for the parameter corrections Δ**x**ˆ = **x**ˆ − **x**<sup>0</sup> by forming the system of normal equations:

$$(\mathbf{A}^T \Sigma\_{\Delta \mathbf{l}}^{-1} \mathbf{A}) \Delta \mathbf{\hat{x}} = \mathbf{A}^T \Sigma\_{\Delta \mathbf{l}}^{-1} \Delta \mathbf{l},\tag{3}$$

or more concise, **N**Δ**x**ˆ = **n**. For this study, we express the changes Earth's gravitational field as spherical harmonic series expansion:

$$
\Delta V(r, \theta, \lambda) = \frac{GM}{R} \sum\_{n=0}^{\infty} \sum\_{m=-n}^{n} \left(\frac{R}{r}\right)^{n+1} a\_{nm} Y\_{nm}(\theta, \lambda), \tag{4}
$$

where the fully normalized surface spherical harmonics *Ynm* are given by:

$$Y\_{\rm num} = \begin{cases} P\_{\rm num}(\cos \theta) \cos \lambda \text{ if } m \ge 0\\ P\_{n|m|}(\cos \theta) \sin \lambda \text{ if } m < 0 \end{cases} \tag{5}$$

with *Pnm* being the fully normalized associated Legendre functions [17]. In practice, the series is truncated at a maximum degree *n*max, which depends on the application and is chosen here with *n*max = 40. Furthermore, we set degrees *n* = {0, 1} to zero because we assume that Earth's mass does not vary and we fix the coordinate reference frame in Earth's center of mass. The unknown parameter corrections Δ**x** therefore consist of the spherical harmonic coefficients *anm* with *n* ∈ {2, . . . , 40}.

Since we want to determine daily temporal changes in Earth's gravitational field we set up this least-squares adjustment for a time series *ti* = *i*Δ*t*, *i* ∈ {0, ... , *N* − 1}, where Δ*t* is 1 day. We denote the resulting observation equations for time step *i* with a

corresponding subscript: Δ**l***<sup>i</sup>* = **A***i*Δ**x***<sup>i</sup>* + **e***i*. All epochs can be combined in a single least squares adjustment with the block-diagonal normal equation coefficient matrix **N**¯ *ii* = **N***<sup>i</sup>* and the right hand side **n**¯ = **n***i*. Given the block-diagonal structure of **N**¯ we can solve for each time step individually resulting in the estimated parameter corrections Δ**x**ˆ*<sup>i</sup>* and their corresponding variance-covariance matrix **<sup>Σ</sup>**<sup>ˆ</sup> <sup>Δ</sup>**x**ˆ*<sup>i</sup>* = **<sup>N</sup>**−<sup>1</sup> *<sup>i</sup>* . Next to the input data quality, the uncertainty of Δ**x**ˆ*<sup>i</sup>* primarily depends on the spatial data coverage within the observation interval. To recover the full spherical harmonic spectrum, a global homogeneous data coverage is required. For the application here, where the observation time span is a single day, the satellites perform approximately 15 revolutions and thus we can only infer spherical harmonic coefficients up to order *m* = 15 [18]. The effect this sparse data coverage has on the estimated gravity field solution can be seen in Figure 1, where we propagated the uncertainty of an unconstrained daily GRACE-FO solution to equivalent water height (EWH) in space domain.

**Figure 1.** Uncertainty of an unconstrained daily gravity field estimated from one day of GRACE-FO data propagated to equivalent water height (EWH) in space domain. The solid black lines show the satellite ground tracks for this given day.

The uncertainty between the satellite ground tracks is magnitudes higher than the expected signal, which is in the order of ±25 cm. Consequently, to recovery global, submonthly gravity field variations, additional external information is required.

We introduce prior information on the parameter corrections Δ**x** = [Δ**x***<sup>T</sup>* <sup>0</sup> , ... , <sup>Δ</sup>**x***<sup>T</sup> <sup>N</sup>*−1] *T* in the form of a positive definite covariance matrix **R**. We further assume that **Δx***<sup>i</sup>* is a Gaussian, temporally stationary random process with an expected value of zero. This implies that their temporal covariance function **Σ**Δ**x**(*ti*, *tj*) only depends on the lag *h* between the epoch *i* and *j* [19]. The prior distribution is then given by:

$$
\Delta \mathbf{x} \sim \mathcal{N}(0, \mathbf{R}).\tag{6}
$$

If we sort all epochs in temporally ascending order, the variance-covariance matrix **R** of the resulting vector is block-Toeplitz with the individual blocks given by

$$\mathbf{R}\_{ij} = \begin{cases} \mathbf{E}\_{\Delta \mathbf{x}} (h = |j - i|) \text{ if } i \le j \\ \mathbf{E}\_{\Delta \mathbf{x}}^T (h = |j - i|) \text{ else} \end{cases} \text{ s.} \tag{7}$$

This prior information is then incorporated into the least squares adjustment as pseudo-observations of the form:

$$\mathbf{0} = \Delta \mathbf{x} + \mathbf{v} \qquad \mathbf{v} \sim \mathcal{N}(\mathbf{0}, \mathbf{R}), \tag{8}$$

with **v** = [**v***<sup>T</sup>* <sup>0</sup> , ... , **<sup>v</sup>***<sup>T</sup> <sup>N</sup>*−1] *<sup>T</sup>*. The constrained system of normal equations for the whole time series is subsequently given by:

$$(\bar{\mathbf{N}} + \mathbf{R}^{-1})\Delta\mathbf{\hat{x}} = \mathbf{\dot{n}}.\tag{9}$$

The solution to this augmented least-squares problem is numerically equivalent to the Bayes estimate under the assumption of the given prior information.

Both GRACE and GRACE-FO exhibit an approximate repeat cycle of 4–5 days at the equator. As the observation information is closely tied to the ground track coverage es presented in Figure 1, this means that points at the equator are also updated at this repeat frequency. Between these updates, the solutions are supported by the introduced temporal constraint and slowly tend towards the Taylor series expansion point **x**0. We chose **x**<sup>0</sup> as a long-term estimate of the secular and seasonal gravity field variations, which captures a large part of the total signal. In higher latitudes, where the ground tracks converge, observation updates occur more frequently. This means that as we move closer to the poles, the gravity field solutions become less reliant on the prior information.

Note that the covariance function **Σ**Δ**x**(*h*) is still unknown at this point. To estimate the stochastic properties we perform maximum entropy spectral estimation by fitting a vector autoregressive model (VAR) with:

$$
\Delta \mathbf{x}\_{l} = \sum\_{k=1}^{p} \Phi\_{k} \Delta \mathbf{x}\_{l-k} + \mathbf{w}\_{l\prime} \qquad \mathbf{w}\_{l} \sim \mathcal{N} \mathbf{0} \,\prime \,\Sigma\_{\mathbf{w}} \tag{10}
$$

to a time series of geophysical model output **m***i*, which approximates the true spatiotemporal covariance structure of Earth's time-variable gravity field. We make use of the Earth System Model of the European Space Agency (ESA ESM, [20]) to generate daily spherical harmonics coefficients vectors. To estimate the VAR model, we use Yule–Walker equations [19] to estimate the empirical covariance matrices for lags *h* ∈ {0, ... , 3} using the unbiased estimator:

$$
\hat{\Sigma}(h) = \frac{1}{M - h} \sum\_{k=0}^{M-h-1} \mathbf{m}\_k \mathbf{m}\_{k+h}^T. \tag{11}
$$

After estimating the empirical covariance function and before solving the Yule–Walker equations, we apply geophysically motivated constraints onto the obtained covariance matrices. Specifically, we want set correlations between land and ocean to zero and reduce the overall correlation length. These constraints are very easy to introduce in space domain so we first propagate **Σ**ˆ(*h*) from the spherical harmonics domain to an EWH grid in space domain. Now the matrices can be decomposed into *σij*(*h*) = *rij*(*h*)*σiσj*, where *rij*(*h*) represents the correlation between two grid points at lag *h* and *σi*/*<sup>j</sup>* is the standard deviation of the corresponding grid points. The constraints are then realized by setting *rij*(*h*) = 0 if the points *i* and *j* are on different domains and applying a decay function *r*˜*ij*(*h*) = *rij*(*h*)*e*−*ψ*/*ψ*<sup>0</sup> . The amount at which the correlation between points in dependence of their spherical distance *ψ* is reduced is governed by the the parameter *ψ*<sup>0</sup> which we chose as 1100 km. After propagating the modified empirical covariance matrices back to spherical harmonics domain, we can now solve the Yule–Walker equations to obtain the VAR model coefficients **Φ***<sup>k</sup>* and compute the white noise covariance matrix **Σw**. Now we could compute the covariance function for all lags *h*, however, given the high dimensionality of the problem, assembling the full block variance-covariance matrix is not feasible. Instead we use the VAR model to transform the pseudo-observation equations (8). For each time step *<sup>i</sup>*, we apply **<sup>0</sup>** <sup>=</sup> <sup>−</sup>Δ**x***<sup>i</sup>* <sup>+</sup> <sup>∑</sup>*<sup>p</sup> <sup>k</sup>*=<sup>1</sup> **<sup>Φ</sup>***k*Δ**x***i*−*k*, which can also be written as a lower triangular block matrix applied to the full time series Δ**x** as in:

$$\mathbf{0} = \bar{\Phi}\Delta\mathbf{x} + \mathbf{w} \qquad \mathbf{w} \sim \mathcal{N}(\mathbf{0}, \bar{\Sigma}\_{\mathbf{w}}),\tag{12}$$

where **Σ**¯ **<sup>w</sup>** = **Φ**¯ **RΦ**¯ *<sup>T</sup>* is a block diagonal matrix with **Σ<sup>w</sup>** on the main diagonal. This has the great advantage that the resulting normal equation coefficient matrix is block-banded with a bandwith of *p* + 1 and thus greatly reduces the memory demands. The transformed system of normal equations which we use to estimate the full time series of gravity field variations is then given by:

$$(\hat{\mathbf{N}} + \boldsymbol{\Phi}^T \boldsymbol{\Sigma}\_{\mathbf{w}}^{-1} \boldsymbol{\Phi}) \Delta \hat{\mathbf{x}} = \vec{\mathbf{n}}.\tag{13}$$

Note that for a VAR model of order 1, (13) yields the same results as the Kalman smoother approach presented in [8].

#### **3. Results**

To showcase how the VAR regularization affects the gravity field estimates, we computed systems of normal equations from close to three years of GRACE-FO sensor data [21]. Figure 2 shows the autocovariance matrix computed from the VAR model and the temporal variability of the computed time series propagated to space domain.

**Figure 2.** Comparison of (**a**) the autocovariance matrix derived from the estimated VAR model expressed as standard deviation per grid point and (**b**) the temporal RMS of the computed gravity field solutions.

As can be seen, the estimated gravity field time series shows a lot more signatures than the expected signal covariance. This means that the GRACE-FO observations provide significant information and are not overly constrained by the derived VAR model. It should be noted that the VAR model and subsequently the spatio-temporal constraints depend on the geophysical models used in their derivation. However, the influence of different hydrological models is very small as shown in [12]. This suggests that the daily gravity field solutions are primarily data driven.

To gauge how the daily gravity field variations compare with standard GRACE/GRACE-FO products, we compute area mean time series for selected hydrological basins. As monthly data set we use the unconstrained monthly solutions of ITSG-Grace\_operational [22,23] computed at Graz University of Technology, filtered with DDK4 spatial filter [24].

Figure 3 shows the area mean time series of 4 regions where we see larger water storage variations. We can see that the daily gravity field solutions clearly pick up sub-monthly signal and thus provide additional information compared to the standard GRACE/GRACE-FO products.

**Figure 3.** Total water storage anomaly (TWSA) in equivalent water height for (**a**) Tigris/Euhprates, (**b**) Danube, (**c**) Orinoco, and (**d**) Ganges basin for the year 2019 (trend and annual signal are removed). The estimated daily solutions are compared with DDK4-filtered monthly solutions from ITSG-Grace\_operational shown as step function.

#### **4. Discussion**

We have shown that VAR models provide an efficient way of introducing spatiotemporal prior information into the gravity field inversion process. This approach is equivalent to constraining the gravity field estimates with the full block-Toeplitz variancecovariance matrix, however requires only a fraction of the memory. Constraining the inversion of daily gravity field variations with a VAR model of arbitrary order *p* constitutes the generalization of the Kalman smoother approach of [8], which is the special case for *p* = 1. If the VAR model is derived as a maximum entropy spectral estimate from a time series of geophysical model output, a few challenges arise. The high-dimensionality of the problem with a state vector size in the order of 10<sup>3</sup> to 104, combined with the limited availability of model output results in a very low redundancy in the VAR model estimation. We counteract this by introducing geophysically motivated constraints on the empirical covariance matrix estimates. First, we decouple the land and ocean domain and then we reduce the remaining correlation between far away point by introducing an exponential decay function dependent on the spherical distance. This drastically improves the condition of the VAR estimation problem. Finally, we show that the derived prior information in combination with GRACE-FO measurement data yields reasonable results by computing a time series of daily gravity field variations from June 2018 to March 2021. Here, we can clearly see sub-monthly signals which underlines the proficiency of the approach.

**Author Contributions:** A.K. worked on the theoretical background, performed the gravity field recovery and wrote the manuscript; T.M.-G. worked on the theoretical background. All authors have read and agreed to the published version of the manuscript.

**Data Availability Statement:** The full time series of gravity field variations and the estimated autoregressive model are available on ftp://ftp.tugraz.at/outgoing/ITSG/GRACE/ITSG-Grace\_ operational/daily\_kalman (accessed on 21 March 2021).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**

